CN116311209B - Window detection method, system and electronic equipment - Google Patents

Window detection method, system and electronic equipment

Info

Publication number
CN116311209B
Authority
CN
China
Prior art keywords
window
target
target point
intersection
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310316512.8A
Other languages
Chinese (zh)
Other versions
CN116311209A (en)
Inventor
张博
支蕴倩
潘霖
李海峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Deepctrl Co ltd
Original Assignee
Beijing Deepctrl Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Deepctrl Co ltd filed Critical Beijing Deepctrl Co ltd
Priority to CN202310316512.8A priority Critical patent/CN116311209B/en
Publication of CN116311209A publication Critical patent/CN116311209A/en
Application granted granted Critical
Publication of CN116311209B publication Critical patent/CN116311209B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a window detection method, a window detection system and an electronic device. The method comprises: detecting target points in a display interface with a pre-trained target detection model to obtain labeling information of the target points, where the target points comprise vertices of foreground windows and intersections between foreground windows, the labeling information comprises the coordinates and a label value of each target point, and the label value distinguishes vertices from intersections; generating intersection sets based on the target points, each intersection set being a set of points that satisfies a preset rectangular-line-segment rule; screening out, from the intersection sets, target intersection sets containing at least three points; and determining a rectangular window based on each target intersection set. In this manner, the un-occluded target points of a foreground window are acquired and a rectangular window is constructed from them, so that the foreground windows of the current display interface are accurately identified and window-recognition error is reduced.

Description

Window detection method, system and electronic equipment
Technical Field
The present invention relates to the field of window detection technologies, and in particular, to a window detection method, system, and electronic device.
Background
When a computer is used in daily life, pop-up windows from different applications frequently appear. In public playback scenarios such as conference projection, video conferencing and public screens, a privacy window, an advertisement or other undesirable content popping up on a screen that is shared in real time can cause negative effects such as privacy leakage. A window detection algorithm can detect a window the moment it pops up so that the window can be processed, which not only prevents undesirable or private content from being broadcast but also avoids, to the greatest extent, interrupting normal playback of the screen.
Existing window detection algorithms identify the rectangular frame of a window. When windows overlap and the lower window is only partially displayed, the lower window cannot be identified accurately. When a window is large (for example, an empty Word document window), its interior is a large single-colored area; in a large-receptive-field feature map the information of such an area is overly dominant, which hampers extraction of window-edge features, so the identified window carries a large error.
Disclosure of Invention
Therefore, the invention aims to provide a window detection method, a system and an electronic device, so that the foreground windows of the current display interface are accurately identified and window-recognition errors are reduced.
In a first aspect, an embodiment of the present invention provides a window detection method, comprising: detecting target points in a display interface with a pre-trained target detection model to obtain labeling information of the target points, where the target points comprise vertices of foreground windows and intersections between foreground windows, the labeling information comprises the coordinates and a label value of each target point, and the label value distinguishes vertices from intersections; generating intersection sets based on the target points, each intersection set being a set of points satisfying a preset rectangular-line-segment rule; screening out, from the intersection sets, target intersection sets containing at least three points; and determining a rectangular window based on each target intersection set.
Further, the labeling information comprises a label value, a target-point abscissa, a target-point ordinate, a labeling-frame length value and a labeling-frame width value; the label values include upper-left corner, lower-left corner, upper-right corner, lower-right corner, intersection-upper-left, intersection-lower-left, intersection-upper-right and intersection-lower-right. The intersections are the points where foreground windows intersect: when an intersection lies at the upper left of the covered foreground window, the label value in its labeling information is intersection-upper-left; when it lies at the lower left, the label value is intersection-lower-left; when it lies at the upper right, the label value is intersection-upper-right; and when it lies at the lower right, the label value is intersection-lower-right.
Further, the step of generating intersection sets based on the target points comprises: acquiring the first labeling information and the second labeling information of any two target points, the first labeling information comprising a first label value, a first target-point abscissa and a first target-point ordinate, and the second labeling information comprising a second label value, a second target-point abscissa and a second target-point ordinate; and determining that the first and second labeling information belong to the same intersection set when the first label value is upper-left corner, the second label value is upper-right corner, the two ordinates are equal and the first abscissa is smaller than the second abscissa; when the first label value is lower-left corner, the second label value is lower-right corner, the two ordinates are equal and the first abscissa is smaller than the second abscissa; when the first label value is upper-left corner, the second label value is lower-left corner, the two abscissas are equal and the first ordinate is smaller than the second ordinate; or when the first label value is upper-right corner, the second label value is lower-right corner, the two abscissas are equal and the first ordinate is smaller than the second ordinate.
Further, the step of determining a rectangular window based on a target intersection set comprises: acquiring the labeling information of all target points in the target intersection set; determining the target point with the smallest abscissa and ordinate as the upper-left corner of the rectangular window, or determining the target point with the largest abscissa and ordinate as the lower-right corner of the rectangular window; and determining the rectangular window based on the upper-left or lower-right corner and the remaining target points in the target intersection set.
Further, the target detection model is trained as follows: acquiring pre-collected foreground window images and background images, the background images including window-maximized foreground window images; constructing a plurality of display interfaces from the background images and the foreground window images and determining the labeling information of the foreground windows, each display interface comprising one background image and at least one foreground window; dividing the display interfaces and their labeling information into a first training set and a first verification set according to a first preset ratio; and training an initial target detection model with the display interfaces of the first training set as input and the corresponding labeling information as output until a preset training requirement is met, thereby obtaining the target detection model.
Further, the method further comprises: inputting the rectangular window into a pre-trained window classification model and outputting the window type of the rectangular window.
Further, the window classification model is trained as follows: acquiring pre-collected foreground window images and the window type of each foreground window image; dividing the foreground window images and their window types into a second training set and a second verification set according to a second preset ratio; and training an initial convolutional neural network model with the foreground window images of the second training set as input and the window types as output until a preset training requirement is met, thereby obtaining the window classification model.
Further, the foreground window images are acquired as follows: acquiring a pre-recorded screen-usage video; capturing the video frame by frame to obtain a plurality of images to be processed; and, based on a preset foreground-window rule, cropping the foreground window in each image to be processed as a foreground window image and labeling the window type of the foreground window image.
In a second aspect, an embodiment of the present invention provides a window detection system, comprising: a target point acquisition module for detecting target points in a display interface with a pre-trained target detection model to obtain labeling information of the target points, where the target points comprise vertices of foreground windows and intersections between foreground windows, the labeling information comprises the coordinates and a label value of each target point, and the label value distinguishes vertices from intersections; an intersection set generation module for generating intersection sets based on the target points, each intersection set being a set of points satisfying a preset rectangular-line-segment rule; a target intersection set acquisition module for screening out, from the intersection sets, target intersection sets containing at least three points; and a rectangular window determination module for determining a rectangular window based on each target intersection set.
In a third aspect, an embodiment of the present invention provides an electronic device including a memory and a processor, where the memory stores a computer program executable on the processor and the processor implements the method described above when executing the computer program.
The embodiments of the invention provide a window detection method, system and electronic device. Target points in a display interface are detected with a pre-trained target detection model to obtain their labeling information, where the target points comprise vertices of foreground windows and intersections between foreground windows, the labeling information comprises the coordinates and a label value of each target point, and the label value distinguishes vertices from intersections; intersection sets are generated based on the target points, each being a set of points satisfying a preset rectangular-line-segment rule; target intersection sets containing at least three points are screened out; and a rectangular window is determined based on each target intersection set. In this manner, the un-occluded target points of a foreground window are acquired and a rectangular window is constructed from them, so that the foreground windows of the current display interface are accurately identified and window-recognition error is reduced.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a window detection method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a detection interface according to a first embodiment of the present invention;
FIG. 3 is a flowchart of training a target detection model according to a first embodiment of the present invention;
FIG. 4 is a flowchart illustrating a step of generating an intersection set based on a target point according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating steps for determining a rectangular window based on a set of target intersections according to an embodiment of the present invention;
FIG. 6 is a training flowchart of a window classification model according to an embodiment of the present invention;
fig. 7 is a diagram of a window detection system according to a second embodiment of the present invention.
Reference numerals: 1 - target point acquisition module; 2 - intersection set generation module; 3 - target intersection set acquisition module; 4 - rectangular window determination module.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A window detection algorithm can be used for content security and privacy protection in public playback scenarios such as conference projection, video conferencing and public screens. Specifically, for a video conference it can detect in real time whether a window that should not be played appears in the currently shared screen (for example a WeChat window, which may leak privacy) or whether a pop-up advertisement appears (which is inappropriate, for example, in an online-class scenario); for a public screen it can detect undesirable content broadcast through operator error or hacking before negative effects result. With a window detection algorithm, the position and type of a window can be located accurately, and window content that should not be played can then be precisely shielded (for example by an image-blurring operation), so that undesirable or private content is not broadcast while normal playback of the screen is interrupted as little as possible (in a video conference only the WeChat window or pop-up is shielded, and normally shared content keeps playing).
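For illustration only, the shielding operation mentioned above may be sketched as follows using OpenCV; the (x1, y1, x2, y2) rectangle format and the kernel size are assumptions, since the description names only the image-blurring operation itself.

```python
import cv2
import numpy as np

def shield_window(frame: np.ndarray, rect: tuple) -> np.ndarray:
    """Blur the region covered by one detected window rectangle.

    `rect` is assumed to be (x1, y1, x2, y2) in pixel coordinates,
    matching the rectangular windows produced by the detection method.
    """
    x1, y1, x2, y2 = rect
    roi = frame[y1:y2, x1:x2]
    # A large odd kernel renders window content unreadable while the
    # rest of the shared screen keeps playing normally.
    frame[y1:y2, x1:x2] = cv2.GaussianBlur(roi, (51, 51), 0)
    return frame
```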
In order to facilitate understanding of the present embodiment, the following describes embodiments of the present invention in detail.
Embodiment one:
fig. 1 is a flowchart of a window detection method according to an embodiment of the present invention.
Referring to fig. 1, the window detection method includes:
step S101, detecting a target point in a display interface through a pre-trained target detection model to obtain labeling information of the target point; wherein the target point comprises the vertexes of the foreground windows and the intersections of the foreground windows; the labeling information comprises coordinates of a target point and a label value; the tag value is used to distinguish between vertices and intersections.
Here, when only one foreground window exists in the display interface, the vertices of the foreground window are recorded as target points; when the display interface contains two or more foreground windows, the covered vertices are removed from the recorded window vertices and the intersections between foreground windows are added as target points. Referring to fig. 2, the labeling information is displayed as labeling frames in the detection interface.
In one embodiment, the labeling information includes a label value c, a target-point abscissa x, a target-point ordinate y, a labeling-frame length value w and a labeling-frame width value h, and is recorded as (c, x, y, w, h).
Here, w and h, the length and width values of the labeling frame, can be set according to the actual situation, for example to 20 pixels.
The label value c includes an upper left corner, a lower left corner, an upper right corner, and a lower right corner.
Specifically, referring to fig. 2, the label value c indicates the position of a vertex within its foreground window, so the same label value may occur more than once on a page containing several foreground windows.
The label value c also includes intersection-upper-left, intersection-lower-left, intersection-upper-right and intersection-lower-right.
The intersections are the points where foreground windows intersect. When an intersection lies at the upper left of the covered foreground window, the label value in its labeling information is intersection-upper-left; when it lies at the lower left, the label value is intersection-lower-left; when it lies at the upper right, the label value is intersection-upper-right; and when it lies at the lower right, the label value is intersection-lower-right.
Specifically, referring to fig. 2, the label value c of an intersection indicates the position of the intersection on the covered foreground window, so intersections on the same covered foreground window may share a label value.
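For concreteness, the eight label values and the (c, x, y, w, h) annotation can be encoded as follows; the numeric class indices and field names are illustrative, not prescribed by the description.

```python
from enum import IntEnum
from typing import NamedTuple

class Label(IntEnum):
    # Un-occluded vertices of a foreground window.
    TOP_LEFT = 0
    BOTTOM_LEFT = 1
    TOP_RIGHT = 2
    BOTTOM_RIGHT = 3
    # Intersections, named by their position on the covered window.
    X_TOP_LEFT = 4
    X_BOTTOM_LEFT = 5
    X_TOP_RIGHT = 6
    X_BOTTOM_RIGHT = 7

class TargetPoint(NamedTuple):
    c: Label      # label value
    x: int        # target-point abscissa
    y: int        # target-point ordinate
    w: int = 20   # labeling-frame length, 20 px per the description
    h: int = 20   # labeling-frame width, 20 px per the description
```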
In one embodiment, referring to FIG. 3, the object detection model is trained by:
step S201, acquiring a pre-acquired foreground window image and a pre-acquired background image; the background image comprises a foreground window image with maximized window.
Here, the foreground window images are acquired as follows: a pre-recorded screen-usage video is acquired; the video is captured frame by frame to obtain a plurality of images to be processed; and, based on a preset foreground-window rule, the foreground window in each image to be processed is cropped as a foreground window image and its window type is labeled.
Specifically, a background image may be a wallpaper image collected at random from the network, or a window-maximized foreground window image. The foreground window images are obtained by manually cropping the various foreground windows to be detected; their types include WeChat, Word, browser and so on.
The foreground window images can be obtained in two ways. The first way is to record the screen with screen-recording software, opening various windows during recording and randomly moving or resizing them. After image frames are captured from the recorded video, each image is annotated manually: the window regions are marked with an interactive UI annotation tool, which also outputs a window-category label, and the marked regions are then cropped out and saved individually as foreground window images.
The second way is to record the screen with screen-recording software, opening and maximizing only one program window (such as Word) per recording and performing various operations in the maximized window, such as editing, toolbar selection and theme-style switching. Image frames are then extracted from the recorded video and saved directly as foreground window images, for example as sketched below.
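A minimal sketch of the frame-capturing step, assuming OpenCV reads the recorded screen-usage video; the sampling interval every_n is an assumption, since no rate is specified.

```python
import cv2

def extract_frames(video_path: str, every_n: int = 30) -> list:
    """Capture image frames from a pre-recorded screen-usage video,
    keeping one frame out of every `every_n`."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:          # end of video
            break
        if idx % every_n == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames
```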
Step S202, constructing a plurality of display interfaces according to a background image and a foreground window image, and determining labeling information of the foreground window; wherein the display interface comprises a background image and at least one foreground window.
Here, foreground window images are randomly pasted onto a background image as foreground windows to form a display interface, and the four corner coordinates of each foreground window are recorded automatically during pasting; if several foreground windows are pasted onto one background image, covered vertices must be removed and intersection coordinates added during the iteration. Finally, target points with labeling information (c, x, y, w, h) are generated. Data augmentation can be performed by random stacking and covering. A simplified sketch of this construction follows.
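A minimal sketch of the display-interface construction, assuming the Pillow library; occluded-vertex removal and intersection bookkeeping are simplified to plain corner recording.

```python
import random
from PIL import Image

def compose_interface(background, windows):
    """Paste foreground window images onto a background image and record
    the four corner annotations of each pasted window.

    Assumes every window fits within the background. A full
    implementation would also remove covered vertices and add
    intersection points on every paste, as the description requires.
    """
    canvas = background.copy()
    annotations = []  # (label, x, y, w, h), with w = h = 20 px
    for win in windows:
        x = random.randint(0, canvas.width - win.width)
        y = random.randint(0, canvas.height - win.height)
        canvas.paste(win, (x, y))
        annotations.extend([
            ("top_left", x, y, 20, 20),
            ("top_right", x + win.width, y, 20, 20),
            ("bottom_left", x, y + win.height, 20, 20),
            ("bottom_right", x + win.width, y + win.height, 20, 20),
        ])
    return canvas, annotations
```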
Step S203, dividing the display interface and the corresponding labeling information into a first training set and a first verification set according to a first preset proportion.
Here, the first preset ratio is set according to the actual situation and may be, for example, 7:3.
And step S204, training the initial target detection model by taking a display interface in the first training set as input and marking information in the first training set as output until a preset training requirement is met, so as to obtain the target detection model.
Here, the initial target detection model may be the single-stage detector YOLOv5 (You Only Look Once, version 5), which ensures real-time computation (a detection speed of about 5 ms per frame with YOLOv5). Other models are also possible, including SSD (Single Shot MultiBox Detector), RetinaNet, Mask R-CNN and other versions of the YOLO series. An illustrative inference sketch follows.
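For illustration, a custom-trained YOLOv5 model published through the Ultralytics torch.hub entry point could be queried as follows; the weights file name best.pt is hypothetical.

```python
import torch

# Load custom-trained corner/intersection detection weights
# ('best.pt' is a hypothetical file produced by the training above).
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")

def detect_target_points(screen_image):
    """Run target-point detection on one display-interface image."""
    results = model(screen_image)
    points = []
    # results.xywh[0] holds one row per detection:
    # (x_center, y_center, width, height, confidence, class).
    for x, y, w, h, conf, cls in results.xywh[0].tolist():
        points.append({"c": int(cls), "x": x, "y": y, "w": w, "h": h})
    return points
```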
Step S102, generating an intersection point set based on the target point; the intersection point set is a set formed by points meeting the rule of a preset rectangular line segment.
Here, the preset rectangular-line-segment rule states that a rectangle has an upper-left, a lower-left, an upper-right and a lower-right corner, where the upper-left and lower-left corners share an abscissa, the upper-left and upper-right corners share an ordinate, the lower-left and lower-right corners share an ordinate, and the upper-right and lower-right corners share an abscissa.
In one embodiment, referring to fig. 4, in step S102, the step of generating the intersection set based on the target point includes:
step S301, first labeling information and second labeling information of any two target points are obtained; the first labeling information comprises a first label value, a first target point abscissa and a first target point ordinate; the second labeling information includes a second label value, a second target point abscissa, and a second target point ordinate.
Here, the first labeling information (ca, xa, ya) of target point a is acquired, comprising the first label value ca, the first target-point abscissa xa and the first target-point ordinate ya, together with the second labeling information (cb, xb, yb) of target point b, comprising the second label value cb, the second target-point abscissa xb and the second target-point ordinate yb.
In step S302, when the first label value is the upper left corner and the second label value is the upper right corner, and the ordinate of the first target point is equal to the ordinate of the second target point, and the abscissa of the first target point is smaller than the abscissa of the second target point, it is determined that the first labeling information and the second labeling information belong to the same intersection point set.
Specifically, when ca is the upper left corner, cb is the upper right corner, ya=yb, and xa < xb, it is determined that the first target point and the second target point belong to the same intersection set.
The coordinate-equality tests may tolerate a certain offset thres; for example, |ya - yb| < thres may be regarded as equal, and thres may be set to 5 pixels (see the sketch after step S305).
In step S303, when the first label value is at the lower left corner and the second label value is at the lower right corner, and the ordinate of the first target point is equal to the ordinate of the second target point, and the abscissa of the first target point is smaller than the abscissa of the second target point, it is determined that the first labeling information and the second labeling information belong to the same intersection point set.
Specifically, when ca is the lower left corner, cb is the lower right corner, ya=yb, and xa < xb, it is determined that the first target point and the second target point belong to the same intersection set.
In step S304, when the first label value is the upper left corner and the second label value is the lower left corner, and the first target point abscissa is equal to the second target point abscissa, and the first target point ordinate is smaller than the second target point ordinate, it is determined that the first labeling information and the second labeling information belong to the same intersection set.
Specifically, when ca is the upper left corner, cb is the lower left corner, xa=xb, and ya < yb, it is determined that the first target point and the second target point belong to the same intersection set.
In step S305, when the first label value is the upper right corner and the second label value is the lower right corner, and the first target point abscissa is equal to the second target point abscissa and the first target point ordinate is less than the second target point ordinate, it is determined that the first labeling information and the second labeling information belong to the same intersection set.
Specifically, when ca is the upper right corner, cb is the lower right corner, xa=xb, and ya < yb, it is determined that the first target point and the second target point belong to the same intersection set.
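The four pairing rules of steps S302 to S305, folded together with the offset tolerance thres described above, can be sketched as follows; the short label strings are illustrative stand-ins for the label values.

```python
THRES = 5  # pixel tolerance for the coordinate-equality tests

def same(a: float, b: float, thres: float = THRES) -> bool:
    """Coordinates are considered equal within a small offset."""
    return abs(a - b) < thres

def belong_together(pa: dict, pb: dict) -> bool:
    """Return True when two points (dicts with keys 'c', 'x', 'y')
    satisfy one of the four matching rules of steps S302-S305.

    'tl', 'tr', 'bl', 'br' abbreviate the four corner classes,
    whether window vertex or intersection."""
    ca, cb = pa["c"], pb["c"]
    if ca == "tl" and cb == "tr":  # shared top edge
        return same(pa["y"], pb["y"]) and pa["x"] < pb["x"]
    if ca == "bl" and cb == "br":  # shared bottom edge
        return same(pa["y"], pb["y"]) and pa["x"] < pb["x"]
    if ca == "tl" and cb == "bl":  # shared left edge
        return same(pa["x"], pb["x"]) and pa["y"] < pb["y"]
    if ca == "tr" and cb == "br":  # shared right edge
        return same(pa["x"], pb["x"]) and pa["y"] < pb["y"]
    return False
```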
Step S103, a target intersection point set containing at least three intersection points is screened out from the intersection point set.
Here, the target points in the intersection sets are traversed; unmatched target points and combinations with fewer than three matched points are discarded, leaving the target intersection sets that contain at least three intersection points, for example as grouped below.
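Building on the pairing predicate above, the screening of step S103 can be sketched as a simple transitive grouping; the description leaves the exact grouping strategy open.

```python
def target_intersection_sets(points):
    """Group points that pairwise satisfy the rectangular-segment rules,
    then keep only the groups with at least three members."""
    groups = []
    for p in points:
        for g in groups:
            # Attach p to the first group containing a matching partner.
            if any(belong_together(q, p) or belong_together(p, q)
                   for q in g):
                g.append(p)
                break
        else:
            groups.append([p])  # p starts a new candidate group
    return [g for g in groups if len(g) >= 3]
```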
Step S104, determining a rectangular window based on the target intersection point set.
In one embodiment, referring to fig. 5, in step S104, the step of determining a rectangular window based on the set of target intersections includes:
step S401, labeling information of all target points in the target intersection point set is obtained.
Step S402, determining the target point with the smallest target-point abscissa and ordinate as the upper left corner of the rectangular window; or determining the target point with the largest target-point abscissa and ordinate as the lower right corner of the rectangular window.
Step S403, determining a rectangular window based on the upper left corner or the lower right corner of the rectangular window and the remaining target points in the target intersection set.
Specifically, once the upper left corner of the rectangular window is known, the remaining target points that can form a rectangle with it can be determined from it; alternatively, once the lower right corner is known, the remaining target points that can form a rectangle with it can be determined in the same way. The rectangular window is then determined from its upper left or lower right corner together with the remaining target points in the target intersection set, as sketched below.
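A minimal sketch of steps S401 to S403: for a target intersection set of three or four matched corners, the extreme coordinates already determine the rectangle, because the missing fourth corner of a rectangle is implied by the other three.

```python
def rectangle_from_set(group):
    """Recover the rectangular window from a target intersection set
    of three or four corner points."""
    xs = [p["x"] for p in group]
    ys = [p["y"] for p in group]
    # Smallest abscissa/ordinate -> upper-left corner;
    # largest abscissa/ordinate -> lower-right corner.
    return (min(xs), min(ys), max(xs), max(ys))  # (x1, y1, x2, y2)
```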
In an embodiment, the method further comprises:
and inputting the rectangular window into a pre-trained window classification model, and outputting the window type of the rectangular window.
Here, in public playback scenarios, judging the window type allows the content of a privacy window such as WeChat to be hidden in time.
In one embodiment, referring to FIG. 6, the window classification model is trained by:
step S501, acquiring a pre-acquired foreground window image and window types of the foreground window image.
Here, the window types include a WeChat window, a Word window, a browser window and the like.
Step S502, dividing the foreground window graph and the window types corresponding to the foreground window graph into a second training set and a second verification set according to a second preset proportion.
Here, the second preset ratio may be set according to actual situations, and may be 7:3.
Step S503, training the initial convolutional neural network model by taking the foreground window diagram in the second training set as input and taking the window type in the second training set as output until the preset training requirement is met, and obtaining a window classification model.
Here, the window type of the foreground window is determined according to the window classification model.
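A minimal training sketch, assuming PyTorch with a ResNet-18 backbone standing in for the "initial convolutional neural network model"; the backbone choice, learning rate and the three window types are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_TYPES = 3  # e.g. WeChat, Word, browser (illustrative)

# Replace the final layer so the network predicts window types.
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_TYPES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_epoch(loader):
    """One pass over the second training set: foreground window image
    in, window-type label out."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```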
The embodiment of the invention provides a window detection method. Target points in a display interface are detected with a pre-trained target detection model to obtain their labeling information, where the target points comprise vertices of foreground windows and intersections between foreground windows, the labeling information comprises the coordinates and a label value of each target point, and the label value distinguishes vertices from intersections; intersection sets are generated based on the target points, each being a set of points satisfying a preset rectangular-line-segment rule; target intersection sets containing at least three points are screened out; and a rectangular window is determined based on each target intersection set. In this manner, the un-occluded target points of a foreground window are acquired and a rectangular window is constructed from them, so that the foreground windows of the current display interface are accurately identified and window-recognition error is reduced.
Embodiment two:
fig. 7 is a diagram of a window detection system according to a second embodiment of the present invention.
Referring to fig. 7, a window detection system includes:
the target point acquisition module 1 is used for detecting a target point in the display interface through a pre-trained target detection model to obtain labeling information of the target point; wherein the target point comprises the vertexes of the foreground windows and the intersections of the foreground windows; the labeling information comprises coordinates of a target point and a label value; the tag value is used to distinguish between vertices and intersections.
An intersection set generating module 2, configured to generate an intersection set based on the target point; the intersection point set is a set formed by points meeting the rule of a preset rectangular line segment.
And the target intersection point set acquisition module 3 is used for screening a target intersection point set containing at least three intersection points from the intersection point sets.
A rectangular window determining module 4, configured to determine a rectangular window based on the target intersection set.
The embodiment of the invention provides a window detection system. Target points in a display interface are detected with a pre-trained target detection model to obtain their labeling information, where the target points comprise vertices of foreground windows and intersections between foreground windows, the labeling information comprises the coordinates and a label value of each target point, and the label value distinguishes vertices from intersections; intersection sets are generated based on the target points, each being a set of points satisfying a preset rectangular-line-segment rule; target intersection sets containing at least three points are screened out; and a rectangular window is determined based on each target intersection set. In this manner, the un-occluded target points of a foreground window are acquired and a rectangular window is constructed from them, so that the foreground windows of the current display interface are accurately identified and window-recognition error is reduced.
The embodiment of the invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of the window detection method provided by the embodiment when executing the computer program.
The embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the window detection method of the above embodiments.
The computer program product provided by the embodiment of the present invention includes a computer readable storage medium storing a program code, where instructions included in the program code may be used to perform the method described in the foregoing method embodiment, and specific implementation may refer to the method embodiment and will not be described herein.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system and apparatus may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
In addition, in the description of embodiments of the present invention, unless explicitly stated and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the above examples are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit its scope of protection. Although the present invention has been described in detail with reference to the foregoing examples, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention and are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A window detection method, comprising:
detecting a target point in a display interface through a pre-trained target detection model to obtain labeling information of the target point; wherein the target point comprises a vertex of a foreground window and an intersection point of the foreground windows with each other; the labeling information comprises target point coordinates and label values; the tag value is used to distinguish between the vertex and the intersection;
generating an intersection set based on the target point; the intersection point set is a set formed by points meeting the rule of a preset rectangular line segment;
screening a target intersection point set containing at least three intersection points from the intersection point set;
a rectangular window is determined based on the set of target intersections.
2. The method according to claim 1, wherein the labeling information comprises a label value, a target point abscissa, a target point ordinate, a labeling frame length value and a labeling frame width value;
the label values include upper left corner, lower left corner, upper right corner, lower right corner, intersection-upper left, intersection-lower left, intersection-upper right, and intersection-lower right;
the intersection points are the intersection points of the foreground windows, and when the intersection points are at the upper left part of the covered foreground windows, the label value of the labeling information of the intersection points is the intersection points-upper left part; when the intersection point is at the left lower part of the covered foreground window, the label value of the labeling information of the intersection point is the intersection point-left lower part; when the intersection point is at the upper right of the covered foreground window, the label value of the labeling information of the intersection point is the intersection point-upper right; and when the intersection point is at the lower right of the covered foreground window, the label value of the labeling information of the intersection point is the intersection point-lower right.
3. The method according to claim 2, wherein the step of generating a set of intersections based on the target points comprises:
acquiring first labeling information and second labeling information of any two target points; the first labeling information comprises a first label value, a first target point abscissa and a first target point ordinate; the second labeling information comprises a second label value, a second target point abscissa and a second target point ordinate;
when the first label value is at the upper left corner, the second label value is at the upper right corner, the ordinate of the first target point is equal to the ordinate of the second target point, and the abscissa of the first target point is smaller than the abscissa of the second target point, determining that the first labeling information and the second labeling information belong to the same intersection point set;
when the first label value is at the lower left corner, the second label value is at the lower right corner, the ordinate of the first target point is equal to the ordinate of the second target point, and the abscissa of the first target point is smaller than the abscissa of the second target point, determining that the first labeling information and the second labeling information belong to the same intersection point set;
when the first label value is at the upper left corner, the second label value is at the lower left corner, the first target point abscissa is equal to the second target point abscissa, and the first target point ordinate is smaller than the second target point ordinate, determining that the first labeling information and the second labeling information belong to the same intersection point set;
when the first label value is at the upper right corner, the second label value is at the lower right corner, the first target point abscissa is equal to the second target point abscissa, and the first target point ordinate is smaller than the second target point ordinate, the first labeling information and the second labeling information are determined to belong to the same intersection point set.
4. The method of claim 2, wherein the step of determining a rectangular window based on the set of target intersections comprises:
acquiring the labeling information of all target points in the target intersection point set;
determining the target point with the smallest target point abscissa and target point ordinate as the upper left corner of the rectangular window; or determining the target point with the largest target point abscissa and target point ordinate as the lower right corner of the rectangular window;
and determining the rectangular window based on the upper left corner or the lower right corner of the rectangular window and the rest target points in the target intersection point set.
5. The method of claim 1, wherein the object detection model is trained by:
acquiring a pre-acquired foreground window image and a pre-acquired background image; wherein the background image comprises the foreground window image with maximized window;
constructing a plurality of display interfaces according to the background image and the foreground window image, and determining the labeling information of the foreground window; wherein the display interface comprises one background image and at least one foreground window;
dividing the display interface and the corresponding labeling information into a first training set and a first verification set according to a first preset proportion;
and training an initial target detection model by taking the display interface in the first training set as input and the marking information in the first training set as output until a preset training requirement is met, so as to obtain the target detection model.
6. The method according to claim 1, wherein the method further comprises:
and inputting the rectangular window into a pre-trained window classification model, and outputting the window type of the rectangular window.
7. The method of claim 6, wherein the window classification model is trained by:
acquiring a pre-acquired foreground window image and window types of the foreground window image;
dividing the foreground window graph and the window type corresponding to the foreground window graph into a second training set and a second verification set according to a second preset proportion;
and training an initial convolutional neural network model by taking the foreground window graph in the second training set as input and the window type in the second training set as output until a preset training requirement is met, so as to obtain the window classification model.
8. The method of claim 5 or claim 7, wherein the foreground window map is acquired by:
acquiring a pre-recorded screen use video;
capturing the screen-usage video frame by frame to obtain a plurality of images to be processed;
based on a preset foreground window rule, capturing a foreground window in the image to be processed as the foreground window image, and marking the window type of the foreground window image.
9. A window detection system, comprising:
the target point acquisition module is used for detecting a target point in a display interface through a pre-trained target detection model to obtain labeling information of the target point; wherein the target point comprises a vertex of a foreground window and an intersection point of the foreground windows with each other; the labeling information comprises target point coordinates and label values; the tag value is used to distinguish between the vertex and the intersection;
the intersection point set generating module is used for generating an intersection point set based on the target point; the intersection point set is a set formed by points meeting the rule of a preset rectangular line segment;
the target intersection point set acquisition module is used for screening a target intersection point set containing at least three intersection points from the intersection point set;
and the rectangular window determining module is used for determining a rectangular window based on the target intersection point set.
10. An electronic device comprising a memory, a processor, the memory having stored thereon a computer program executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-8 when executing the computer program.
CN202310316512.8A 2023-03-28 2023-03-28 Window detection method, system and electronic equipment Active CN116311209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310316512.8A CN116311209B (en) 2023-03-28 2023-03-28 Window detection method, system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310316512.8A CN116311209B (en) 2023-03-28 2023-03-28 Window detection method, system and electronic equipment

Publications (2)

Publication Number Publication Date
CN116311209A (en) 2023-06-23
CN116311209B (en) 2024-01-19

Family

ID=86825642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310316512.8A Active CN116311209B (en) 2023-03-28 2023-03-28 Window detection method, system and electronic equipment

Country Status (1)

Country Link
CN (1) CN116311209B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002123351A (en) * 2000-10-16 2002-04-26 Uniden Corp Portable information terminal device and window generating method
CN103941987A (en) * 2014-04-21 2014-07-23 宇龙计算机通信科技(深圳)有限公司 Window display method and mobile terminal
CN104090944A (en) * 2014-06-30 2014-10-08 北京金山安全软件有限公司 Method and device for switching browser window content
CN108628657A (en) * 2018-05-09 2018-10-09 深圳壹账通智能科技有限公司 Pop-up processing method, device, computer equipment and storage medium
CN111414913A (en) * 2019-01-08 2020-07-14 北京地平线机器人技术研发有限公司 Character recognition method and recognition device and electronic equipment
CN111736787A (en) * 2020-06-24 2020-10-02 北京云族佳科技有限公司 Screen sharing method and device, storage medium and electronic equipment
CN113347413A (en) * 2021-06-25 2021-09-03 海信视像科技股份有限公司 Window position detection method and display device
CN113377484A (en) * 2021-07-06 2021-09-10 上海哔哩哔哩科技有限公司 Popup window processing method and device
CN113448485A (en) * 2021-07-12 2021-09-28 交互未来(北京)科技有限公司 Large-screen window control method and device, storage medium and equipment
US11165755B1 (en) * 2020-08-27 2021-11-02 Citrix Systems, Inc. Privacy protection during video conferencing screen share
CN113642552A (en) * 2020-04-27 2021-11-12 上海高德威智能交通系统有限公司 Method, device and system for identifying target object in image and electronic equipment
CN113835591A (en) * 2021-09-29 2021-12-24 上海哔哩哔哩科技有限公司 Popup window processing method and device
CN114445651A (en) * 2021-12-22 2022-05-06 天翼云科技有限公司 Training set construction method and device of semantic segmentation model and electronic equipment
CN115357155A (en) * 2022-07-28 2022-11-18 北京奇艺世纪科技有限公司 Window identification method, device, equipment and computer readable storage medium
CN115688102A (en) * 2022-11-11 2023-02-03 山石网科通信技术股份有限公司 Window processing method and device, processor and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4111897B2 (en) * 2003-09-16 2008-07-02 日立ソフトウエアエンジニアリング株式会社 Window control method
US9928440B2 (en) * 2016-07-14 2018-03-27 Conduent Business Machines Services, Llc Method and apparatus for detecting foreground windows

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002123351A (en) * 2000-10-16 2002-04-26 Uniden Corp Portable information terminal device and window generating method
CN103941987A (en) * 2014-04-21 2014-07-23 宇龙计算机通信科技(深圳)有限公司 Window display method and mobile terminal
CN104090944A (en) * 2014-06-30 2014-10-08 北京金山安全软件有限公司 Method and device for switching browser window content
CN108628657A (en) * 2018-05-09 2018-10-09 深圳壹账通智能科技有限公司 Pop-up processing method, device, computer equipment and storage medium
CN111414913A (en) * 2019-01-08 2020-07-14 北京地平线机器人技术研发有限公司 Character recognition method and recognition device and electronic equipment
CN113642552A (en) * 2020-04-27 2021-11-12 上海高德威智能交通系统有限公司 Method, device and system for identifying target object in image and electronic equipment
CN111736787A (en) * 2020-06-24 2020-10-02 北京云族佳科技有限公司 Screen sharing method and device, storage medium and electronic equipment
US11165755B1 (en) * 2020-08-27 2021-11-02 Citrix Systems, Inc. Privacy protection during video conferencing screen share
CN113347413A (en) * 2021-06-25 2021-09-03 海信视像科技股份有限公司 Window position detection method and display device
CN113377484A (en) * 2021-07-06 2021-09-10 上海哔哩哔哩科技有限公司 Popup window processing method and device
CN113448485A (en) * 2021-07-12 2021-09-28 交互未来(北京)科技有限公司 Large-screen window control method and device, storage medium and equipment
CN113835591A (en) * 2021-09-29 2021-12-24 上海哔哩哔哩科技有限公司 Popup window processing method and device
CN114445651A (en) * 2021-12-22 2022-05-06 天翼云科技有限公司 Training set construction method and device of semantic segmentation model and electronic equipment
CN115357155A (en) * 2022-07-28 2022-11-18 北京奇艺世纪科技有限公司 Window identification method, device, equipment and computer readable storage medium
CN115688102A (en) * 2022-11-11 2023-02-03 山石网科通信技术股份有限公司 Window processing method and device, processor and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on identification methods for rogue behaviors of Android applications; Zhang Shuangmin; China Masters' Theses Full-text Database, Information Science and Technology, No. 05 (2022); I138-262 *
Detecting pop-up advertisement browser windows using support vector machines; Yao-Ping Chou et al.; Journal of the Chinese Institute of Engineers; pp. 1189-1198 *
Circle clipping algorithm for arbitrary polygon windows; Yang Qin et al.; Computer Systems & Applications; Vol. 27, No. 8; pp. 170-175 *

Also Published As

Publication number Publication date
CN116311209A (en) 2023-06-23

Similar Documents

Publication Publication Date Title
US11468225B2 (en) Determining functional and descriptive elements of application images for intelligent screen automation
US9235759B2 (en) Detecting text using stroke width based text detection
US9311533B2 (en) Device and method for detecting the presence of a logo in a picture
CN110827244A (en) Method and equipment for detecting appearance flaws of electronic equipment
CN110175609B (en) Interface element detection method, device and equipment
CN110827249A (en) Electronic equipment backboard appearance flaw detection method and equipment
CN105955881A (en) Automated test step recording and playback method and apparatus
CN111291661A (en) Method and equipment for identifying text content of icons in screen
CN107451507A (en) A kind of two-dimensional code identification method being used in dynamic image and device
CN113298130B (en) Method for detecting target image and generating target object detection model
CN112559341A (en) Picture testing method, device, equipment and storage medium
CN113505643A (en) Violation target detection method and related device
CN116311209B (en) Window detection method, system and electronic equipment
CN115359008A (en) Display interface testing method and device, storage medium and electronic equipment
CN118097682A (en) Watermark character recognition method and device, nonvolatile storage medium and electronic equipment
CN113076889A (en) Container lead seal identification method and device, electronic equipment and storage medium
CN111191006A (en) Method and device for determining connection relation between legends and electronic system
US20230353702A1 (en) Processing device, system and method for board writing display
CN112286780A (en) Method, device and equipment for testing recognition algorithm and storage medium
EP4207745A1 (en) Method for embedding image in video, and method and apparatus for acquiring planar prediction model
CN111950356B (en) Seal text positioning method and device and electronic equipment
CN115935433B (en) Screen real-time privacy protection method and device
CN114170470A (en) Sample generation method, device, equipment and storage medium
CN115311237A (en) Image detection method and device and electronic equipment
WO2020174447A1 (en) Multi-label placement for augmented and virtual reality and video annotations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant