CN112785567B - Map detection method, map detection device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN112785567B
CN112785567B (application CN202110059261.0A)
Authority
CN
China
Prior art keywords
map
image
target map
target
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110059261.0A
Other languages
Chinese (zh)
Other versions
CN112785567A (en)
Inventor
张超
于天宝
王加明
贠挺
陈国庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110059261.0A priority Critical patent/CN112785567B/en
Publication of CN112785567A publication Critical patent/CN112785567A/en
Application granted granted Critical
Publication of CN112785567B publication Critical patent/CN112785567B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/62Analysis of geometric attributes of area, perimeter, diameter or volume
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Abstract

The present disclosure provides a map detection method, relating to the field of artificial intelligence and in particular to the field of image processing. The scheme is implemented as follows: extracting a plurality of first images from a video and determining the positions of the plurality of first images in the video; determining, from the plurality of first images and using a map classification model, a first image including a target map as a second image; determining, using a map detection model, a second image containing an erroneous target map as a target image; and determining the position of the target image in the video as the position of the erroneous target map. The disclosure also provides a map detection apparatus, an electronic device, and a storage medium.

Description

Map detection method, map detection device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular, to image processing techniques. More particularly, the present disclosure provides a map detection method, apparatus, electronic device, and storage medium.
Background
With the rapid development of the internet, users publish videos and images online every day. While people enjoy the convenience of acquiring information, they may at any time be misled by erroneous information on the internet, causing adverse effects.
For example, in recent years incidents involving problem maps on the internet have occurred frequently, with negative social impact. Auditing the compliance of maps appearing in videos and images is therefore an essential task.
Disclosure of Invention
The present disclosure provides a map detection method, a map detection apparatus, an electronic device, and a storage medium.
According to an aspect of the present disclosure, there is provided a map detection method including: extracting a plurality of first images from a video and determining the positions of the plurality of first images in the video; determining, from the plurality of first images and using a map classification model, a first image including a target map as a second image; determining, using a map detection model, a second image containing an erroneous target map as a target image; and determining the position of the target image in the video as the position of the erroneous target map.
According to another aspect of the present disclosure, there is provided a map detection apparatus including: an extraction module for extracting a plurality of first images from a video and determining the positions of the plurality of first images in the video; a first determining module for determining, from the plurality of first images and using a map classification model, a first image including a target map as a second image; a second determining module for determining, using a map detection model, a second image containing an erroneous target map as a target image; and a third determining module for determining the position of the target image in the video as the position of the erroneous target map.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform a method provided according to the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method provided according to the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an exemplary system architecture to which map detection methods and apparatus may be applied, according to one embodiment of the present disclosure;
FIG. 2 is a flow chart of a map detection method according to one embodiment of the present disclosure;
FIG. 3 is a network architecture schematic of a map classification model according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of a method of determining a second image containing an erroneous target map as a target image using a map detection model according to one embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a method of identifying a plurality of feature regions in a target map in a second image using a map detection model, according to one embodiment of the present disclosure;
FIG. 6 is a flow chart of a method of determining whether a target map in a second image is an erroneous target map based on calibrated feature areas according to one embodiment of the present disclosure;
FIG. 7 is a network architecture schematic of a map detection model according to one embodiment of the present disclosure;
FIG. 8 is a block diagram of a map detection apparatus according to one embodiment of the present disclosure;
fig. 9 is a block diagram of an electronic device of a map detection method according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In recent years, some enterprises and organizations have used non-compliant maps in public settings, adversely affecting the public, and regulators have publicly named and penalized a number of enterprises that failed to use complete and accurate maps. Internet companies have large user bases and high social visibility, so the negative effects of problem maps spreading on the internet are even more pronounced; identifying non-compliant maps among massive numbers of images has thus become a practical problem in urgent need of a solution. Problem-map identification can be applied to image and video websites, video software, security, and other fields; identifying problem maps and taking measures to prevent their spread on the internet is an important step in reducing product safety risk and ensuring product safety.
A problem map is generally an erroneous map in which, for example, boundary lines are drawn incompletely or part of a critical region or island is missing. The maps detectable by embodiments of the present disclosure may include country-level maps, administrative-region-level maps, the world map, the map of Asia, the map of Europe, and the like.
At present, auditing maps in videos often relies on manual work, which is time-consuming and prone to omissions. Specifically, because erroneous maps account for only a small proportion of the massive volume of map images, reviewers' attention declines as working time increases, and labeling results vary with the reviewer and the review environment, so interference from subjective factors is severe. In addition, manual auditing first requires training reviewers on the standard map before they can identify and process the full volume of images and videos, consuming substantial human and material resources and creating great labor-cost pressure.
FIG. 1 is a schematic diagram of an exemplary system architecture to which map detection methods and apparatus may be applied, according to one embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.
As shown in fig. 1, a system architecture 100 according to this embodiment may include a plurality of terminal devices 101, a network 102, a server 103, and a server 104. The network 102 serves as a medium for providing communication links between the terminal device 101 and the servers 103, 104. Network 102 may include various connection types, such as wired and/or wireless communication links, and the like.
Terminal device 101 may be a variety of electronic devices including, but not limited to, smartphones, tablets, laptop portable computers, and the like. The server 103 may be an electronic device providing a map classification service, and the server 104 may be an electronic device providing a problem map detection service.
For example, the server 103 may collect a large number of target map positive samples, which may be correctly drawn target maps, and target map negative samples, which may be non-target maps. A neural network model is trained using the positive and negative samples to obtain a map classification model, which can detect whether a map to be classified is the target map or a non-target map.
For example, the server 104 may collect a large number of target map samples and use them to train a neural network model, obtaining a map detection model that can detect whether a map under review has problems such as incorrectly drawn boundary lines or missing regions.
According to an embodiment of the present disclosure, the trained map classification model and map detection model can classify and detect images that exist on, or are about to be published to, the internet, and likewise videos. Illustratively, a user publishes a video through the terminal device 101, and the terminal device 101 sends the video to the server 103. The server 103 may cut the video into frames, extract each frame image in the video, and identify whether each frame contains a map; for images containing a map, the trained map classification model identifies whether the target map is included, and the frame positions of the identified images containing the target map are marked. The identified images containing the target map and their frame positions are then sent to the server 104.
The server 104 may detect each image containing the target map using the map detection model, identifying whether the target map has problems such as drawing errors or missing regions; if so, the frame position information of the image is sent to the terminal device 101 to prompt the user that an erroneous map appears at a certain frame position in the video, thereby achieving automatic review of erroneous maps. In particular, the map detection model may also mark the specific locations of drawing errors in the erroneous map, for example marking missing regions.
It should be noted that, the map classification model and the map detection model may be trained and used in different servers, or may be trained and used in the same server, which is not limited in this disclosure.
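The two-stage workflow described above — classify each frame for the presence of the target map, then check the flagged frames for drawing errors — can be sketched as a single loop. The function names, the stub model interfaces, and the placement of the 0.3 threshold are illustrative assumptions, not the patent's implementation:

```python
from typing import Callable, List


def audit_video(
    frames: List[object],
    classify_map: Callable[[object], float],  # hypothetical: P(frame contains the target map)
    detect_errors: Callable[[object], bool],  # hypothetical: True if the target map is erroneous
    threshold: float = 0.3,
) -> List[int]:
    """Return the frame positions that contain an erroneous target map."""
    error_positions = []
    for idx, frame in enumerate(frames):
        # Stage 1: the map classification model filters frames containing a target map.
        if classify_map(frame) <= threshold:
            continue
        # Stage 2: the map detection model checks the target map for drawing errors.
        if detect_errors(frame):
            error_positions.append(idx)
    return error_positions
```

In practice `classify_map` and `detect_errors` would wrap the trained map classification and map detection models; the stubs here only illustrate the control flow.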
Fig. 2 is a flow chart of a map detection method according to one embodiment of the present disclosure.
As shown in fig. 2, the map detection method 200 may include operations S210 to S240.
In operation S210, a plurality of first images are extracted from a video and positions of the plurality of first images in the video are determined.
According to an embodiment of the present disclosure, the video may be cut into frames and multiple frame images extracted as first images; for example, one first image may be extracted for each second of video. The frame position of each first image may also be marked as the position of that first image in the video.
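A minimal sketch of the frame-sampling bookkeeping, assuming one first image per second of video as suggested above; the function name and the index arithmetic are assumptions, not taken from the patent:

```python
from typing import List, Tuple


def sample_frame_indices(
    total_frames: int, fps: float, interval_s: float = 1.0
) -> List[Tuple[int, float]]:
    """Return (frame_index, timestamp_seconds) pairs, one sample per interval.

    The timestamp records each first image's position in the video so that
    it can later be reported for erroneous target maps.
    """
    step = max(1, round(fps * interval_s))
    return [(i, i / fps) for i in range(0, total_frames, step)]
```

With a 25 fps video, for example, this yields frame indices 0, 25, 50, … together with their positions in seconds.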
In operation S220, a first image including a target map is determined as a second image from among a plurality of first images using a map classification model.
According to embodiments of the present disclosure, the map classification model may be trained using a large number of target map positive samples and target map negative samples. Whether each first image contains the target map or not can be identified by using the trained map classification model, the first image containing the target map is taken as a second image, and the frame position of each second image can be determined according to the frame position of each first image.
Specifically, the plurality of first images may be input into the trained map classification model, which may output the probability that each first image contains the target map; if the output probability for a first image is greater than a first threshold (e.g., 0.3), that first image may be determined to be a first image including the target map.
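The thresholding step can be sketched as follows, keeping each frame's position alongside its classification probability; the 0.3 first threshold comes from the text, everything else is an illustrative assumption:

```python
from typing import List, Tuple


def select_second_images(
    frame_probs: List[Tuple[int, float]], threshold: float = 0.3
) -> List[int]:
    """Keep the frame positions whose probability of containing the
    target map exceeds the first threshold; these frames become the
    second images passed on to the map detection model."""
    return [pos for pos, p in frame_probs if p > threshold]
```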
It can be appreciated that if the number of the first images including the target map is greater than or equal to 1, it is determined that the video includes the target map, and the video needs to be checked.
In operation S230, a second image including the erroneous target map is determined as the target image using the map detection model.
According to an embodiment of the present disclosure, the map detection model may be trained using a large number of target map samples, and whether the target map in the second image is an error map may be identified using the map detection model. And taking the second images containing the error target map as target images, and determining the frame positions of the target images according to the frame positions of the second images.
Specifically, the plurality of second images may be input into the trained map detection model, which may identify a plurality of feature regions in each second image; each feature region may correspond to a region, island, or the like in the target map that is prone to drawing errors or omissions. The map detection model may output a score for each identified feature region characterizing the probability that the feature region is drawn correctly.
According to an embodiment of the present disclosure, if the number of feature regions identified in the target map equals the number of feature regions in the standard target map, and the score of each feature region is greater than a certain threshold (e.g., 0.3), the target map is drawn correctly. For example, suppose the map detection model identifies area A, area B, area C, and area D of the target map in the second image, with scores of 0.4, 0.5, 0.6, and 0.9 respectively; if the standard target map also contains areas A, B, C, and D, the target map in the second image is drawn correctly.
According to an embodiment of the present disclosure, if the number of feature regions identified in the target map is inconsistent with the number in the standard target map, for example smaller, it may be determined that some feature region is missing from the target map. For example, suppose the map detection model identifies only area A, area B, and area C of the target map in the second image, with scores of 0.4, 0.5, and 0.6; if the standard target map contains areas A, B, C, and D, then area D is missing from the target map in the second image.
According to an embodiment of the present disclosure, if the number of feature regions identified in the target map matches the number in the standard target map but the score of some feature region is smaller than the threshold (e.g., 0.3), the drawing-error probability of that feature region is high, and the target map may be considered an erroneous map. For example, suppose the map detection model identifies areas A, B, C, and D of the target map in the second image, with scores of 0.1, 0.5, 0.6, and 0.9, and the standard target map also contains areas A, B, C, and D; although no feature region is missing, the probability that the boundary of area A is drawn correctly is low, so the target map in the second image may be considered an erroneous map.
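The three cases above (correct, missing region, low-score boundary) amount to a simple decision rule over the identified regions and their scores. A hedged sketch — the names, return values, and the exact comparison against the 0.3 threshold are a reading of the text, not the patent's code:

```python
from typing import Dict, Set


def check_target_map(
    detected_scores: Dict[str, float],
    standard_regions: Set[str],
    score_threshold: float = 0.3,
) -> str:
    """Classify a detected target map against the standard target map.

    detected_scores maps each identified feature region to its
    drawn-correctly score; the return value names which of the three
    cases described above applies.
    """
    if set(detected_scores) != standard_regions:
        return "missing region"  # region counts/names disagree with the standard map
    if any(score <= score_threshold for score in detected_scores.values()):
        return "boundary error"  # some region's drawing-correctness score is too low
    return "correct"
```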
In operation S240, the position of the target image in the video is determined as the position of the erroneous target map.
According to an embodiment of the present disclosure, the frame position of each extracted first image may be marked in operation S210, for example frames 1 to 100. When the map classification model identifies first images containing the target map as second images in operation S220, the frame positions of the second images can be determined from the marked frame positions of the first images: for example, if the first images of frames 5 to 10 and frames 75 to 78 contain the target map, the frame positions of the second images are frames 5 to 10 and frames 75 to 78. When the map detection model identifies second images containing an erroneous target map as target images in operation S230, the frame positions of the target images can be determined from the frame positions of the second images: for example, if the target map in the second images of frames 5 to 10 is correct while the target map in the second images of frames 75 to 78 is erroneous, then the second images of frames 75 to 78 are the target images, and the frame positions of the target images are frames 75 to 78. Reviewers can then be notified that frames 75 to 78 of the video contain an erroneous target map, improving map-audit efficiency and saving manpower.
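Reporting frame positions like "frames 5 to 10" and "frames 75 to 78" requires collapsing per-frame results into contiguous ranges; a small sketch of that bookkeeping (an assumption — the patent does not specify how positions are aggregated):

```python
from typing import List, Tuple


def frames_to_ranges(frames: List[int]) -> List[Tuple[int, int]]:
    """Collapse sorted frame positions into contiguous (start, end) ranges,
    e.g. for reporting "frames 75 to 78 contain an erroneous target map"."""
    ranges: List[Tuple[int, int]] = []
    for f in frames:
        if ranges and f == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], f)  # extend the current run
        else:
            ranges.append((f, f))  # start a new run
    return ranges
```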
According to embodiments of the present disclosure, images are extracted from a video, images containing the target map are determined using the map classification model, images containing an erroneous target map are detected using the map detection model, and the positions of those images in the video are determined. This makes it possible to automatically detect whether a video contains an erroneous target map and to locate it, improving the efficiency of target-map auditing, reducing labor-cost pressure, and improving audit accuracy.
According to an embodiment of the present disclosure, before the plurality of first images are classified using the map classification model, the first images may be scaled to a first preset size, which may be 224×224, and converted to RGB format. The map classification model classifies the 224×224 first images to obtain the probability that each first image contains the target map. The first images may also be scaled to other sizes depending on the performance of the map classification model.
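The preprocessing step might look as follows with Pillow; the choice of bilinear resampling is an assumption, since the text only fixes the 224×224 size and the RGB format:

```python
import numpy as np
from PIL import Image


def preprocess_first_image(img: Image.Image, size: int = 224) -> np.ndarray:
    """Scale a first image to the first preset size (224x224) in RGB,
    returned as an HxWx3 uint8 array ready for the classification model."""
    return np.asarray(img.convert("RGB").resize((size, size), Image.BILINEAR))
```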
According to an embodiment of the present disclosure, the map classification model may be trained with target map positive and negative samples of different sizes, so that the model is trained on multi-size features, which can improve its robustness.
According to an embodiment of the present disclosure, the map classification model may also be trained using a multi-scale feature pyramid: the neural network contains multiple convolution layers, and the feature maps produced by different convolution layers have different sizes. Training the map classification model with a multi-scale feature pyramid can improve the robustness of the model.
Fig. 3 is a network structure diagram of a map classification model according to an embodiment of the present disclosure.
As shown in fig. 3, the network structure of the map classification model includes a plurality of convolution layers 301, a plurality of maximum pooling layers 302, a full connection layer 303, and a classification layer 304. The respective processing layers of the map classification model will be described below with a plurality of first images in a video as input images of the map classification model.
As shown in fig. 3, the convolution layers 301 extract features from the input image; the output of each convolution layer 301 is connected to a maximum pooling layer 302, which selects, from the output of the convolution layer 301, the features with the greatest activation. After features are extracted by each convolution layer 301 and the most strongly activated features are selected by the corresponding maximum pooling layer 302, and after processing through the sequence of convolution layers 301 and maximum pooling layers 302, the full connection layer 303 integrates the features selected by the maximum pooling layers 302 and outputs them to the classification layer 304, which outputs the classification result corresponding to those features. The feature maps output after each convolution layer 301 and maximum pooling layer 302 have different sizes, so the trained model can adaptively process images of different sizes, improving the robustness of the map classification model.
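Because a 3×3 convolution with padding preserves the spatial size while each 2×2 max pooling halves it, the feature maps after successive stages form the multi-scale hierarchy described above. A sketch of the size bookkeeping under those assumed hyperparameters (the patent does not state kernel sizes or strides):

```python
from typing import List


def feature_map_sizes(input_size: int, num_stages: int) -> List[int]:
    """Spatial size of the feature map after each conv + max-pool stage,
    assuming 3x3 convolutions with padding 1 (size-preserving) and
    2x2 max pooling with stride 2 (size-halving)."""
    sizes = []
    size = input_size
    for _ in range(num_stages):
        # the 3x3 convolution with padding 1 preserves the spatial size;
        # the 2x2 max pooling with stride 2 halves it
        size //= 2
        sizes.append(size)
    return sizes
```

Starting from a 224×224 input, five such stages yield the 112, 56, 28, 14, 7 pyramid of feature-map sizes.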
Fig. 4 is a flowchart of a method of determining a second image containing a false target map as a target image using a map detection model according to one embodiment of the present disclosure.
As shown in fig. 4, the method 400 may include operations S431 to S433.
In operation S431, a plurality of feature areas are identified in the target map in each of the second images using the map detection model.
According to an embodiment of the present disclosure, the plurality of feature regions may include a first region and at least one second region. For example, the target map may be a country map; the first region in the target map may be a rectangular region determined by the southernmost and easternmost boundary points of the country's border, and the second regions may be administrative regions of the same or different levels in the country map, for example province A, province B, and city C.
Specifically, a rightmost point and a bottommost point on the boundary of the target map may be selected, and a preset-shaped region is determined as a first region according to the two points, where the two points are on the determined edges of the preset shape. The shape may be a regular shape such as a rectangle or a circle, or may be an irregular shape. The first region is marked for further determining that the map in the second image is the target map to be audited according to the marked first region. Since the first area is determined by two points on the boundary of the map in the second image, the first area contains at least part of the map, and whether the map in the second image is the target map to be audited can be further judged according to the characteristics of at least part of the map contained in the first area.
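Determining the first region from the rightmost and bottommost boundary points might be sketched as follows for the rectangular case; extending the rectangle to the boundary's full bounding box is an assumption, since the text only fixes two of its edges:

```python
from typing import List, Tuple


def first_region_box(
    boundary_points: List[Tuple[int, int]]
) -> Tuple[int, int, int, int]:
    """Axis-aligned rectangle (left, top, right, bottom) whose right edge
    passes through the rightmost boundary point and whose bottom edge
    passes through the bottommost one, in image coordinates (y grows
    downward, so "bottommost" is max(y))."""
    xs = [x for x, _ in boundary_points]
    ys = [y for _, y in boundary_points]
    return (min(xs), min(ys), max(xs), max(ys))
```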
According to the embodiment of the disclosure, the second area corresponds to a geographic area in the target map, which meets a preset condition, for example, a position and an outline of the geographic area a in the standard target map are taken as a first preset condition, and a position and an outline of the geographic area B in the standard target map are taken as a second preset condition. And marking a geographic area A in the target map according to the first preset condition, and marking a geographic area B in the target map according to the second preset condition. The marked geographic area A and the marked geographic area B are taken as second areas. By marking the second area, it can be determined whether the target map is missing some geographical areas and whether the boundary line is wrong, and in the case of wrong boundary line, the second area may not be marked or may be marked incorrectly.
In operation S432, it is determined whether the target map in each of the second images is an erroneous target map according to the calibrated feature area.
According to embodiments of the present disclosure, whether the target map in the second image is an erroneous map may be determined according to the number of calibrated second areas. Specifically, a preset standard target map may be calibrated with a first area and a plurality of second areas, the second areas including, for example, area A, area B, area C, and area D. If the number of second areas identified in the target map in the second image is inconsistent with the number of second areas in the standard target map, for example smaller, it can be determined that some feature area is missing from the target map. For example, if the map detection model identifies in the second image the first area of the target map together with areas A, B, and C among the second areas, then since the second areas in the standard target map include areas A, B, C, and D, area D is missing from the target map in the second image, and the target map in the second image is an erroneous map.
According to an embodiment of the disclosure, while identifying the first region and the at least one second region in each second image, the map detection model may also output a score for each calibrated region, characterizing the probability that the feature region is drawn correctly. Whether the target map in the second image is an erroneous map may be determined according to the scores of the calibrated areas.
Specifically, suppose the first region of the target map and the areas A, B, C, and D of the second regions are identified in the second image using the map detection model, with scores of 0.1 for area A, 0.5 for area B, 0.6 for area C, and 0.9 for area D. Since the second areas in the standard target map include areas A, B, C, and D, no second area is missing from the target map in the second image; however, the low score of area A indicates that the probability that its boundary line is drawn correctly is small, so the target map in the second image can be considered an error map.
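The score-based check might be sketched as follows, assuming the model returns a correctness score per calibrated second area (the threshold value 0.3 is taken from operation S6323 below; the function name is illustrative):

```python
def is_error_map_by_score(area_scores, third_threshold=0.3):
    """area_scores maps each calibrated second area to its correctness
    score. If any area scores at or below the threshold, its boundary
    line is probably drawn incorrectly, so the map is treated as an
    error map."""
    return any(score <= third_threshold for score in area_scores.values())
```

With the scores above, `is_error_map_by_score({"A": 0.1, "B": 0.5, "C": 0.6, "D": 0.9})` returns `True`, because area A scores only 0.1.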
In operation S433, the second image including the error target map is taken as the target image.
According to the embodiment of the disclosure, the second image containing the erroneous target map is taken as a target image, and the frame position of the target image is determined, so that reviewers can be alerted to the position of the erroneous target map in the video.
Fig. 5 is a schematic diagram of a method of using a map detection model to identify a plurality of feature regions in a target map in a second image, according to one embodiment of the disclosure.
As shown in fig. 5, the map detection model identifies five feature areas, namely, a first area 501, a second area 502, a second area 503, a second area 504, and a second area 505, in a target map 500 in a second image. The map detection model may also score the correctness of each feature region, and the correctness score of each region may characterize the probability that the region is drawn correctly.
As shown in fig. 5, the first region 501 may be a rectangular frame determined according to the rightmost point and the bottommost point on the boundary of the target map 500, and the second regions 502 to 505 may be rectangular frames determined according to the boundary of the designated region in the target map.
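The first region of Fig. 5 is simply the axis-aligned rectangle spanned by the extreme points of the map boundary. A sketch (the point format and function name are assumptions):

```python
def first_region_bbox(boundary_points):
    """Rectangular frame (x, y, width, height) enclosing the boundary
    points of the target map, as determined by its extreme points
    (cf. the rightmost and bottommost points in Fig. 5)."""
    xs = [x for x, _ in boundary_points]
    ys = [y for _, y in boundary_points]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
```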
Fig. 6 is a flowchart of a method of determining whether a target map in a second image is an erroneous target map according to a calibrated feature area according to one embodiment of the present disclosure.
As shown in fig. 6, the method includes operations S6321 to S6327.
In operation S6321, the first region and at least one second region are calibrated in the target map in the second image using the map detection model, and the correctness of the first region and each second region is scored.
In operation S6322, it is determined whether the score of the first area is greater than a second threshold (e.g., 0.96). If so, operation S6323 is performed; otherwise, it may be determined that the map in the second image is not the target map, and the flow ends.
In operation S6323, it is determined whether the scores of all the second areas are greater than a third threshold (e.g., 0.3). If so, it may be determined that the target map in the second image is a correct target map and the flow ends; if the score of any second area is not greater than the third threshold, operation S6324 is performed.
In operation S6324, an area ratio of the first region to the second image in the second image is calculated.
In operation S6325, it is determined whether the area ratio is greater than a fourth threshold (e.g., 0.5), if so, operation S6326 is performed, otherwise operation S6327 is performed.
In operation S6326, it is determined that the target map in the second image is an erroneous target map.
In operation S6327, the target map is cut out from the second image and scaled to a preset size (e.g., 660×400), and operation S6321 is returned.
It will be appreciated that if the target map occupies only a small part of the second image, its feature regions are correspondingly small, which may affect the calibration of each feature region by the map detection model. Therefore, in the embodiment of the present disclosure, when the score of any calibrated second area is not greater than the third threshold (i.e., the probability of a drawing error in the target map is relatively high) and the area ratio is small, the target map is cut out from the second image. Specifically, the rectangular frame of the first area is enlarged to frame the entire target map, and the target map is then cut out according to the enlarged rectangular frame. The cut-out target map may be scaled to a second preset size, which may be 660×400, and input into the map detection model again; the model recalibrates and rescores the second areas in the target map, and whether the target map is an erroneous target map is then determined according to the scores of the calibrated second areas. In this way, detection errors caused by the target map being too small for its feature areas to be identified can be avoided, and the accuracy of detecting erroneous target maps can be improved.
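The decision logic of operations S6322–S6327 can be sketched as a pure function; the crop-and-rescale retry itself (S6327) is left to the caller, the threshold values are the examples given above, and the return labels are illustrative. Following the claimed logic (cf. claim 7), a large map with a low-scoring second area is treated as erroneous:

```python
def classify_frame(first_score, second_scores, map_area, frame_area,
                   second_threshold=0.96, third_threshold=0.3,
                   fourth_threshold=0.5):
    """Decide what to do with one second image. Returns one of
    'not_target_map', 'correct', 'error', or 'crop_and_retry'."""
    if first_score <= second_threshold:
        return 'not_target_map'   # S6322: first region not confident enough
    if all(s > third_threshold for s in second_scores):
        return 'correct'          # S6323: every second area scores well
    if map_area / frame_area > fourth_threshold:
        return 'error'            # S6325: map is large, so the low score is reliable
    return 'crop_and_retry'       # S6327: map too small; cut out, rescale, re-detect
```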
According to an embodiment of the present disclosure, before the map detection model is used to mark the plurality of feature areas in the target map in each second image, the second image may be further scaled to a second preset size, which may be 660×400, and the second image is in the format of an RGB image. The map detection model identifies a first region and at least one second region in a second image having a size of 660 x 400 and scores the first region and the at least one second region, and determines whether the target map is an error map according to the scores of the identified first region and the at least one second region. The second image may also be scaled to other sizes depending on the capabilities of the map detection model.
According to the embodiment of the disclosure, when the map detection model is trained, target map samples of different sizes may be used, so that the map detection model is trained on multi-scale features, which can improve the robustness of the map detection model.
According to an embodiment of the disclosure, the target map sample may be calibrated with a first area and at least one second area, and in particular the first area and the at least one second area may be marked by rectangular boxes.
Fig. 7 is a network structure diagram of a map detection model according to an embodiment of the present disclosure.
As shown in fig. 7, the network structure of the map detection model includes an image feature extraction layer 701, a feature region calibration layer 702, and an error region identification layer 703. The respective processing layers of the map detection model will be described below with the second image including the target map as an input image of the map detection model.
According to an embodiment of the present disclosure, the image feature extraction layer 701 may include a plurality of convolution layers and a max pooling layer, which extract features of the input image and select the features with the greatest activation to generate a feature image. The image feature extraction layer 701 outputs the feature image to the feature region calibration layer 702 and the error region identification layer 703, respectively. The feature region calibration layer 702 calibrates a plurality of feature regions in the feature image, obtains calibration information for each feature region (for example, the coordinates of the rectangular frame used to mark the region), and scores each feature region. The feature region calibration layer 702 outputs the calibration information and the score of each feature region to the error region identification layer 703, which combines the feature image input by the image feature extraction layer 701 with the calibration information and scores input by the feature region calibration layer 702, calibrates each feature region on the feature image, and determines whether each feature region is drawn in error according to the calibrated feature regions and their scores.
Fig. 8 is a block diagram of a map detection apparatus according to one embodiment of the present disclosure.
As shown in fig. 8, the map detection apparatus 800 may include an extraction module 801, a first determination module 802, a second determination module 803, and a third determination module 804.
The extraction module 801 is configured to extract a plurality of first images from a video and determine positions of the plurality of first images in the video.
The first determining module 802 is configured to determine a first image including a target map from the plurality of first images as a second image using a map classification model.
The second determining module 803 is configured to determine a second image including the erroneous target map as the target image using the map detection model.
The third determining module 804 is configured to determine a location of the target image in the video as a location of the erroneous target map.
According to an embodiment of the present disclosure, the extraction module 801 is specifically configured to segment a video according to frames to obtain a plurality of first images, and determine a frame position of each first image in the video as a position of the first image in the video.
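A sketch of the per-frame bookkeeping this module implies (function names are illustrative; actual frame decoding, e.g. via OpenCV, is omitted):

```python
def sampled_frame_indices(total_frames, step=1):
    """Indices of the first images extracted from the video; step=1
    segments the video frame by frame, as described above."""
    return list(range(0, total_frames, step))

def frame_position_to_time(frame_index, fps):
    """Convert a frame position to a timestamp in seconds, so a reviewer
    can be pointed at where the erroneous target map appears."""
    return frame_index / fps
```

For example, frame 75 of a 25 fps video sits at second 3 of the video.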
According to an embodiment of the disclosure, the first determining module 802 is specifically configured to classify the plurality of first images using a map classification model, so as to obtain a probability that each first image contains the target map; a first image having a probability greater than a first threshold is determined as a second image.
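A sketch of this thresholding step (the value 0.5 for the first threshold is an assumed example; the patent does not fix it):

```python
def select_second_images(frame_probs, first_threshold=0.5):
    """frame_probs: list of (frame_index, probability that the frame
    contains the target map), as produced by the map classification
    model. Frames whose probability exceeds the first threshold are
    kept as second images for the detection stage."""
    return [idx for idx, p in frame_probs if p > first_threshold]
```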
According to an embodiment of the present disclosure, the map detection apparatus 800 further includes a first scaling module.
The first scaling module is configured to scale the first image to a first preset size before the first determination module 802 classifies the plurality of first images using the map classification model.
According to an embodiment of the present disclosure, the second determination module 803 includes a calibration unit, a first determination unit, and a second determination unit.
The calibration unit is used for calibrating a plurality of characteristic areas in the target map in each second image by using the map detection model.
The first determining unit is used for determining whether the target map in each second image is an error target map according to the calibrated characteristic region.
The second determination unit is configured to take a second image including the error target map as a target image.
According to an embodiment of the present disclosure, the first determining unit is specifically configured to determine that the target map in the second image is an erroneous target map when the number of calibrated feature areas is not equal to the number of feature areas in the preset standard target map.
According to an embodiment of the present disclosure, the first determining unit is specifically further configured to determine a missing feature area in the error target map according to a feature area in a preset standard target map.
According to an embodiment of the present disclosure, the plurality of feature areas includes a first area and at least one second area, and the calibration unit is specifically configured to calibrate the first area in the target map based on at least two points on a boundary of the target map, and to identify at least one geographical area meeting a preset condition in the target map as the second area.
According to an embodiment of the present disclosure, the first determination unit includes: a scoring subunit, a computing subunit, and a determining subunit.
The marking subunit is used for marking the correctness of the calibrated first area and the calibrated second area.
The calculating subunit is configured to calculate an area ratio of the first region to the second image in the second image when it is determined that the correctness score of the first region is greater than the second threshold and the correctness score of any one of the second regions is less than or equal to the third threshold.
The determining subunit is configured to determine whether the target map in the second image is an erroneous target map according to the area ratio.
According to an embodiment of the disclosure, the determining subunit is specifically configured to determine, when the determined area ratio is greater than the fourth threshold, that the target map in the second image is an erroneous target map.
According to an embodiment of the disclosure, the determining subunit is specifically further configured to intercept the target map from the second image if the determined area ratio is less than or equal to the fourth threshold; and returns to the calibration unit.
According to an embodiment of the present disclosure, the map detection apparatus 800 further includes a second scaling module.
The second scaling module is used for scaling the second images to a second preset size before the calibration unit uses the map detection model to calibrate the plurality of feature areas in the target map in each second image.
According to an embodiment of the present disclosure, the determining subunit is specifically further configured to determine an erroneous second area in the target map according to the correctness score of each second area in the second image.
According to an embodiment of the present disclosure, the determining subunit is specifically further configured to determine, when it is determined that the correctness scores of the first areas are greater than the second threshold and the correctness scores of the at least one second area are all greater than the third threshold, that the target map in the second image is a correct target map.
According to an embodiment of the present disclosure, the map detection apparatus 800 further includes a first training module and a second training module.
The first training module is used for training the first neural network model by using the first map sample set to obtain a map detection model; wherein the first set of map samples includes a plurality of target map samples, each target map sample having a calibrated first region and at least one calibrated second region.
The second training module is used for training the second neural network model by using the second map sample set to obtain a map classification model; wherein the second set of map samples includes a plurality of target map positive samples and a plurality of target map negative samples.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, such as a map detection method. For example, in some embodiments, the map detection method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the map detection method described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the map detection method by any other suitable means (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, which may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (16)

1. A map detection method, comprising:
extracting a plurality of first images from the video and determining the positions of the plurality of first images in the video;
determining a first image including a target map from the plurality of first images as a second image using a map classification model;
determining a second image containing the erroneous target map as a target image using the map detection model;
determining the position of the target image in the video as the position of an error target map;
wherein the determining, using the map detection model, the second image including the erroneous target map as the target image includes:
marking a plurality of feature areas in the target map in each second image using the map detection model;
determining whether the target map in each second image is an error target map according to the calibrated characteristic region;
Taking a second image containing an error target map as a target image;
wherein the plurality of feature regions includes a first region and at least one second region, and the marking the plurality of feature regions in the target map in each of the second images using the map detection model includes:
calibrating a first region in the target map based on at least two points on a boundary of the target map, and
marking at least one geographical area meeting preset conditions in the target map as a second area;
wherein the determining whether the target map in each second image is an error target map according to the calibrated feature area comprises:
scoring the correctness of the calibrated first area and second area;
calculating an area ratio of the first region in the second image to the second image under the condition that the correctness score of the first region is determined to be larger than a second threshold value and the correctness score of any second region is determined to be smaller than or equal to a third threshold value;
and determining whether the target map in the second image is an error target map according to the area ratio.
2. The method of claim 1, wherein the extracting the plurality of first images from the video and determining the locations of the plurality of first images in the video comprises:
Dividing the video according to frames to obtain a plurality of first images; and
the frame position of each first image in the video is determined as the position of the first image in the video.
3. The method of claim 1, wherein the determining at least one first image containing a target map from the plurality of first images as a second image using a map classification model comprises:
classifying the plurality of first images by using a map classification model to obtain the probability that each first image contains a target map;
a first image having a probability greater than a first threshold is determined as the second image.
4. The method of claim 3, further comprising, prior to classifying the plurality of first images using a map classification model:
the first image is scaled to a first preset size.
5. The method of claim 1, wherein the determining whether the target map in each second image is an erroneous target map based on the calibrated feature region comprises:
and under the condition that the number of the calibrated characteristic areas is not equal to the number of the characteristic areas in the preset standard target map, determining the target map in the second image as an error target map.
6. The method of claim 5, further comprising: and determining the missing characteristic region in the error target map according to the characteristic region in the preset standard target map.
7. The method of claim 1, wherein the determining whether the target map in the second image is an erroneous target map according to the area ratio comprises:
and determining that the target map in the second image is an error target map under the condition that the area ratio is determined to be larger than a fourth threshold value.
8. The method of claim 7, wherein the determining whether the target map in the second image is an erroneous target map according to the area ratio further comprises:
in the case that the area ratio is determined to be less than or equal to a fourth threshold value, cutting out the target map from the second image;
and enlarging the target map to a second preset size, and returning to the step of marking at least one geographic area meeting preset conditions in the target map as a second area.
9. The method of claim 8, further comprising, prior to using the map detection model to identify the plurality of feature regions in the target map in each second image:
Scaling the second image to the second preset size.
10. The method of claim 1, further comprising:
and determining the wrong second area in the target map according to the correctness score of each second area in the second image.
11. The method of claim 1, further comprising:
and determining that the target map in the second image is a correct target map under the condition that the correctness score of the first area is larger than a second threshold value and the correctness score of the at least one second area is larger than a third threshold value.
12. The method of claim 1, further comprising:
training a first neural network model by using a first map sample set to obtain the map detection model;
wherein the first set of map samples includes a plurality of target map samples, each target map sample having a calibrated first region and at least one calibrated second region.
13. The method of claim 1, further comprising:
training a second neural network model by using a second map sample set to obtain the map classification model;
wherein the second set of map samples includes a plurality of target map positive samples and a plurality of target map negative samples.
14. A map detection apparatus comprising:
an extraction module for extracting a plurality of first images from a video and determining the positions of the plurality of first images in the video;
a first determining module for determining a first image including a target map from the plurality of first images as a second image using a map classification model;
a second determining module for determining a second image containing the erroneous target map as a target image using the map detection model;
a third determining module, configured to determine a position of the target image in the video as a position of an erroneous target map;
the second determining module includes:
a calibration unit for calibrating a plurality of feature areas in the target map in each of the second images using the map detection model;
the first determining unit is used for determining whether the target map in each second image is an error target map according to the calibrated characteristic region;
a second determination unit configured to take a second image including an erroneous target map as a target image;
wherein the plurality of characteristic areas comprise a first area and at least one second area, the calibration unit is used for calibrating the first area in the target map based on at least two points on the boundary of the target map, and at least one geographic area meeting preset conditions is calibrated in the target map as the second area;
Wherein the first determining unit includes:
the scoring subunit is used for scoring the correctness of the calibrated first area and second area;
a calculating subunit, configured to calculate an area ratio of the first region in the second image to the second image when it is determined that the correctness score of the first region is greater than a second threshold and the correctness score of any second region is less than or equal to a third threshold;
and the determining subunit is used for determining whether the target map in the second image is an error target map according to the area ratio.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 13.
16. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1 to 13.
CN202110059261.0A 2021-01-15 2021-01-15 Map detection method, map detection device, electronic equipment and storage medium Active CN112785567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110059261.0A CN112785567B (en) 2021-01-15 2021-01-15 Map detection method, map detection device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110059261.0A CN112785567B (en) 2021-01-15 2021-01-15 Map detection method, map detection device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112785567A CN112785567A (en) 2021-05-11
CN112785567B true CN112785567B (en) 2023-09-22

Family

ID=75756928

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110059261.0A Active CN112785567B (en) 2021-01-15 2021-01-15 Map detection method, map detection device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112785567B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019101021A1 (en) * 2017-11-23 2019-05-31 腾讯科技(深圳)有限公司 Image recognition method, apparatus, and electronic device
CN109977191A (en) * 2019-04-01 2019-07-05 国家基础地理信息中心 Problem map detection method, device, electronic equipment and medium
CN111222423A (en) * 2019-12-26 2020-06-02 深圳供电局有限公司 Target identification method and device based on operation area and computer equipment

Non-Patent Citations (2)

Title
A Hybrid Framework for Fault Detection, Classification, and Location - Part 1: Concept, Structure, and Methodology; Joe-Air Jiang; IEEE; 26(3); full text *
An Iterative Method for Fault Detection in CNN Image Inpainting Regions; 胡德敏; 胡钰媛; 褚成伟; 胡晨; Journal of Chinese Computer Systems (07); full text *

Also Published As

Publication number Publication date
CN112785567A (en) 2021-05-11

Similar Documents

Publication Publication Date Title
WO2020238054A1 (en) Method and apparatus for positioning chart in pdf document, and computer device
WO2020207167A1 (en) Text classification method, apparatus and device, and computer-readable storage medium
CN112560862B (en) Text recognition method and device and electronic equipment
CN109446061B (en) Page detection method, computer readable storage medium and terminal device
CN110751043A (en) Face recognition method and device based on face visibility and storage medium
CN112861885B (en) Image recognition method, device, electronic equipment and storage medium
CN111666907B (en) Method, device and server for identifying object information in video
CN116844177A (en) Table identification method, apparatus, device and storage medium
CN113361468A (en) Business quality inspection method, device, equipment and storage medium
CN114419035B (en) Product identification method, model training device and electronic equipment
CN113205041A (en) Structured information extraction method, device, equipment and storage medium
CN109460496B (en) Method and device for realizing data display
CN114120071A (en) Detection method of image with object labeling frame
CN112508005B (en) Method, apparatus, device and storage medium for processing image
CN112785567B (en) Map detection method, map detection device, electronic equipment and storage medium
CN114120305B (en) Training method of text classification model, and text content recognition method and device
CN112749978B (en) Detection method, apparatus, device, storage medium, and program product
CN113221519B (en) Method, apparatus, device, medium and product for processing form data
CN114067328A (en) Text recognition method and device and electronic equipment
CN113887394A (en) Image processing method, device, equipment and storage medium
CN111950354A (en) Seal home country identification method and device and electronic equipment
CN113157160B (en) Method and apparatus for identifying misleading play button
CN113010721B (en) Picture auditing method and device, electronic equipment and storage medium
CN114580631B (en) Model training method, smoke and fire detection method, device, electronic equipment and medium
CN117351010B (en) Metal concave structure defect detection method and device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant