WO2021189870A1 - Method, apparatus, device, and storage medium for identifying illegal buildings - Google Patents
- Publication number: WO2021189870A1
- Application number: PCT/CN2020/128257 (CN2020128257W)
- Authority: WIPO (PCT)
- Prior art keywords: target, image, building, feature, target image
- Prior art date
Classifications
- G06V20/176 — Urban or other man-made structures (terrestrial scenes)
- G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/04 — Neural network architecture, e.g. interconnection topology
- G06N3/045 — Combinations of networks
- G06N3/08 — Neural network learning methods
- G06Q10/10 — Office automation; Time management
- G06Q50/163 — Real estate management
- G06Q50/26 — Government or public services
- G06T7/001 — Industrial image inspection using an image reference approach
- G06T7/74 — Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
- G06T2207/30184 — Infrastructure (Earth observation)
- G06V10/247 — Aligning, centring, orientation detection or correction of the image by affine transforms, e.g. correction due to perspective effects
- G06V10/267 — Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V10/464 — Salient features, e.g. scale invariant feature transforms [SIFT], using a plurality of salient features, e.g. bag-of-words [BoW] representations
- G06V10/751 — Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G06V10/753 — Transform-based matching, e.g. Hough transform
- G06V10/7715 — Feature extraction, e.g. by transforming the feature space
- G06V10/806 — Fusion, at the level of extracted features
- G06V10/809 — Fusion, at the level of classification results
- G06V10/95 — Hardware or software architectures structured as a network, e.g. client-server architectures
Definitions
- This application relates to image processing technology, for example, to the field of cloud computing, and specifically to a method, device, equipment, and storage medium for identifying illegal buildings.
- the embodiments of the present application provide a method, device, equipment, and storage medium for identifying illegal buildings, so as to realize automatic identification of illegal buildings, reduce identification costs, and improve identification efficiency.
- the embodiments of the present application provide a method for identifying illegal buildings, including: acquiring a target image and a reference image associated with the target image; extracting the target building features of the target image and the reference building features of the reference image respectively; and determining the illegal building recognition result of the target image according to the target building features and the reference building features.
- This application obtains the target image and the reference image associated with the target image; extracts the target building features of the target image and the reference building features of the reference image respectively; and determines the illegal construction recognition result of the target image according to the target building features and the reference building features.
- the above technical solution obtains the reference image associated with the target image, binds the target image and the reference image, and performs feature extraction on the bound image, so as to identify the target image based on the architectural features of the reference image.
- the distance difference between the acquisition positions of the target image and the reference image is less than a set distance threshold, the angle difference between their acquisition angles is less than a set angle threshold, or both conditions hold simultaneously.
- An optional implementation of the above application binds the target image and the reference image by limiting the distance difference between their collection positions, the angle difference between their collection angles, or both. This avoids identifying the target image against multiple reference images and reduces the amount of data calculation.
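The binding rule above can be sketched as a simple predicate. This is a minimal illustration, not the patented implementation: the coordinate fields, threshold values, and the choice to require both conditions jointly (the claim also allows either condition alone) are assumptions.

```python
import math

def is_associated(target, reference, max_distance_m=5.0, max_angle_deg=10.0):
    """Return True if the reference image can be bound to the target image:
    capture positions closer than the distance threshold AND capture angles
    closer than the angle threshold (hypothetical thresholds)."""
    dx = target["x"] - reference["x"]
    dy = target["y"] - reference["y"]
    distance = math.hypot(dx, dy)                              # position gap
    angle = abs(target["angle"] - reference["angle"]) % 360.0  # angle gap
    angle = min(angle, 360.0 - angle)                          # wrap at 360 degrees
    return distance < max_distance_m and angle < max_angle_deg

target = {"x": 100.0, "y": 200.0, "angle": 45.0}
ref_ok = {"x": 101.0, "y": 201.0, "angle": 50.0}
ref_far = {"x": 300.0, "y": 200.0, "angle": 45.0}
print(is_associated(target, ref_ok))   # True: within both thresholds
print(is_associated(target, ref_far))  # False: position gap too large
```

Restricting each target image to a single nearby reference in this way is what keeps the per-image computation bounded.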
- extracting the target building features of the target image and the reference building features of the reference image associated with the target image respectively includes:
- feature extraction is performed on the target basic feature and the reference basic feature respectively to obtain the target building feature and the reference building feature under the at least two scales.
- An optional implementation of the above application refines the architectural feature extraction process into basic feature extraction followed by further feature extraction on the basic features at each of at least two scales, so as to capture image details at different scales.
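The multi-scale step can be sketched as deriving one feature map per scale from a shared basic feature. This is a toy illustration using average pooling; the patent does not specify the pooling operation, so the choice of operator and of scale factors is an assumption.

```python
import numpy as np

def avg_pool(feature, factor):
    """Average-pool a square feature map by an integer factor."""
    h, w = feature.shape
    return feature.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multi_scale_features(basic_feature, scales=(1, 2)):
    """One feature map per scale, all derived from the shared basic feature."""
    return {s: avg_pool(basic_feature, s) for s in scales}

basic = np.arange(16, dtype=float).reshape(4, 4)  # stand-in basic feature map
feats = multi_scale_features(basic)
print(feats[1].shape, feats[2].shape)  # (4, 4) (2, 2)
```

The same routine is applied to the target basic features and the reference basic features, yielding comparable feature pairs at every scale.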
- determining the result of the illegal building recognition of the target image according to the target building feature and the reference building feature includes:
- the illegal building recognition result of the target image is determined.
- An optional implementation of the above application refines the process of determining the illegal building identification result into fusing the building features at each scale and identifying illegal buildings based on the fused features at the at least two scales, thereby improving the mechanism for identifying illegal buildings under multiple scales.
- the feature fusion of the target building feature and the reference building feature at each scale includes:
- the difference between the target building feature and the reference building feature is computed at each scale, and that difference is taken as the feature fusion result at that scale.
- An optional implementation in the above application is to refine the feature fusion process to use the difference between the target building feature and the reference building feature at each scale as the feature fusion result, thereby improving the feature fusion mechanism.
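The difference-based fusion reads directly as a per-scale subtraction. A minimal sketch, assuming the features are stored as a dict keyed by scale (a representational choice of this example, not the patent):

```python
import numpy as np

def fuse_features(target_feats, reference_feats):
    """Per-scale fusion by differencing: target minus reference at each scale."""
    return {s: target_feats[s] - reference_feats[s] for s in target_feats}

# Identical regions cancel to zero; a changed region (a possible new
# structure) survives as a non-zero response in the fused map.
target_feats = {1: np.array([[1.0, 2.0], [3.0, 4.0]])}
reference_feats = {1: np.array([[1.0, 2.0], [3.0, 0.0]])}
fused = fuse_features(target_feats, reference_feats)
print(fused[1])
```

Because unchanged areas cancel out, the classifier downstream only has to react to what is new relative to the reference.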
- extracting the target basic features of the target image and the reference basic features of the reference image associated with the target image respectively includes:
- the target basic features of the target image and the reference basic features of the reference image associated with the target image are extracted respectively.
- An optional implementation in the above application refines the basic feature extraction process into basic feature extraction based on a deep residual network, which improves the feature extraction method and at the same time improves the accuracy of the feature extraction result.
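The defining element of a deep residual network is the identity-shortcut block. The sketch below shows that building block in plain NumPy; the real backbone would be a full ResNet-style convolutional network, so the dense layers and weight shapes here are illustrative only.

```python
import numpy as np

def residual_block(x, w1, w2):
    """One identity-shortcut residual block: y = ReLU(x + W2 @ ReLU(W1 @ x)).
    The shortcut path (the bare `x` term) keeps very deep feature extractors
    trainable, which is why residual backbones suit basic feature extraction."""
    h = np.maximum(w1 @ x, 0.0)          # inner transform + ReLU
    return np.maximum(x + w2 @ h, 0.0)   # add the shortcut, then ReLU

x = np.array([1.0, 2.0])
zero_w = np.zeros((2, 2))
# With zero weights the block reduces to the identity on non-negative input,
# showing that information always has a direct path through the block.
print(residual_block(x, zero_w, zero_w))
```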
- Before extracting the reference building feature of the reference image, the method further includes:
- the coordinate transformation includes at least one of contraction transformation, stretching transformation, rotation transformation and translation transformation.
- An optional implementation of the above application performs at least one of shrinking, stretching, rotation, and translation on the reference image according to the target image before feature extraction, so that the coordinates of the transformed reference image match those of the target image, which safeguards the accuracy of the illegal construction recognition result.
- performing coordinate transformation on the reference image according to the target image includes:
- a transformation matrix is determined, and coordinate transformation is performed on the reference image according to the transformation matrix.
- An optional implementation of the above application refines the coordinate transformation of the reference image into extracting key points and descriptors from the target image and the reference image, matching the key points according to their descriptors, determining the transformation matrix from the key point matching result, and then transforming the coordinates of the reference image according to that matrix. This improves the processing mechanism for the coordinate transformation of the reference image and helps guarantee the accuracy of the illegal construction identification result.
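In practice the key points and descriptors would come from a detector such as SIFT or ORB, and the matrix would often be a homography estimated robustly (e.g. with RANSAC). The sketch below shows only the core least-squares step, fitting an affine transform to hand-made point matches; the point lists stand in for the real matching output.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine matrix mapping src points onto dst points."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    a = np.hstack([src, np.ones((len(src), 1))])  # rows [x, y, 1]
    m, *_ = np.linalg.lstsq(a, dst, rcond=None)   # solve a @ m ~= dst
    return m.T

# Matched key points from a reference/target pair that differ by a pure
# translation of (+2, +3); the recovered matrix encodes exactly that shift.
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(2, 3), (3, 3), (2, 4), (3, 4)]
m = estimate_affine(src, dst)
print(np.round(m, 6))
```

Applying the recovered matrix to the reference image aligns it with the target image before feature extraction.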
- determining the result of the illegal construction recognition of the target image includes:
- the position coordinates of the illegally constructed area are determined.
- An optional implementation of the above application refines the determination of the illegal construction result into a binary classification of the construction area in the target image and, when the construction area includes an illegal construction area, detection of the location coordinates of that area, which enriches the content of the illegal construction identification results.
- an illegal building identification device including:
- An image acquisition module configured to acquire a target image and a reference image associated with the target image
- the building feature extraction module is configured to extract the target building features of the target image and the reference building features of the reference image respectively;
- the recognition result determining module is configured to determine the illegal building recognition result of the target image according to the target building feature and the reference building feature.
- an electronic device including:
- At least one processor;
- a memory communicatively connected with the at least one processor; wherein,
- the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the illegal building identification method provided by the embodiment of the first aspect.
- an embodiment of the present application also provides a non-transitory computer-readable storage medium storing computer instructions configured to make the computer execute the illegal building identification method provided by the embodiment of the first aspect.
- Fig. 1 is a flowchart of a method for identifying illegal buildings in the first embodiment of the present application
- Figure 2 is a flowchart of a method for identifying illegal buildings in the second embodiment of the present application
- Fig. 3 is a flowchart of a method for identifying illegal buildings in the third embodiment of the present application.
- 4A is a flowchart of a method for identifying illegal buildings in the fourth embodiment of the present application.
- 4B is a structural block diagram of an image matching process in the fourth embodiment of the present application.
- 4C is a schematic diagram of a reference image in the fourth embodiment of the present application.
- 4D is a schematic diagram of a target image in the fourth embodiment of the present application.
- 4E is a schematic diagram of a transformed reference image in the fourth embodiment of the present application.
- 4F is a structural block diagram of an image detection process in the fourth embodiment of the present application.
- 4G is a schematic diagram of a target image labeling result in the fourth embodiment of the present application.
- Figure 5 is a structural diagram of an illegal building identification device in the fifth embodiment of the present application.
- Fig. 6 is a block diagram of an electronic device used to implement the illegal building identification method of an embodiment of the present application.
- Fig. 1 is a flowchart of a method for identifying illegal buildings in the first embodiment of the present application.
- the embodiment of the present application is suitable for the case of identifying illegal buildings in an image.
- the method is executed by an illegal building identification device, which is implemented in software, hardware, or a combination of both, and is specifically configured in an electronic device.
- An illegal building identification method as shown in Figure 1 includes:
- the target image is the image that needs to be identified for illegal buildings;
- the reference image is the default image without illegal buildings.
- the target image and the reference image can be understood as images acquired at different times for the same area or approximately the same area, wherein the acquisition time of the reference image is earlier than the target image.
- the reference image can be an image collected at a set collection interval from the current collection time, or an image collected when the illegal building is identified for the first time.
- the reference image can also be replaced in real time or at a fixed time, which is not limited in this application.
- the distance difference between the acquisition position of the acquired target image and the reference image is less than the set distance threshold.
- the angle difference between the acquisition angles of the acquired target image and the reference image is smaller than the set angle threshold to ensure that the acquisition angles of the target image and the reference image are the same or similar.
- the set distance threshold and the set angle threshold can be determined by a technician according to needs or experience values.
- the acquisition angle may be the image angle or the pitch angle of the acquisition device.
- UAVs are usually used to take images according to the set inspection route and the set collection frequency.
- the collected images can be numbered according to the order of image collection.
- the acquisition frequency can be determined by the technician according to the acquisition requirements or the lens parameters of the drone.
- the target image and the reference image associated with the target image can be pre-stored locally in the electronic device, in another storage device associated with the electronic device, or in the cloud; when illegal building identification needs to be performed, the target image and the reference image are acquired from one of those locations.
- the target image and the reference image may be numbered according to the image collection position respectively, so that the target image and the reference image number at the same collection position are the same.
- the target image and the reference image with the same number are acquired.
- Alternatively, the target image may be transmitted to the electronic device in real time as the acquisition device (such as a drone) collects it, while the reference image is stored locally in the electronic device, in another storage device associated with the electronic device, or in the cloud.
- the electronic device receives the target image collected by the acquisition device in real time, it acquires the reference image associated with the target image from the local of the electronic device, other storage devices associated with the electronic device, or the cloud.
- S102 Extract the target building features of the target image and the reference building features of the reference image respectively.
- deep learning is used to extract the architectural features in the target image and the reference image, so that the extracted features reflect the semantic information in the image, making them richer and more comprehensive.
- the areas to be identified are usually divided according to administrative regions, such as townships and towns, so the number of image samples collected in each area to be identified is limited. Since the differences between images are significant, the target image and the reference image are bound to train the feature extraction model based on the siamese idea. Correspondingly, when using the feature extraction model for feature extraction, the same feature extraction model and model parameters are used to extract the architectural features of the target image and its associated reference image, ensuring consistency of the extracted architectural features.
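The siamese constraint amounts to running both images through one extractor with one parameter set. A minimal sketch with random stand-in weights (the weight shapes and ReLU nonlinearity are assumptions of this example, not the patent's network):

```python
import numpy as np

rng = np.random.default_rng(0)
shared_weights = rng.standard_normal((8, 16))  # ONE parameter set for both branches

def extract_features(image_vec):
    """Both siamese branches call this same function with the same weights,
    so target and reference features live in one comparable feature space."""
    return np.maximum(shared_weights @ image_vec, 0.0)

target_feat = extract_features(rng.standard_normal(16))
reference_feat = extract_features(rng.standard_normal(16))
print(target_feat.shape, reference_feat.shape)
```

Sharing weights is what makes the later per-scale differencing meaningful: identical image content maps to identical features and cancels out.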
- determining the result of the illegal construction recognition of the target image may be: fusing the features of the target building and the reference building feature; and determining the identification of the illegal construction of the target image based on the fused features result.
- the difference between the target building feature and the reference building feature may be taken, and the difference result used as the feature fusion result. It is understandable that performing feature fusion by differencing highlights the fused architectural features corresponding to the dissimilar areas of the target image and the reference image; correspondingly, identifying illegal buildings based on the fused architectural features significantly improves the accuracy of the recognition results.
- determining the result of the illegal construction recognition of the target image may be that the construction area in the target image is classified into the existence of illegal construction and the absence of illegal construction.
- the classification model can obtain the classification result based on the fused building feature produced by fusing the target building feature and the reference building feature.
- determining the result of the illegal construction recognition of the target image may also be: determining whether the target image includes the illegal construction area; if the target image includes the illegal construction area, determining the illegal construction The location coordinates of the area.
- based on the fused building features produced by fusing the target building feature and the reference building feature, the illegal construction area in the target image is detected, and the location coordinates of the illegal construction area are determined.
- the recognition loss function and the positioning loss function may be introduced in the training process of the detection model, and the network parameters of the detection model can be optimized and adjusted based on these two loss functions.
- the recognition loss function is set to characterize the deviation between the classification result output by the model and the actual classification result
- the positioning loss function is set to characterize the deviation between the position coordinates of the illegally constructed area output by the model and the actual position coordinates of the illegally constructed area.
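The two deviations described above can be sketched as a joint training objective. The specific loss forms below (binary cross-entropy for recognition, smooth-L1 for positioning) and the weighting parameter are conventional detection choices assumed for illustration; the patent does not fix them.

```python
import numpy as np

def recognition_loss(p_pred, y_true):
    """Binary cross-entropy: deviation of the predicted class from the actual class."""
    p = np.clip(p_pred, 1e-7, 1 - 1e-7)  # avoid log(0)
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def positioning_loss(box_pred, box_true):
    """Smooth-L1 deviation between predicted and actual box coordinates."""
    d = np.abs(np.asarray(box_pred, float) - np.asarray(box_true, float))
    return float(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5).sum())

def detection_loss(p_pred, y_true, box_pred, box_true, loc_weight=1.0):
    """Joint objective: recognition term plus weighted positioning term."""
    return recognition_loss(p_pred, y_true) + loc_weight * positioning_loss(box_pred, box_true)

print(detection_loss(0.9, 1, [10, 20, 5, 3], [10, 20, 5, 3]))
```

Minimizing the joint loss pushes the model to classify correctly and to place the box on the actual illegal construction area at the same time.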
- the illegally constructed area can be represented by a circular area or a rectangular area.
- the position coordinates can include the center of the circle and the radius of the circle.
- the position coordinates include one of the apex coordinates, the rectangle length value, and the rectangle width value of the rectangular area; or, the position coordinates include at least two apex coordinates, such as two apex coordinates corresponding to a diagonal line.
- the location coordinates include the coordinates of the upper left vertex of the rectangular area, the rectangular length value, and the rectangular width value.
- taking the upper left vertex as the starting point, one side of the rectangle is determined along the direction parallel to the length direction of the target image, with the rectangle length value as its distance; the other side is determined along the direction parallel to the width direction of the target image, with the rectangle width value as its distance. These two sides determine the illegal construction area.
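The two rectangle parameterizations above can be sketched as small helper functions; this is an illustrative Python sketch, and the function names are not part of this application:

```python
def rect_from_top_left(x, y, length, width):
    """Recover the four corners of an axis-aligned rectangle from its
    top-left vertex plus the length (along the image's length direction)
    and the width (along the image's width direction)."""
    return [(x, y), (x + length, y), (x + length, y + width), (x, y + width)]


def rect_from_diagonal(p1, p2):
    """Recover the same rectangle from the two vertices of a diagonal."""
    (x1, y1), (x2, y2) = p1, p2
    x_min, x_max = min(x1, x2), max(x1, x2)
    y_min, y_max = min(y1, y2), max(y1, y2)
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
```

Either parameterization carries the same information, which is why the description treats them as interchangeable ways of encoding the position coordinates.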
- the illegal construction area can also be marked in the target image or the reference image according to its position coordinates. To avoid discrepancies between the coordinates of the two images caused by their different acquisition angles, the illegal area is generally marked in the target image.
- This application obtains the target image and the reference image associated with the target image; extracts the target building feature of the target image and the reference building feature of the reference image respectively; and determines the illegal construction recognition result of the target image according to the target building feature and the reference building feature.
- the above technical solution obtains the reference image associated with the target image, binds the target image to the reference image, and performs feature extraction on the bound pair, so that illegal buildings in the target image are identified automatically with reference to the building features of the reference image, which reduces the amount of data processing during identification; at the same time, based on the siamese (twin-network) idea, building features are extracted from both the target image and the reference image, and illegal building identification is then performed on the extracted features, which improves the accuracy of the recognition result.
- Fig. 2 is a flowchart of a method for identifying illegal buildings in the second embodiment of the present application.
- the embodiment of the present application is optimized and improved on the basis of the technical solutions of each of the foregoing embodiments.
- the operation "respectively extract the target building features of the target image and the reference building features of the reference image associated with the target image" is refined into "respectively extract the target basic features of the target image and the reference basic features of the reference image associated with the target image; under at least two set scales, perform feature extraction on the target basic features and the reference basic features respectively to obtain the target building features and the reference building features at the at least two scales", in order to improve the way building features are extracted.
- An illegal building identification method as shown in Figure 2 includes:
- S202 Extract the target basic features of the target image and the reference basic features of the reference image associated with the target image respectively.
- the target basic features of the target image and the reference basic features of the reference image associated with the target image are extracted respectively.
- the network depth of the deep residual network can be determined based on empirical values or extensive experiments; for example, the depth may be set to 50 (i.e., ResNet-50).
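The shared-weight (siamese) idea behind the basic feature extraction can be illustrated with a minimal NumPy sketch; the naive single-layer convolution below is only a stand-in for the deep residual network, and all names are illustrative:

```python
import numpy as np


def conv2d_valid(img, kernel):
    """Naive single-channel 'valid' convolution, standing in for one backbone layer."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out


def extract_basic_features(target, reference, kernel):
    """Siamese extraction: the SAME kernel (shared weights) processes both images,
    so identical inputs are guaranteed to yield identical basic features."""
    return conv2d_valid(target, kernel), conv2d_valid(reference, kernel)
```

The point of sharing weights is that any difference between the two feature maps then reflects a difference between the images, not a difference between two extractors.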
- S203 Perform feature extraction on the target basic features and the reference basic features respectively under at least two set scales, to obtain the target building features and the reference building features at the at least two scales.
- the number of scales can be determined by the technicians according to needs or experience values, and can also be determined according to the model training results during the model training process.
- the scale can be set to 5.
- the feature pyramid model may be used to perform feature extraction on the target basic feature and the reference basic feature in at least two set scales, respectively, to obtain the target building feature and the reference building feature in the at least two scales.
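A minimal sketch of the multi-scale idea, assuming simple 2x2 average pooling in place of a full feature pyramid network:

```python
import numpy as np


def feature_pyramid(feature_map, num_scales=5):
    """Build a simple pyramid by repeated 2x2 average pooling.
    Each level halves the spatial resolution of the previous one."""
    levels = [feature_map]
    for _ in range(num_scales - 1):
        f = levels[-1]
        h, w = f.shape[0] // 2 * 2, f.shape[1] // 2 * 2  # trim odd edges
        f = f[:h, :w]
        pooled = f.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        levels.append(pooled)
    return levels
```

The coarse levels respond to whole buildings while the fine levels respond to small additions (e.g. a rooftop shed), which is why the description stresses extracting building features at every scale.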
- S204 Determine the result of the illegal building recognition of the target image according to the target building feature and the reference building feature.
- since both the target building features and the reference building features contain feature maps at different scales, when identifying illegal buildings in the target image according to these features, it is necessary to perform feature fusion on the target building features and the reference building features at each scale.
- determining the illegal building recognition result of the target image according to the target building feature and the reference building feature may be: performing feature fusion on the target building feature and the reference building feature at each scale, and then determining the illegal construction recognition result of the target image according to the feature fusion results at the at least two scales.
- the feature fusion of the target building feature and the reference building feature at each scale may be: subtracting the reference building feature from the target building feature at each scale, and taking the difference as the feature fusion result at that scale.
- in this way, the difference between the target building feature and the reference building feature at each scale can be highlighted. Furthermore, when determining the illegal building recognition result of the target image based on the feature fusion results at the at least two scales, the difference between the target image and the reference image at each scale can be referred to, making the reference information richer and more comprehensive, thereby improving the accuracy of the illegal building recognition result.
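The per-scale difference fusion described above can be sketched as follows; the helper name is illustrative:

```python
import numpy as np


def fuse_by_difference(target_feats, reference_feats):
    """Per-scale feature fusion by subtraction: regions where the target and
    reference agree go to ~0, while changed regions (suspected illegal
    construction) keep a large response."""
    assert len(target_feats) == len(reference_feats)
    return [t - r for t, r in zip(target_feats, reference_feats)]
```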
- the embodiment of this application refines the process of extracting building features of the target image and the reference image into extracting the basic features of the two images respectively and then performing multi-scale feature extraction on those basic features to obtain the building features, and determines the illegal construction recognition result of the target image based on the multi-scale target building features and reference building features. This improves the accuracy and richness of the extracted building features, strengthens their characterization ability, and further improves the accuracy of the illegal construction recognition result.
- Fig. 3 is a flowchart of a method for identifying illegal buildings in the third embodiment of the present application.
- the embodiment of the present application is optimized and improved on the basis of the technical solutions of each of the foregoing embodiments.
- An illegal building identification method as shown in Figure 3 includes:
- S302 Perform coordinate transformation on the reference image according to the target image.
- the coordinate transformation includes at least one of contraction transformation, stretching transformation, rotation transformation and translation transformation.
- due to differences in acquisition position or acquisition angle, the coordinates of the target image and the reference image may not match.
- if the coordinates of the target image and the reference image do not match, the accuracy of the illegal construction recognition result of the target image will be affected to a certain extent.
- a machine learning model can be used to respectively extract a target orientation feature of the target image and a reference orientation feature of the reference image; the deformation of the reference image relative to the target image is determined according to the matching of the two orientation features; and the reference image is then adjusted according to that deformation so that the adjusted reference image matches the coordinates of the target image.
- specifically: respectively extract the target key points and target descriptors of the target image, and the reference key points and reference descriptors of the reference image; perform a matching operation on the target key points and the reference key points according to the target descriptors and the reference descriptors; and determine a transformation matrix according to the matching result, transforming the reference image according to the transformation matrix.
- the scale-invariant feature transform (SIFT) algorithm is used to extract key points and descriptors of the target image to obtain the target key points and target descriptors;
- the SIFT algorithm is likewise used to extract key points and descriptors from the reference image, obtaining the reference key points and reference descriptors.
- a K-dimensional tree (KD-Tree) is used to match the target key points and the reference key points according to the matching situation of the target descriptors and the reference descriptors, obtaining an initial matching relationship; the Random Sample Consensus (RANSAC) algorithm removes invalid matches from the initial matching relationship to obtain the target matching relationship; the transformation matrix between the reference image and the target image is determined according to the target matching relationship; finally, coordinate transformation is performed on the reference image according to the transformation matrix so that the transformed reference image matches the coordinates of the target image.
- S304 Determine the result of the illegal building recognition of the target image according to the target building feature and the reference building feature.
- the embodiment of the application performs coordinate transformation on the reference image according to the target image before extracting the reference building features, so that the coordinates of the reference image match those of the target image, thereby guaranteeing the accuracy of the illegal building recognition result.
- the target image and the reference image can also be preprocessed.
- for example, the target image, the reference image, or both are scaled so that the sizes of the target image and the reference image remain consistent.
- for example, grayscale transformation is performed on the target image and the reference image to eliminate hue and saturation information while retaining brightness information, thereby converting the RGB or color image into a grayscale image.
- Histogram Equalization is performed on the target image and the reference image to enhance image contrast and remove the influence of factors such as illumination.
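The grayscale transformation and histogram equalization steps can be sketched in NumPy as follows (the BT.601 luminance weights are a common choice and an assumption here, since the application does not fix the conversion formula):

```python
import numpy as np


def rgb_to_gray(img):
    """Drop hue/saturation, keep luminance (ITU-R BT.601 weights)."""
    return img[..., 0] * 0.299 + img[..., 1] * 0.587 + img[..., 2] * 0.114


def equalize_hist(gray_u8):
    """Histogram equalization: map gray levels through the normalized CDF,
    spreading contrast and suppressing global illumination differences."""
    hist = np.bincount(gray_u8.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[gray_u8]
```

Equalizing both images before matching reduces the chance that a mere lighting change between the two flights is mistaken for a structural change.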
- Fig. 4A is a flowchart of a method for identifying illegal buildings in the fourth embodiment of the present application.
- the embodiment of the present application provides a preferred implementation on the basis of the technical solutions of each of the foregoing embodiments.
- An illegal building identification method as shown in Figure 4A includes:
- the image acquisition process includes:
- the target image is an image containing buildings, collected at the current moment by controlling a drone to fly a set route and shoot at a set frequency.
- the reference image is an image containing buildings, collected at a historical moment by controlling the drone to fly the same set route and shoot at the set frequency.
- the image collection parameters include acquisition frequency and acquisition angle.
- the acquisition route, acquisition frequency, and other acquisition parameters of the UAV can be determined by a technician according to needs or experience values.
- the set distance threshold and the set angle threshold are determined by technicians according to needs or empirical values, or through a large number of experiments.
- the resolution of the image collected by the drone is 4000*6000.
- the height is 4000 and the width is 6000.
- the image matching process includes:
- S421 Perform image preprocessing on the reference image and the target image.
- the reference image is Img1 and the target image is Img2.
- the image preprocessing operation includes: scaling transformation (resize), which is set to perform scaling processing on the reference image and the target image, so that the processed target image and the reference image have the same size.
- the size is unified as 1000*1500.
- the image preprocessing operation also includes: grayscale transformation (rgb2gray), which is set to transform the scaled reference image and target image from a color image into a grayscale image.
- the image preprocessing operation also includes histogram equalization (EqualizeHist) to eliminate the influence of the target image and the reference image on the detection result due to the difference in the acquisition environment such as illumination.
- S422 Perform an image matching operation on the reference image and the target image to obtain a transformation matrix when the reference image is transformed to the target image.
- the image matching operation includes key point and descriptor extraction: the SIFT algorithm performs feature extraction on the reference image to obtain its reference key points and reference descriptors, and the same algorithm is applied to the target image to obtain its target key points and target descriptors.
- the image matching operation also includes key point matching, which is set to match the target key point and the reference key point according to the consistency of the reference descriptor and the target descriptor through the KD Tree algorithm to obtain the key point matching result.
- the image matching operation also includes outlier elimination: the RANSAC algorithm removes invalid matches from the key point matching result to obtain the final exact matching relationship, and the transformation matrix corresponding to that exact matching relationship is determined.
- the coordinate transformation includes at least one of contraction, stretching, rotation and translation transformation.
- the coordinates of the reference image after the coordinate transformation are consistent with the coordinates of the target image.
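Applying the transformation matrix to the reference image can be sketched with a naive inverse-mapping warp; in practice a library routine such as OpenCV's `warpPerspective` would be used, and this single-channel NumPy version is only illustrative:

```python
import numpy as np


def warp_perspective(img, matrix, out_shape):
    """Nearest-neighbour perspective warp of a single-channel image.
    For each output pixel, the inverse matrix gives the source location;
    pixels mapping outside the source are left at zero."""
    inv = np.linalg.inv(matrix)
    h, w = out_shape
    out = np.zeros((h, w), dtype=img.dtype)
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)]).astype(np.float64)
    src = inv @ pts  # homogeneous source coordinates for every output pixel
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out[ys.ravel()[valid], xs.ravel()[valid]] = img[sy[valid], sx[valid]]
    return out
```

Inverse mapping (iterating over output pixels rather than input pixels) is the standard choice because it leaves no holes in the warped result.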
- FIG. 4C and FIG. 4D are the reference image Img1 and the target image Img2, respectively;
- FIG. 4E is the transformed reference image Img1_trans.
- the reference image Img1 in FIG. 4C is rotated and transformed to obtain FIG. 4E. Comparing Fig. 4E and Fig. 4D, it can be seen that the coordinates of the two are consistent.
- the image detection process includes:
- a deep residual network is used to extract the basic features of the target image and the transformed reference image respectively.
- the network parameters of the deep residual network used for basic feature extraction of the target image and the transformed reference image are the same.
- the network depth of the deep residual network can be determined by technical personnel according to needs or empirical values, and it can also be determined repeatedly through a large number of experiments.
- the network depth can be 50.
- the feature pyramid network FPN is used to extract the architectural features of the basic features of the target image and the reference image at different scales to obtain the target architectural feature Fea1 and the reference architectural feature Fea2.
- S433 Difference the architectural features of the target image at each scale and the architectural features of the reference image to obtain a fusion feature.
- the feature subtraction method is used to fuse the target building feature and the reference building feature at each scale to obtain the fusion feature Feature, which highlights the difference between the target image and the reference image at the same scale and yields the suspected illegal construction areas.
- S434 Based on the detection model, according to the fusion features at all scales, determine whether the target image includes an illegally constructed area.
- S435 If the target image includes an illegally constructed area, output the coordinates of the illegally constructed area.
- the illegal construction area includes at least one illegal building.
- illegal construction can be the addition of color steel plates, scaffolding, or roof extensions to existing buildings, or the construction of houses in areas where housing construction is not permitted.
- the detection model can be constructed based on the neural network model.
- during training, the recognition loss function Focal_loss and the positioning loss function SmoothL1_loss can be introduced, based on which the network parameters of the detection model can be optimized and adjusted.
- the recognition loss function is set to characterize the deviation between the classification result output by the model and the actual classification result
- the positioning loss function is set to characterize the deviation between the position coordinates of the illegally constructed area output by the model and the actual position coordinates of the illegally constructed area.
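Illustrative NumPy versions of the two loss functions named above (Focal_loss for classification, SmoothL1_loss for box regression); the hyperparameters gamma and alpha are common defaults assumed here, not values fixed by this application:

```python
import numpy as np


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: the (1 - p_t)^gamma factor down-weights easy,
    well-classified examples so training focuses on hard ones."""
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t))


def smooth_l1_loss(pred, target):
    """Smooth L1: quadratic for |x| < 1, linear beyond -- robust to the
    occasional large coordinate error in box regression."""
    diff = np.abs(pred - target)
    return np.mean(np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5))
```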
- a rectangular frame is used to label the illegal area in the target image.
- Fig. 5 is a structural diagram of an illegal building identification device in the fifth embodiment of the present application.
- the embodiment of the present application is suitable for the case of recognizing illegal buildings in the image.
- the device is implemented by software, or hardware, or software and hardware, and is specifically configured in an electronic device.
- An illegal building identification device 500 as shown in FIG. 5 includes: an image acquisition module 501, a building feature extraction module 502, and a recognition result determination module 503. Specifically:
- the image acquisition module 501 is configured to acquire a target image and a reference image associated with the target image
- the building feature extraction module 502 is configured to extract the target building features of the target image and the reference building features of the reference image respectively;
- the recognition result determining module 503 is configured to determine the illegal building recognition result of the target image according to the target building feature and the reference building feature.
- In this application, the image acquisition module acquires the target image and the reference image associated with the target image; the building feature extraction module extracts the target building features of the target image and the reference building features of the reference image; and the recognition result determination module determines the illegal building recognition result of the target image according to the target building features and the reference building features.
- the above technical solution obtains the reference image associated with the target image, binds the target image to the reference image, and performs feature extraction on the bound pair, so that illegal buildings in the target image are identified with reference to the building features of the reference image.
- the distance difference between the acquisition positions of the target image and the reference image is less than a set distance threshold; or the angle difference between their acquisition angles is less than a set angle threshold; or both the distance difference is less than the set distance threshold and the angle difference is less than the set angle threshold.
- the building feature extraction module 502 includes:
- the basic feature extraction unit is configured to extract the target basic features of the target image and the reference basic features of the reference image associated with the target image respectively;
- the building feature extraction unit is configured to perform feature extraction on the target basic features and the reference basic features respectively under at least two set scales, to obtain the target building features and the reference building features at the at least two scales.
- the recognition result determining module 503 includes:
- the feature fusion unit is set to perform feature fusion between the target building feature and the reference building feature at each scale;
- the recognition result determining unit is configured to determine the illegally constructed recognition result of the target image according to the feature fusion results in at least two scales.
- the feature fusion unit includes:
- the feature fusion subunit is configured to subtract the reference building feature from the target building feature at each scale, and to use the difference as the feature fusion result at that scale.
- the basic feature extraction unit includes:
- the basic feature extraction subunit is configured to extract the target basic feature of the target image and the reference basic feature of the reference image associated with the target image based on a deep residual network.
- the device further includes a coordinate transformation module configured to perform coordinate transformation on the reference image according to the target image, where the coordinate transformation includes at least one of contraction transformation, stretching transformation, rotation transformation, and translation transformation.
- the coordinate transformation module includes:
- a key point extraction unit configured to respectively extract target key points and target descriptors of the target image, and reference key points and reference descriptors of the reference image;
- a key point matching unit configured to perform a matching operation on the target key point and the reference key point according to the target descriptor and the reference descriptor;
- the coordinate transformation unit is configured to determine a transformation matrix according to the matching result, and perform coordinate transformation on the reference image according to the transformation matrix.
- the recognition result determining module 503 includes:
- An illegally constructed area determining unit configured to determine whether the illegally constructed area is included in the target image according to the target building feature and the reference building feature;
- the position coordinate determination unit is configured to determine the position coordinates of the illegal construction area if the target image includes the illegal construction area.
- the above-mentioned illegal building identification device can execute the illegal building identification method provided in any embodiment of the present application, and has the corresponding functional modules and beneficial effects for executing the illegal building identification method.
- the present application also provides an electronic device and a readable storage medium.
- FIG. 6 it is a block diagram of an electronic device that implements the method for identifying illegal buildings in an embodiment of the present application.
- Electronic devices are intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
- An electronic device can also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices.
- the components shown herein, their connections and relationships, and their functions are merely examples, and are not intended to limit the implementation of the application described or required herein.
- the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting the components, including a high-speed interface and a low-speed interface.
- Each component is connected to each other using a different bus, and can be installed on a common motherboard or installed in other ways as needed.
- the processor may process instructions executed in the electronic device, including instructions stored in or on the memory to display graphical information of the GUI on an external input/output device (such as a display device coupled to an interface).
- if desired, multiple processors and/or multiple buses may be used together with multiple memories.
- multiple electronic devices can be connected, and each device provides part of the necessary operations (for example, as a server array, a group of blade servers, or a multi-processor system).
- a processor 601 is taken as an example.
- the memory 602 is a non-transitory computer-readable storage medium provided by this application.
- the memory stores instructions executable by at least one processor, so that the at least one processor executes the illegal building identification method provided in this application.
- the non-transitory computer-readable storage medium of the present application stores computer instructions, and the computer instructions are configured to make the computer execute the illegal building identification method provided in the present application.
- the memory 602 can be configured to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the illegal building identification method in the embodiment of the present application (for example, the image acquisition module 501, the building feature extraction module 502, and the recognition result determination module 503 shown in FIG. 5).
- the processor 601 executes each functional application and data processing of the server by running the non-transitory software programs, instructions, and modules stored in the memory 602, that is, realizes the illegal building identification method in the foregoing method embodiment.
- the memory 602 may include a storage program area and a storage data area.
- the storage program area can store an operating system and an application program required by at least one function; the storage data area can store data created through the use of an electronic device that implements the illegal building identification method, and the like.
- the memory 602 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid-state storage devices.
- the memory 602 may optionally include a memory remotely provided with respect to the processor 601, and these remote memories may be connected to an electronic device that implements a method for identifying illegal buildings through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
- the electronic device that implements the method for identifying illegal buildings may further include: an input device 603 and an output device 604.
- the processor 601, the memory 602, the input device 603, and the output device 604 may be connected by a bus or in other ways. In FIG. 6, the connection by a bus is taken as an example.
- the input device 603 can receive input digital or character information and generate key signal inputs related to the user settings and function control of the electronic device that implements the illegal building identification method; examples include a touch screen, keypad, mouse, trackpad, touchpad, pointing stick, one or more mouse buttons, trackball, joystick, and other input devices.
- the output device 604 may include a display device, an auxiliary lighting device (for example, LED), a tactile feedback device (for example, a vibration motor), and the like.
- the display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some embodiments, the display device may be a touch screen.
- Each implementation of the systems and techniques described herein can be implemented in a digital electronic circuit system, an integrated circuit system, a special ASIC (application specific integrated circuit), computer hardware, firmware, software, or a combination thereof.
- these implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor; the programmable processor, which may be dedicated or general-purpose, can receive data and instructions from a storage system, at least one input device, and at least one output device, and can transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
- the systems and techniques described here can be implemented on a computer that has: a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) configured to display information to the user; and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer.
- Other types of devices can also be configured to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).
- the systems and technologies described herein can be implemented in a computing system that includes back-end components (for example, as a data server), middleware components (for example, an application server), or front-end components (for example, a user computer with a graphical user interface or a web browser through which the user can interact with the implementations described herein), or any combination of such back-end, middleware, and front-end components.
- the components of the system can be connected to each other through any form or medium of digital data communication (for example, a communication network). Examples of communication networks include: local area network (LAN), wide area network (WAN), the Internet, and blockchain networks.
- the computer system can include clients and servers.
- the client and server are generally far away from each other and usually interact through a communication network.
- the relationship between the client and the server is generated by computer programs that run on the corresponding computers and have a client-server relationship with each other.
- This application obtains the target image and the reference image associated with the target image; extracts the target building feature of the target image and the reference building feature of the reference image respectively; and determines the illegal construction recognition result of the target image according to the target building feature and the reference building feature.
- the above technical solution obtains the reference image associated with the target image, binds the target image to the reference image, and performs feature extraction on the bound pair, so that illegal buildings in the target image are identified automatically with reference to the building features of the reference image, which reduces the amount of data processing during identification; at the same time, based on the siamese (twin-network) idea, building features are extracted from both the target image and the reference image, and illegal building identification is then performed on the extracted features, which improves the accuracy of the recognition result.
- The various forms of flows shown above may be used, with steps reordered, added, or deleted.
- The steps described in this application may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in this application can be achieved; no limitation is imposed herein.
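The siamese, multi-scale, difference-based fusion summarized above (and claimed in claims 3–5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claimed backbone is a deep residual network, whereas here a simple block-average pooling stands in for the feature extractor; the per-scale difference fusion follows the claim. All function names and the toy data are hypothetical.

```python
import numpy as np

def extract_multiscale_features(image, scales=(4, 8)):
    """Stand-in for the backbone: block-average the image at each set scale.

    In the described method this would be a deep residual network producing
    feature maps at at least two scales; pooling keeps the sketch
    dependency-free while preserving the multi-scale structure.
    """
    features = []
    for s in scales:
        h, w = image.shape[0] // s * s, image.shape[1] // s * s
        pooled = image[:h, :w].reshape(h // s, s, w // s, s).mean(axis=(1, 3))
        features.append(pooled)
    return features

def fuse_by_difference(target_feats, reference_feats):
    """Per-scale fusion as claimed: the difference of the two feature maps."""
    return [t - r for t, r in zip(target_feats, reference_feats)]

# Toy example: the reference scene is "empty"; the target contains a new block.
reference = np.zeros((32, 32))
target = reference.copy()
target[8:16, 8:16] = 1.0  # hypothetical newly built structure

fused = fuse_by_difference(
    extract_multiscale_features(target),
    extract_multiscale_features(reference),
)
# A strong per-scale difference response flags a candidate
# illegal-construction region.
print([float(f.max()) for f in fused])  # prints [1.0, 1.0]
```

Applying the siamese idea, both images pass through the *same* extractor, so unchanged structures cancel in the difference and only new construction survives into the fused features.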
Abstract
Description
Claims (20)
- An illegal building identification method, comprising: acquiring a target image and a reference image associated with the target image; extracting a target building feature of the target image and a reference building feature of the reference image, respectively; and determining an illegal-construction recognition result of the target image according to the target building feature and the reference building feature.
- The method according to claim 1, wherein a distance difference between an acquisition position of the target image and an acquisition position of the reference image is less than a set distance threshold, or an angle difference between an acquisition angle of the target image and an acquisition angle of the reference image is less than a set angle threshold, or the distance difference between the acquisition positions of the target image and the reference image is less than the set distance threshold and the angle difference between the acquisition angles of the target image and the reference image is less than the set angle threshold.
- The method according to claim 1, wherein extracting the target building feature of the target image and the reference building feature of the reference image associated with the target image, respectively, comprises: extracting a target basic feature of the target image and a reference basic feature of the reference image associated with the target image, respectively; and performing feature extraction on the target basic feature and the reference basic feature at at least two set scales, respectively, to obtain the target building feature and the reference building feature at the at least two scales.
- The method according to claim 3, wherein determining the illegal-construction recognition result of the target image according to the target building feature and the reference building feature comprises: performing feature fusion on the target building feature and the reference building feature at each scale; and determining the illegal-construction recognition result of the target image according to feature fusion results at the at least two scales.
- The method according to claim 4, wherein performing feature fusion on the target building feature and the reference building feature at each scale comprises: taking a difference between the target building feature and the reference building feature at each scale, and using the difference as a feature fusion result at that scale.
- The method according to claim 3, wherein extracting the target basic feature of the target image and the reference basic feature of the reference image associated with the target image, respectively, comprises: extracting, based on a deep residual network, the target basic feature of the target image and the reference basic feature of the reference image associated with the target image, respectively.
- The method according to any one of claims 1 to 6, wherein before the reference building feature of the reference image is extracted, the method further comprises: performing a coordinate transformation on the reference image according to the target image; wherein the coordinate transformation comprises at least one of a contraction transformation, a stretching transformation, a rotation transformation, or a translation transformation.
- The method according to claim 7, wherein performing the coordinate transformation on the reference image according to the target image comprises: extracting target key points and target descriptors of the target image, and reference key points and reference descriptors of the reference image, respectively; performing a matching operation on the target key points and the reference key points according to the target descriptors and the reference descriptors; and determining a transformation matrix according to a matching result, and performing the coordinate transformation on the reference image according to the transformation matrix.
- The method according to claim 1, wherein determining the illegal-construction recognition result of the target image comprises: determining whether the target image includes an illegal-construction region; and in response to the target image including an illegal-construction region, determining position coordinates of the illegal-construction region.
- An illegal building identification apparatus, comprising: an image acquisition module configured to acquire a target image and a reference image associated with the target image; a building feature extraction module configured to extract a target building feature of the target image and a reference building feature of the reference image, respectively; and a recognition result determination module configured to determine an illegal-construction recognition result of the target image according to the target building feature and the reference building feature.
- The apparatus according to claim 10, wherein a distance difference between an acquisition position of the target image and an acquisition position of the reference image is less than a set distance threshold, or an angle difference between an acquisition angle of the target image and an acquisition angle of the reference image is less than a set angle threshold, or the distance difference between the acquisition positions of the target image and the reference image is less than the set distance threshold and the angle difference between the acquisition angles of the target image and the reference image is less than the set angle threshold.
- The apparatus according to claim 10, wherein the building feature extraction module comprises: a basic feature extraction unit configured to extract a target basic feature of the target image and a reference basic feature of the reference image associated with the target image, respectively; and a building feature extraction unit configured to perform feature extraction on the target basic feature and the reference basic feature at at least two set scales, respectively, to obtain the target building feature and the reference building feature at the at least two scales.
- The apparatus according to claim 12, wherein the recognition result determination module comprises: a feature fusion unit configured to perform feature fusion on the target building feature and the reference building feature at each scale; and a recognition result determination unit configured to determine the illegal-construction recognition result of the target image according to feature fusion results at the at least two scales.
- The apparatus according to claim 13, wherein the feature fusion unit comprises: a feature fusion subunit configured to take a difference between the target building feature and the reference building feature at each scale, and use the difference as a feature fusion result at that scale.
- The apparatus according to claim 12, wherein the basic feature extraction unit comprises: a basic feature extraction subunit configured to extract, based on a deep residual network, the target basic feature of the target image and the reference basic feature of the reference image associated with the target image, respectively.
- The apparatus according to any one of claims 10 to 15, further comprising: a coordinate transformation module configured to perform, before the reference building feature of the reference image is extracted, a coordinate transformation on the reference image according to the target image; wherein the coordinate transformation comprises at least one of a contraction transformation, a stretching transformation, a rotation transformation, or a translation transformation.
- The apparatus according to claim 16, wherein the coordinate transformation module comprises: a key point extraction unit configured to extract target key points and target descriptors of the target image, and reference key points and reference descriptors of the reference image, respectively; a key point matching unit configured to perform a matching operation on the target key points and the reference key points according to the target descriptors and the reference descriptors; and a coordinate transformation unit configured to determine a transformation matrix according to a matching result and perform the coordinate transformation on the reference image according to the transformation matrix.
- The apparatus according to claim 10, wherein the recognition result determination module comprises: an illegal-construction region determination unit configured to determine, according to the target building feature and the reference building feature, whether the target image includes an illegal-construction region; and a position coordinate determination unit configured to determine position coordinates of the illegal-construction region in response to the target image including an illegal-construction region.
- An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the illegal building identification method according to any one of claims 1 to 9.
- A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to cause a computer to perform the illegal building identification method according to any one of claims 1 to 9.
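The coordinate-transformation step of claim 8 — matching key points via their descriptors and then determining a transformation matrix — can be sketched under simplifying assumptions. The sketch below assumes the matching operation has already produced corresponding point pairs, and fits an affine matrix by least squares; an affine model covers the claimed contraction, stretching, rotation, and translation transformations. A real pipeline would additionally detect key points, compute descriptors, run the matching, and typically reject bad matches (e.g. with RANSAC). All names and the sample points are illustrative, not from the patent.

```python
import numpy as np

def estimate_affine(target_pts, reference_pts):
    """Least-squares affine transform mapping reference points onto target points.

    target_pts, reference_pts: (N, 2) arrays of matched key-point coordinates
    (N >= 3, non-collinear). Returns a 2x3 matrix M such that
    target ~= M[:, :2] @ [x, y] + M[:, 2] for each reference point (x, y).
    """
    ref = np.asarray(reference_pts, dtype=float)
    tgt = np.asarray(target_pts, dtype=float)
    A = np.hstack([ref, np.ones((len(ref), 1))])   # (N, 3) homogeneous refs
    M, *_ = np.linalg.lstsq(A, tgt, rcond=None)    # (3, 2) solution
    return M.T                                     # (2, 3) affine matrix

def apply_affine(M, points):
    """Apply the 2x3 affine matrix to an (N, 2) array of coordinates."""
    pts = np.asarray(points, dtype=float)
    return pts @ M[:, :2].T + M[:, 2]

# Hypothetical matched key points: relative to the target image, the
# reference image is scaled by 2 and shifted by (5, -3).
reference_kp = np.array([[0, 0], [10, 0], [0, 10], [10, 10]])
target_kp = reference_kp * 2.0 + np.array([5.0, -3.0])

M = estimate_affine(target_kp, reference_kp)
aligned = apply_affine(M, reference_kp)  # close to target_kp
```

Once `M` is known, warping the reference image with it aligns the two views so that the subsequent per-pixel building-feature comparison is meaningful.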
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/436,560 US20230005257A1 (en) | 2020-03-27 | 2020-11-12 | Illegal building identification method and apparatus, device, and storage medium |
KR1020217028330A KR20210116665A (ko) | 2020-03-27 | 2020-11-12 | 불법 건축물 식별 방법, 장치, 설비 및 저장 매체 |
EP20919395.2A EP3916629A4 (en) | 2020-03-27 | 2020-11-12 | METHOD, EQUIPMENT AND DEVICE FOR IDENTIFICATION OF ILLEGAL BUILDING AND STORAGE MEDIA |
JP2021551984A JP2022529876A (ja) | 2020-03-27 | 2020-11-12 | 違法建築物の識別方法、装置、機器および記憶媒体 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010231088.3A CN111460967B (zh) | 2020-03-27 | 2020-03-27 | 一种违法建筑识别方法、装置、设备及存储介质 |
CN202010231088.3 | 2020-03-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021189870A1 true WO2021189870A1 (zh) | 2021-09-30 |
Family
ID=71680219
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/128257 WO2021189870A1 (zh) | 2020-03-27 | 2020-11-12 | 违法建筑识别方法、装置、设备及存储介质 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230005257A1 (zh) |
EP (1) | EP3916629A4 (zh) |
JP (1) | JP2022529876A (zh) |
KR (1) | KR20210116665A (zh) |
CN (1) | CN111460967B (zh) |
WO (1) | WO2021189870A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115049935A (zh) * | 2022-08-12 | 2022-09-13 | 松立控股集团股份有限公司 | 一种城市违章建筑分割检测方法 |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111460967B (zh) * | 2020-03-27 | 2024-03-22 | 北京百度网讯科技有限公司 | 一种违法建筑识别方法、装置、设备及存储介质 |
CN111950493B (zh) * | 2020-08-20 | 2024-03-08 | 华北电力大学 | 图像识别方法、装置、终端设备和可读存储介质 |
CN112414374A (zh) * | 2020-10-27 | 2021-02-26 | 江苏科博空间信息科技有限公司 | 基于无人机违法用地勘测系统 |
CN112967264A (zh) * | 2021-03-19 | 2021-06-15 | 深圳市商汤科技有限公司 | 缺陷检测方法及装置、电子设备和存储介质 |
CN113920425A (zh) * | 2021-09-03 | 2022-01-11 | 佛山中科云图智能科技有限公司 | 一种基于神经网络模型的目标违建点获取方法和获取系统 |
US11869260B1 (en) * | 2022-10-06 | 2024-01-09 | Kargo Technologies Corporation | Extracting structured data from an image |
CN116070314B (zh) * | 2022-12-16 | 2024-01-09 | 二十一世纪空间技术应用股份有限公司 | 一种自适应形状特征优化的建筑物矢量化简方法和装置 |
CN116385651A (zh) * | 2023-04-10 | 2023-07-04 | 北京百度网讯科技有限公司 | 图像处理方法、神经网络模型的训练方法、装置和设备 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108805864A (zh) * | 2018-05-07 | 2018-11-13 | 广东省电信规划设计院有限公司 | 基于图像数据的违章建筑物的获取方法以及装置 |
CN109753928A (zh) * | 2019-01-03 | 2019-05-14 | 北京百度网讯科技有限公司 | 违章建筑物识别方法和装置 |
CN110675408A (zh) * | 2019-09-19 | 2020-01-10 | 成都数之联科技有限公司 | 基于深度学习的高分辨率影像建筑物提取方法及系统 |
US20200069222A1 (en) * | 2018-08-31 | 2020-03-05 | Yun yun AI Baby camera Co., Ltd. | Image detection method and image detection device for determining position of user |
CN111460967A (zh) * | 2020-03-27 | 2020-07-28 | 北京百度网讯科技有限公司 | 一种违法建筑识别方法、装置、设备及存储介质 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5962497B2 (ja) * | 2012-12-25 | 2016-08-03 | 富士通株式会社 | 画像処理方法、画像処理装置および画像処理プログラム |
CN104331682B (zh) * | 2014-10-11 | 2018-11-30 | 东南大学 | 一种基于傅里叶描述子的建筑物自动识别方法 |
CN107092871B (zh) * | 2017-04-06 | 2018-01-16 | 重庆市地理信息中心 | 基于多尺度多特征融合的遥感影像建筑物检测方法 |
CN109145812A (zh) * | 2018-08-20 | 2019-01-04 | 贵州宜行智通科技有限公司 | 违建监测方法及装置 |
CN110032983B (zh) * | 2019-04-22 | 2023-02-17 | 扬州哈工科创机器人研究院有限公司 | 一种基于orb特征提取和flann快速匹配的轨迹识别方法 |
2020
- 2020-03-27 CN CN202010231088.3A patent/CN111460967B/zh active Active
- 2020-11-12 JP JP2021551984A patent/JP2022529876A/ja active Pending
- 2020-11-12 EP EP20919395.2A patent/EP3916629A4/en not_active Withdrawn
- 2020-11-12 WO PCT/CN2020/128257 patent/WO2021189870A1/zh unknown
- 2020-11-12 US US17/436,560 patent/US20230005257A1/en active Pending
- 2020-11-12 KR KR1020217028330A patent/KR20210116665A/ko not_active Application Discontinuation
Non-Patent Citations (1)
Title |
---|
See also references of EP3916629A4 |
Also Published As
Publication number | Publication date |
---|---|
US20230005257A1 (en) | 2023-01-05 |
JP2022529876A (ja) | 2022-06-27 |
KR20210116665A (ko) | 2021-09-27 |
EP3916629A1 (en) | 2021-12-01 |
EP3916629A4 (en) | 2022-05-11 |
CN111460967B (zh) | 2024-03-22 |
CN111460967A (zh) | 2020-07-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021189870A1 (zh) | 违法建筑识别方法、装置、设备及存储介质 | |
US11836996B2 (en) | Method and apparatus for recognizing text | |
US11335101B2 (en) | Locating element detection method, device and medium | |
Liu et al. | Curved scene text detection via transverse and longitudinal sequence connection | |
US20210303921A1 (en) | Cross-modality processing method and apparatus, and computer storage medium | |
US20240078646A1 (en) | Image processing method, image processing apparatus, and non-transitory storage medium | |
WO2021238062A1 (zh) | 车辆跟踪方法、装置及电子设备 | |
Tan et al. | Mirror detection with the visual chirality cue | |
Li et al. | A deep learning approach for real-time rebar counting on the construction site based on YOLOv3 detector | |
CN111695628B (zh) | 关键点标注方法、装置、电子设备及存储介质 | |
CN110569846A (zh) | 图像文字识别方法、装置、设备及存储介质 | |
US20210209401A1 (en) | Character recognition method and apparatus, electronic device and computer readable storage medium | |
US20220004928A1 (en) | Method and apparatus for incrementally training model | |
WO2021098300A1 (zh) | 面部解析方法及相关设备 | |
US11928563B2 (en) | Model training, image processing method, device, storage medium, and program product | |
JP2022133378A (ja) | 顔生体検出方法、装置、電子機器、及び記憶媒体 | |
CN112270745B (zh) | 一种图像生成方法、装置、设备以及存储介质 | |
WO2022089170A1 (zh) | 字幕区域识别方法、装置、设备及存储介质 | |
US20150294184A1 (en) | Pattern recognition based on information integration | |
WO2024093641A1 (zh) | 多模态融合的高精地图要素识别方法、装置、设备及介质 | |
CN111862030B (zh) | 一种人脸合成图检测方法、装置、电子设备及存储介质 | |
CN113361303B (zh) | 临时交通标志牌识别方法、装置及设备 | |
Lu et al. | Anchor-free multi-orientation text detection in natural scene images | |
CN116982073A (zh) | 媒体项中基于用户输入的干扰移除 | |
CN115937993A (zh) | 活体检测模型训练方法、活体检测方法、装置和电子设备 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2021551984 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20217028330 Country of ref document: KR Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 2020919395 Country of ref document: EP Effective date: 20210827 |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20919395 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |