WO2023102723A1 - Image Processing Method and System - Google Patents

Image Processing Method and System

Info

Publication number
WO2023102723A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
image
pixel
enhanced
enhanced image
Prior art date
Application number
PCT/CN2021/136052
Other languages
English (en)
French (fr)
Inventor
王智玉
李璐
魏晋
Original Assignee
宁德时代新能源科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 宁德时代新能源科技股份有限公司
Priority to CN202180078471.2A (publication CN116802683A)
Priority to EP21960095.4A (publication EP4220552A4)
Priority to PCT/CN2021/136052 (publication WO2023102723A1)
Priority to US18/295,513 (publication US11967125B2)
Publication of WO2023102723A1

Classifications

    • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/446 - Local feature extraction by matching or filtering, using Haar-like filters, e.g. using integral image techniques
    • G06V 10/454 - Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06T 5/60 - Image enhancement or restoration using machine learning, e.g. neural networks
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/11 - Region-based segmentation
    • G06T 2207/20012 - Locally adaptive image processing
    • G06T 2207/20072 - Graph-based image processing
    • G06T 2207/20084 - Artificial neural networks [ANN]

Definitions

  • The present application relates to computer technology, and in particular to image processing technology.
  • Computer-based image processing is widely used in many fields. Image processing can be used to improve the visual quality of images, to extract the features of specific objects in images, and to store and transmit images. In order to extract the features of a specific object in an image, it is desirable to identify and locate that object.
  • In view of the above, the present application provides an image processing method and system capable of improving the accuracy of locating and segmenting a specific target in an image.
  • In a first aspect, the present application provides an image processing method, including: using a segmentation algorithm to determine a target object enhanced image of an input image, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and applying an integral map algorithm to the target object enhanced image to determine a target object localization image.
  • In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
  • In some embodiments, using a segmentation algorithm to determine the target object enhanced image of the input image further comprises: performing feature extraction on the input image to determine a pixel feature map; performing feature extraction on the input image to determine a context feature map; determining context association information for each pixel based on the pixel feature map and the context feature map; and determining the target object enhanced image according to the context association information and the input image, wherein the pixels of the target object enhanced image include weight information, the weight information being related to whether the pixel belongs to the target object.
  • The segmentation algorithm in this application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the correlation between the target pixel and its context. Incorporating this contextual information into the classification algorithm further improves the classification accuracy for target pixels, thereby providing a more accurate segmentation of the target object. The weight applied to each pixel that is finally classified as belonging to the target object is changed to generate the target object enhanced image, so that the target object is displayed with enhancement; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of target object localization. The weights may be user-configurable: changing the weight settings affects the enhancement effect of the target object in the target object enhanced image, so that the desired enhancement effect can be achieved through user settings.
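  • As a minimal illustration of this weighting idea (a sketch only, not the patent's exact implementation; the function name, the 0-255 grayscale range, and the default gain are assumptions), the pixels classified as belonging to the target object could be brightened by a user-configurable weight:

        import numpy as np

        def enhance_target(image: np.ndarray, target_mask: np.ndarray,
                           enhance_weight: float = 1.5) -> np.ndarray:
            """Brighten pixels classified as target object by a configurable weight."""
            enhanced = image.astype(np.float32)
            enhanced[target_mask] *= enhance_weight  # enhance only target pixels
            return np.clip(enhanced, 0, 255).astype(np.uint8)

    Raising enhance_weight strengthens the displayed enhancement of the target object, consistent with the user-configurable weights described above.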
  • In some embodiments, applying an integral map algorithm to the target object enhanced image to determine a target object localization image further comprises: determining an integral map from the target object enhanced image; and using the integral map to determine the target object localization image. Applying the integral map algorithm to an enhanced image in which the target object has already been enhanced can further improve the localization accuracy of the target object.
  • In some embodiments, determining an integral map from the target object enhanced image further comprises applying a scaling factor to the target object enhanced image.
  • The amount of data to be processed can be adjusted by applying a scaling factor, so that the computation can be accelerated and/or the accuracy of the integral map can be improved according to actual needs.
  • In some embodiments, the method further includes: using a loss function to calculate a loss rate between the target object enhanced image and the input image; and feeding the calculated loss rate back to the segmentation algorithm.
  • The loss rate between the target object enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target object enhanced image output by the segmentation algorithm and the original input image.
  • The loss rate is fed back to the segmentation algorithm to perform supervised training of the segmentation algorithm; while achieving convergence of the training fit, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
  • In some embodiments, the method further includes updating the segmentation algorithm based on the loss rate, the labeled production-line images, or a combination of the two.
  • The segmentation algorithm in this application is trained using the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in segmenting target objects in a supervised learning manner.
  • Since the training data come from real production lines, they cover actual needs, so the method can be deployed and promoted on production lines.
  • In some embodiments, the segmentation algorithm is implemented by the deep convolutional neural network HRNet18.
  • HRNet18 maintains high-resolution features throughout the segmentation process, which helps segment target objects accurately.
  • The different branches of the HRNet18 network produce features of different resolutions, and these branches exchange information with one another, so that high-resolution features containing multi-channel information can be obtained.
  • Given a limited amount of training data, choosing the HRNet18 model avoids the risk of overfitting, and its compact structure speeds up the entire segmentation algorithm.
  • In a second aspect, the present application provides an image processing system, including: a segmentation module configured to determine a target object enhanced image of an input image using a segmentation algorithm, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and a localization image generation module configured to apply an integral map algorithm to the target object enhanced image to determine a target object localization image.
  • In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
  • In some embodiments, the segmentation module further comprises: a feature extraction component configured to perform feature extraction on the input image to determine a pixel feature map and to perform feature extraction on the input image to determine a context feature map; a context component configured to determine context association information for each pixel based on the pixel feature map and the context feature map; and an enhanced image generation component configured to determine the target object enhanced image according to the context association information and the input image, wherein the pixels of the target object enhanced image include weight information, the weight information being related to whether the pixel belongs to the target object.
  • The segmentation algorithm in this application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the correlation between the target pixel and its context. Incorporating this contextual information into the classification algorithm further improves the classification accuracy for target pixels, thereby providing a more accurate segmentation of the target object. The weight applied to each pixel that is finally classified as belonging to the target object is changed to generate the target object enhanced image, so that the target object is displayed with enhancement; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of target object localization. The weights may be user-configurable: changing the weight settings affects the enhancement effect of the target object in the target object enhanced image, so that the desired enhancement effect can be achieved through user settings.
  • In some embodiments, the localization image generation module is further configured to: determine an integral map from the target object enhanced image; and use the integral map to determine the target object localization image. Applying the integral map algorithm to an enhanced image in which the target object has already been enhanced can further improve the localization accuracy of the target object.
  • In some embodiments, the localization image generation module is further configured to apply a scaling factor to the target object enhanced image.
  • The amount of data to be processed can be adjusted by applying a scaling factor, so that the computation can be accelerated and/or the accuracy of the integral map can be improved according to actual needs.
  • In some embodiments, the system further includes a loss rate module configured to: use a loss function to calculate a loss rate between the target object enhanced image and the input image; and feed the calculated loss rate back to the segmentation module.
  • The loss rate between the target object enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target object enhanced image output by the segmentation algorithm and the original input image.
  • The loss rate is fed back to the segmentation algorithm to perform supervised training of the segmentation algorithm; while achieving convergence of the training fit, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
  • In some embodiments, the segmentation module is further configured to be updated based on the loss rate, the labeled production-line images, or a combination of the two.
  • The segmentation algorithm in this application is trained using the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in segmenting target objects in a supervised learning manner.
  • Since the training data come from real production lines, they cover actual needs, so the method can be deployed and promoted on production lines.
  • In a third aspect, the present application provides an image processing system, including: a memory storing computer-executable instructions; and a processor coupled to the memory, wherein the computer-executable instructions, when executed by the processor, cause the system to perform the following operations: determining a target object enhanced image of an input image using a segmentation algorithm, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and applying an integral map algorithm to the target object enhanced image to determine a target object localization image.
  • In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
  • FIG. 1 is a flowchart of an image processing method according to some embodiments of the present application.
  • FIG. 2 is a flowchart of a method for using a segmentation algorithm to determine a target object enhanced image of an input image according to some embodiments of the present application.
  • FIG. 3 is a diagram illustrating the effect of the steps of segmenting a target object according to some embodiments of the present application.
  • FIG. 4 is a diagram illustrating the effect of the steps of locating a target object according to some embodiments of the present application.
  • FIG. 5 is a network model architecture diagram for implementing the segmentation algorithm of the image processing method according to some embodiments of the present application.
  • FIG. 6 is a functional block diagram of an image processing system according to some embodiments of the present application.
  • FIG. 7 is a functional block diagram of a segmentation module according to some embodiments of the present application.
  • FIG. 8 is a structural block diagram of a computer system suitable for implementing an image processing system according to some embodiments of the present application.
  • As used herein, "multiple" refers to two or more (including two); similarly, "multiple groups" refers to two or more groups (including two groups), and "multiple pieces" refers to two or more pieces (including two pieces).
  • Image processing by computer is widely used in various fields.
  • Image processing can be used to improve the visual quality of images, extract features of specific objects in images, store and transmit images, etc.
  • In order to extract the features of a specific object in an image, it is desirable to identify and locate that object.
  • The extraction of a specific object can be used for defect detection of that object. For example, for power lithium batteries, by capturing images of the lithium batteries produced on a production line and locating target objects such as tabs, it is possible to effectively detect whether the tabs have defects such as folding.
  • Some image processing methods include applying a double-Gaussian difference (difference of Gaussians) to the input image, labeling the processed images, constructing a neural network and model for training and learning, and finally performing inference based on the model.
  • In such techniques, the first step is often to feed the image data into the model for feature extraction. Therefore, the quality of the input image data (such as resolution and signal-to-noise ratio) directly affects the accuracy of the trained model.
  • For a small target object such as a lithium battery tab, the double-Gaussian-difference method cannot effectively locate a target object that is extremely small and requires very high resolution, and the interference of the image background (non-target objects) with the target object is large, resulting in low target object localization accuracy and ultimately making it difficult to accurately detect defects of the target object (for example, whether a tab is folded). Therefore, there is a need for an improved technique that can accurately locate target objects that occupy a small proportion of an image and require high resolution.
  • In view of the above problems, the present application provides a technique capable of accurately locating target objects that occupy a small proportion of an image and require high resolution.
  • The scheme of the present application may include segmentation of the target object and localization of the target object.
  • In the segmentation stage, the present application uses a segmentation algorithm to determine a target object enhanced image of an input image, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement.
  • In the localization stage, the present application generates an integral map from the target object enhanced image and uses an integral map algorithm to generate the target object localization image.
  • In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
  • The technical solutions of the embodiments of the present application are applicable to the segmentation and localization of target objects that occupy a small proportion of an image and require high resolution, including but not limited to defect detection of tabs in lithium batteries, recognition and labeling of species observed in the field, and detection and interpretation of human facial micro-expressions. In the case of observing species in the wild, recognition of a species is often based on labeling specific patterns and markings on a part of its face or body, and the infrared cameras used for field observation often cannot provide high-resolution, clear images.
  • The improved segmentation and localization algorithms of the present application, which improve the segmentation and localization of such specific patterns and markings, are therefore helpful for identifying and labeling those species.
  • Similarly, face recognition through image capture is already widely used, and on this basis, interpreting the micro-expressions of recognized faces also has wide applications. A slight upturn of the corners of the mouth, a slight frown, or a brief twitch of a facial muscle often occupies only a small proportion of the whole image and is difficult to identify; improving the recognition and localization of micro-expressions through the improved segmentation and localization algorithms of the present application can improve the accuracy with which micro-expressions are interpreted.
  • Referring to FIG. 1, the method includes: at step 105, using a segmentation algorithm to determine a target object enhanced image of an input image, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement.
  • The method further includes: at step 110, applying an integral map algorithm to the target object enhanced image to determine a target object localization image.
  • In some examples, the target object enhanced image is an image in which every pixel belonging to the target object is displayed with enhancement and every pixel not belonging to the target object is not.
  • In some examples, the target object enhanced image may display the pixels belonging to the target object with enhanced brightness.
  • In some examples, the target object enhanced image may be converted into the form of a mask map.
  • In some examples, applying an integral map algorithm to the target object enhanced image to determine a target object localization image includes computing an integral map for the target object enhanced image converted into mask form. An integral map is a method for quickly computing the sums of rectangular regions in an image: the value at each pixel of the integral map is the sum of all pixels above and to the left of that pixel in the image, so once the integral map has been computed, the sum of any rectangular region of the image can be obtained quickly.
  • In some examples, the target object localization image may take the form of a mask map and may be determined based on the integral map.
  • For example, the value of each pixel in the target object localization image may depend on whether the corresponding value in the integral map is 0: if it is 0, the pixel's value in the target object localization image is 0; if it is not 0, the pixel's value is 1, where 1 indicates that the pixel belongs to the target object and 0 indicates that the pixel belongs to the image background or to a non-target object.
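  • The following sketch illustrates the integral map and the 0/1 localization rule just described (the use of NumPy and the function names are assumptions made for illustration):

        import numpy as np

        def integral_map(mask: np.ndarray) -> np.ndarray:
            # Each value is the sum of all mask pixels above and to the left
            # (inclusive), so any rectangular sum can later be read off in O(1).
            return mask.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

        def localization_mask(mask: np.ndarray) -> np.ndarray:
            # Per the rule above: 1 where the integral map is non-zero, else 0.
            return (integral_map(mask) != 0).astype(np.uint8)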
  • In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
  • Further reference is made to FIG. 2, a flowchart of a method for using a segmentation algorithm to determine a target object enhanced image of an input image according to some embodiments of the present application, and to FIG. 3, a diagram illustrating the effect of the steps of segmenting a target object according to some embodiments of the present application. Step 105 in FIG. 1 may further include the following steps.
  • Step 205: perform feature extraction on the input image to determine a pixel feature map.
  • Step 210: perform feature extraction on the input image to determine a context feature map.
  • Step 215: determine context association information for each pixel based on the pixel feature map and the context feature map.
  • Step 220: determine the target object enhanced image according to the context association information and the input image, wherein the target object enhanced image is generated by changing the weight applied to each pixel based on that pixel's classification as belonging to the target object or to a non-target object.
  • In some examples, step 205 may include inputting the input image into a deep convolutional neural network to perform pixel-level feature extraction on the input image.
  • In some examples, step 205 may include inputting the input image into HRNet18 to generate a feature map for each pixel of the input image.
  • In some examples, the feature value of each pixel in the pixel feature map may indicate that pixel's initial classification as belonging to the target object or to a non-target object. In some examples, where pixel feature values range from 0 to 255, every pixel whose feature value is above 128 can be considered to belong to the target object, and every pixel whose feature value is below 128 can be considered to belong to a non-target object.
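  • A one-line sketch of this initial thresholding rule (the random feature map below is only a stand-in for real network output; the threshold of 128 is taken from the text above):

        import numpy as np

        pixel_feature_map = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in
        initial_target = pixel_feature_map > 128      # True: initial "target object" class
        initial_background = pixel_feature_map < 128  # True: initial "non-target" class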
  • In some examples, the pixel feature map may be a matrix representing pixel-level features (the "pixel representation") obtained after the input image is processed by the deep convolutional neural network; its image representation may be, for example, as shown in (a) of FIG. 3.
  • In some examples, step 210 may include inputting the input image into a deep convolutional neural network to perform block-level feature extraction on the input image.
  • In some examples, step 210 may include inputting the input image into HRNet18 to generate a feature map of pixel blocks of the input image, each block including a central pixel.
  • In some examples, pixel blocks may be determined by selecting a suitable n×n convolution kernel, where n is an odd number. As shown in (b) of FIG. 3, a box in the figure represents a central pixel, and the pixels surrounding the box, together with the central pixel, constitute a pixel block.
  • In some examples, the pixel block feature map may be a matrix representing block-level features (the "object region representation") obtained after the input image is processed by the deep convolutional neural network with the selected convolution kernel.
  • In some examples, the pixel block feature map represents feature values extracted in units of pixel blocks, each including its central pixel.
  • Similarly, the feature value of a pixel block may indicate the classification of that block as belonging to the target object or to a non-target object.
  • In some examples, where block feature values range from 0 to 255, each pixel block whose feature value is above 128 may be considered to belong to the target object, and each pixel block whose feature value is below 128 may be considered to belong to a non-target object.
  • In some examples, a pixel block's feature value may represent the classification, or the likelihood, that the pixels around the central pixel of the block belong to the target object or to a non-target object.
  • In this document, the terms "pixel block feature map" and "context feature map" are used interchangeably to denote the information about the pixels surrounding a block's central pixel and/or its context.
  • In some examples, step 215 may include determining context association information for each pixel based on the pixel feature map determined in step 205 and the context feature map determined in step 210, the context association information indicating the strength of the correlation between each pixel and its context.
  • In some examples, the context association information for each pixel (the "pixel region relation") may be obtained by matrix-multiplying the pixel feature map determined in step 205 with the context feature map determined in step 210 and applying a softmax function to the result.
  • In some examples, when the pixel feature map of a central pixel indicates that the pixel belongs to the target object (or to a non-target object) and the context feature map indicates that the pixel's context also belongs to the target object (or to a non-target object), the resulting context association information of that pixel is strong. When the pixel feature map and the context feature map indicate opposite results (for example, the pixel feature map indicates that the central pixel belongs to the target object while the context feature map indicates that its context belongs to a non-target object), the resulting context association information of that pixel is weak.
  • In some examples, step 220 may include determining the final classification of each pixel as belonging to the target object or to a non-target object based on the context association information from step 215, and generating the target object enhanced image by enhancing each pixel that belongs to the target object according to that final classification. In some examples, the context association information (pixel region relation) obtained in step 215 is matrix-multiplied with the context feature map (object region representation) determined in step 210 to obtain a weighted pixel-level feature map, and the weighted pixel-level feature map is concatenated with the pixel representation determined in step 205 to obtain the final pixel feature map.
  • In some examples, the target object enhanced image is generated by changing the weight applied to each pixel based on that pixel's feature value in the final pixel feature map, which in turn reflects the pixel's classification as belonging to the target object or to a non-target object; its image representation may be, for example, as shown in (c) of FIG. 3.
  • In some examples, the target object enhanced image may be generated by increasing the weight applied to every pixel whose feature value is above 128.
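  • A compact sketch of steps 215 and 220 as described above, treating the pixel representation as an (N, C) matrix and the object region representation as a (K, C) matrix (these shapes, and the use of NumPy, are assumptions made for illustration):

        import numpy as np

        def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
            e = np.exp(x - x.max(axis=axis, keepdims=True))
            return e / e.sum(axis=axis, keepdims=True)

        def context_augmented_features(pixel_repr: np.ndarray,
                                       region_repr: np.ndarray) -> np.ndarray:
            # Step 215: matrix-multiply the two maps and apply softmax to get
            # the per-pixel context association ("pixel region relation").
            relation = softmax(pixel_repr @ region_repr.T, axis=-1)  # (N, K)
            # Step 220: weight the region features by the relation and
            # concatenate with the pixel representation.
            weighted = relation @ region_repr                        # (N, C)
            return np.concatenate([pixel_repr, weighted], axis=-1)   # (N, 2C)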
  • The segmentation algorithm in this application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the correlation between the target pixel and its context. Incorporating this contextual information into the classification algorithm further improves the classification accuracy for target pixels, thereby providing a more accurate segmentation of the target object. The weight applied to each pixel that is finally classified as belonging to the target object is changed to generate the target object enhanced image, so that the target object is displayed with enhancement; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of target object localization. The weights may be user-configurable: changing the weight settings affects the enhancement effect of the target object in the target object enhanced image, so that the desired enhancement effect can be achieved through user settings.
  • According to some embodiments, applying the integral map algorithm to the target object enhanced image to determine the target object localization image further includes: determining an integral map from the target object enhanced image; and using the integral map to determine the target object localization image.
  • In some examples, an integral map is computed for the target object enhanced image, as shown in (a) and (b) of FIG. 4.
  • In some examples, the integral map is normalized, exploiting the invariant moments of the image to find a set of parameters that cancel the influence of other transformation functions on the image:
  • img_normal = img_integral / max(img_integral(:))
  • In some examples, the integral map algorithm is then applied to find the upper-left and lower-right points of the target region.
  • Applying the integral map algorithm to an enhanced image in which the target object has already been enhanced can further improve the localization accuracy of the target object.
  • According to some embodiments, determining the integral map from the target object enhanced image further includes applying a scaling factor to the target object enhanced image.
  • In some examples, a scaling factor (img_scale) is applied to the target object enhanced image converted into mask form.
  • In some examples, a redundant margin can be added in the following way to ensure localization accuracy:
  • x_extend = int((x_right - x_left) * extend_scale_x / 2)
  • x_top = int(max(x_left - x_extend, 0) / img_scale)
  • y_top = int(max(y_left - y_extend, 0) / img_scale)
  • x_bottom = int((x_right + x_extend) / img_scale)
  • y_bottom = int((y_right + y_extend) / img_scale)
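  • Putting the localization stage together, the sketch below downscales the mask by img_scale, normalizes the integral map, finds the upper-left and lower-right target points, and extends the box by the redundant margin before mapping back to full resolution. The corner search via np.nonzero and the use of OpenCV for resizing are assumptions; the patent excerpt does not show its corner-finding formulas:

        import numpy as np
        import cv2  # assumed dependency, used only for resizing

        def locate_target(enhanced_mask: np.ndarray, img_scale: float = 0.5,
                          extend_scale_x: float = 0.1, extend_scale_y: float = 0.1):
            small = cv2.resize(enhanced_mask, None, fx=img_scale, fy=img_scale,
                               interpolation=cv2.INTER_NEAREST)
            img_integral = small.astype(np.float64).cumsum(0).cumsum(1)
            # Normalized integral map, as in the text (not needed for the box itself).
            img_normal = img_integral / img_integral.max()
            ys, xs = np.nonzero(small)             # assumes the mask is non-empty
            x_left, x_right = xs.min(), xs.max()   # upper-left / lower-right x
            y_left, y_right = ys.min(), ys.max()   # upper-left / lower-right y
            x_extend = int((x_right - x_left) * extend_scale_x / 2)
            y_extend = int((y_right - y_left) * extend_scale_y / 2)
            x_top = int(max(x_left - x_extend, 0) / img_scale)
            y_top = int(max(y_left - y_extend, 0) / img_scale)
            x_bottom = int((x_right + x_extend) / img_scale)
            y_bottom = int((y_right + y_extend) / img_scale)
            return (x_top, y_top), (x_bottom, y_bottom)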
  • The amount of data to be processed can be adjusted by applying a scaling factor, so that the computation can be accelerated and/or the accuracy of the integral map can be improved according to actual needs.
  • According to some embodiments, the method further includes: using a loss function to calculate a loss rate between the target object enhanced image and the input image; and feeding the calculated loss rate back to the segmentation algorithm.
  • In some examples, a cross-entropy loss function may be used to calculate the loss rate between the target object enhanced image generated in step 220 and the input image.
  • In some examples, the calculated loss rate represents the similarity between the target object enhanced image and the original input image.
  • The loss rate between the target object enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target object enhanced image output by the segmentation algorithm and the original input image.
  • The loss rate is fed back to the segmentation algorithm to perform supervised training of the segmentation algorithm; while achieving convergence of the training fit, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
  • According to some embodiments, the method further includes: updating the segmentation algorithm based on the loss rate, the labeled production-line images, or a combination of the two.
  • The segmentation algorithm in this application is trained using the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in segmenting target objects in a supervised learning manner.
  • Since the training data come from real production lines, they cover actual needs, so the method can be deployed and promoted on production lines.
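  • A hedged sketch of this supervised feedback loop in PyTorch (the optimizer, tensor shapes, and two-class output are assumptions; the text above specifies only a cross-entropy loss fed back to the segmentation algorithm):

        import torch
        import torch.nn as nn

        def training_step(model: nn.Module, optimizer: torch.optim.Optimizer,
                          image: torch.Tensor, label: torch.Tensor) -> float:
            # Compare per-pixel logits against the labeled production-line image.
            logits = model(image)  # (B, 2, H, W): target / non-target classes
            loss = nn.functional.cross_entropy(logits, label)  # label: (B, H, W)
            optimizer.zero_grad()
            loss.backward()   # feed the loss rate back to the segmentation network
            optimizer.step()
            return loss.item()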
  • FIG. 5 is a network model architecture diagram of the segmentation algorithm for implementing the image processing method according to some embodiments of the present application.
  • According to some embodiments, the segmentation algorithm is implemented by the deep convolutional neural network HRNet18.
  • HRNet is a high-resolution network capable of maintaining high-resolution representations throughout. Starting from a high-resolution subnetwork as the first stage, subnetworks from high to low resolution are gradually added to form further stages, and the multi-resolution subnetworks are connected in parallel. Throughout the process, multi-scale repeated fusion is performed by repeatedly exchanging information across the parallel multi-resolution subnetworks, and keypoints are estimated from the high-resolution representations output by the network, whose architecture is shown in FIG. 5. In some examples, considering the degree to which segmentation of the target object depends on very high-level semantic information, and the limited amount of real training data, the smaller HRNet18 model of the HRNet series is selected to implement the segmentation algorithm of the present application.
  • HRNet18 maintains high-resolution features throughout the segmentation process, which helps segment target objects accurately.
  • The different branches of the HRNet18 network produce features of different resolutions, and these branches exchange information with one another, so that high-resolution features containing multi-channel information can be obtained.
  • Given a limited amount of training data, choosing the HRNet18 model avoids the risk of overfitting, and its compact structure speeds up the entire segmentation algorithm.
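  • One way to instantiate such a backbone is via the timm library, assuming its hrnet_w18 model corresponds to the HRNet18 named here (this mapping, and the input size, are assumptions):

        import timm
        import torch

        # features_only returns the multi-resolution feature maps produced by
        # the parallel branches described above.
        backbone = timm.create_model("hrnet_w18", pretrained=False, features_only=True)
        for f in backbone(torch.randn(1, 3, 512, 512)):
            print(f.shape)  # one tensor per resolution branch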
  • According to some embodiments, the present application provides an image processing method, including: performing feature extraction on the input image to determine a pixel feature map; performing feature extraction on the input image to determine a context feature map; determining context association information for each pixel based on the pixel feature map and the context feature map; determining a tab-enhanced image according to the context association information and the input image, wherein the tab-enhanced image is generated by changing the weight applied to each pixel based on that pixel's classification as belonging to a tab or not; determining an integral map from the tab-enhanced image, wherein a scaling factor is applied to the tab-enhanced image; and using the integral map to determine a tab localization image, wherein the segmentation algorithm is implemented by HRNet18.
  • Referring to FIG. 6, the system includes: a segmentation module 605 configured to determine a target object enhanced image of an input image using a segmentation algorithm, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and a localization image generation module 610 configured to apply an integral map algorithm to the target object enhanced image to determine a target object localization image.
  • In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
  • FIG. 7 is a functional block diagram of a segmentation module according to some embodiments of the present application.
  • As shown in FIG. 7, the segmentation module 605 further includes: a feature extraction component 705 configured to perform feature extraction on the input image to determine a pixel feature map and to perform feature extraction on the input image to determine a context feature map; a context component 710 configured to determine context association information for each pixel based on the pixel feature map and the context feature map; and an enhanced image generation component 715 configured to determine the target object enhanced image according to the context association information and the input image, wherein the target object enhanced image is generated by changing the weight applied to each pixel based on that pixel's classification as belonging to the target object or to a non-target object.
  • The segmentation algorithm in this application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the correlation between the target pixel and its context. Incorporating this contextual information into the classification algorithm further improves the classification accuracy for target pixels, thereby providing a more accurate segmentation of the target object. The weight applied to each pixel that is finally classified as belonging to the target object is changed to generate the target object enhanced image, so that the target object is displayed with enhancement; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of target object localization. The weights may be user-configurable: changing the weight settings affects the enhancement effect of the target object in the target object enhanced image, so that the desired enhancement effect can be achieved through user settings.
  • According to some embodiments, the localization image generation module 610 is further configured to: determine an integral map from the target object enhanced image; and use the integral map to determine the target object localization image.
  • Applying the integral map algorithm to an enhanced image in which the target object has already been enhanced can further improve the localization accuracy of the target object.
  • According to some embodiments, the localization image generation module 610 is further configured to apply a scaling factor to the target object enhanced image.
  • The amount of data to be processed can be adjusted by applying a scaling factor, so that the computation can be accelerated and/or the accuracy of the integral map can be improved according to actual needs.
  • According to some embodiments, the system further includes a loss rate module 615 configured to: use a loss function to calculate a loss rate between the target object enhanced image and the input image; and feed the calculated loss rate back to the segmentation module so as to update it.
  • The loss rate between the target object enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target object enhanced image output by the segmentation algorithm and the original input image.
  • The loss rate is fed back to the segmentation algorithm to perform supervised training of the segmentation algorithm; while achieving convergence of the training fit, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
  • According to some embodiments, the segmentation module 605 is further configured to update the segmentation module based on the loss rate, the labeled production-line images, or a combination of the two.
  • The segmentation algorithm in this application is trained using the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in segmenting target objects in a supervised learning manner.
  • Since the training data come from real production lines, they cover actual needs, so the method can be deployed and promoted on production lines.
  • According to some embodiments, the present application provides an image processing system, including:
  • a feature extraction component 705 configured to perform feature extraction on the input image to determine a pixel feature map and to perform feature extraction on the input image to determine a context feature map;
  • a context component 710 configured to determine context association information for each pixel based on the pixel feature map and the context feature map;
  • an enhanced image generation component 715 configured to determine a tab-enhanced image according to the context association information and the input image, wherein the tab-enhanced image is generated by changing the weight applied to each pixel based on that pixel's classification as belonging to a tab or not; and
  • a localization image generation module 610 configured to: determine an integral map from the tab-enhanced image; and use the integral map to determine a tab localization image, wherein a scaling factor is applied to the tab-enhanced image.
  • Reference is now made to FIG. 8, a structural block diagram of a computer system suitable for implementing an image processing system according to some embodiments of the present application.
  • The system includes: a memory 028 having computer-executable instructions stored thereon; and a processor 016 coupled to the memory 028, wherein the computer-executable instructions, when executed by the processor, cause the system to perform the following operations: using a segmentation algorithm to determine a target object enhanced image of an input image, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and applying an integral map algorithm to the target object enhanced image to determine a target object localization image.
  • FIG. 8 shows a block diagram of a computer system 012 suitable for implementing a system for image processing according to some embodiments of the present application.
  • The computer system 012 shown in FIG. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
  • As shown in FIG. 8, the computer system 012 takes the form of a general-purpose computing device.
  • The components of the computer system 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 that connects the various system components (including the system memory 028 and the processing unit 016).
  • The bus 018 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
  • By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • The computer system 012 typically includes a variety of computer-system-readable media. Such media may be any available media that can be accessed by the computer system 012, and include both volatile and non-volatile media as well as removable and non-removable media.
  • System memory 028 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 030 and/or cache memory 032 .
  • The computer system 012 may further include other removable/non-removable, volatile/non-volatile computer system storage media.
  • By way of example only, the storage system 034 may be used to read from and write to non-removable, non-volatile magnetic media (not shown, and commonly referred to as a "hard drive").
  • A disk drive for reading from and writing to a removable, non-volatile magnetic disk (such as a "floppy disk") may also be provided.
  • In these cases, each drive may be connected to the bus 018 via one or more data media interfaces.
  • The memory 028 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of the various embodiments of the present invention.
  • Program modules 042 generally perform the functions and/or methods of the described embodiments of the present invention.
  • The computer system 012 may also communicate with one or more external devices 014 (such as a keyboard, a pointing device, a display 024, etc.), with one or more devices that enable a user to interact with the computer system 012, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer system 012 to communicate with one or more other computing devices. Such communication may occur through input/output (I/O) interfaces 022. Also, the computer system 012 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 020. As shown, the network adapter 020 communicates with the other modules of the computer system 012 via the bus 018.
  • The processing unit 016 executes various functional applications and performs data processing by running the programs stored in the system memory 028, for example, implementing the method flows provided by the embodiments of the present invention.
  • The above-mentioned computer program may be provided in a computer storage medium; that is, the computer storage medium is encoded with a computer program that, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above-described embodiments of the present invention. For example, the method flows provided by the embodiments of the present invention may be executed by the one or more processors described above.
  • The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • A computer-readable signal medium may include a data signal carrying computer-readable program code in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for carrying out the operations of the present invention may be written in one or more programming languages, or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to an image processing method and system. The method includes: using a segmentation algorithm to determine a target object enhanced image of an input image, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and applying an integral map algorithm to the target object enhanced image to determine a target object localization image.

Description

Image Processing Method and System
Technical Field
The present application relates to computer technology, and in particular to image processing technology.
Background Art
Computer-based image processing is widely used in many fields. Image processing can be used to improve the visual quality of images, to extract the features of specific objects in images, and to store and transmit images. In order to extract the features of a specific object in an image, it is desirable to identify and locate that object.
Therefore, there is a need for an improved technique capable of accurately locating a specific target in an image.
Summary of the Invention
In view of the above problems, the present application provides an image processing method and system capable of improving the accuracy of locating and segmenting a specific target in an image.
In a first aspect, the present application provides an image processing method, including: using a segmentation algorithm to determine a target object enhanced image of an input image, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and applying an integral map algorithm to the target object enhanced image to determine a target object localization image.
In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
In some embodiments, using a segmentation algorithm to determine the target object enhanced image of the input image further includes: performing feature extraction on the input image to determine a pixel feature map; performing feature extraction on the input image to determine a context feature map; determining context association information for each pixel based on the pixel feature map and the context feature map; and determining the target object enhanced image according to the context association information and the input image, wherein the pixels of the target object enhanced image include weight information, the weight information being related to whether the pixel belongs to the target object. The segmentation algorithm in this application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the correlation between the target pixel and its context. Incorporating this contextual information into the classification algorithm further improves the classification accuracy for target pixels, thereby providing a more accurate segmentation of the target object. The weight applied to each pixel that is finally classified as belonging to the target object is changed to generate the target object enhanced image, so that the target object is displayed with enhancement; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of target object localization. The weights may be user-configurable: changing the weight settings affects the enhancement effect of the target object in the target object enhanced image, so that the desired enhancement effect can be achieved through user settings.
In some embodiments, applying an integral map algorithm to the target object enhanced image to determine a target object localization image further includes: determining an integral map from the target object enhanced image; and using the integral map to determine the target object localization image. Applying the integral map algorithm to an enhanced image in which the target object has already been enhanced can further improve the localization accuracy of the target object.
In some embodiments, determining an integral map from the target object enhanced image further includes applying a scaling factor to the target object enhanced image. The amount of data to be processed can be adjusted by applying a scaling factor, so that the computation can be accelerated and/or the accuracy of the integral map can be improved according to actual needs.
In some embodiments, the method further includes: using a loss function to calculate a loss rate between the target object enhanced image and the input image; and feeding the calculated loss rate back to the segmentation algorithm. The loss rate between the target object enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target object enhanced image output by the segmentation algorithm and the original input image. The loss rate is fed back to the segmentation algorithm to perform supervised training of the segmentation algorithm; while achieving convergence of the training fit, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
In some embodiments, the method further includes: updating the segmentation algorithm based on the loss rate, the labeled production-line images, or a combination of the two. The segmentation algorithm in this application is trained using the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in segmenting target objects in a supervised learning manner. In addition, since the training data come from real production lines, they cover actual needs, so the method can be deployed and promoted on production lines.
In some embodiments, the segmentation algorithm is implemented by the deep convolutional neural network HRNet18. HRNet18 maintains high-resolution features throughout the segmentation process, which helps segment target objects accurately. In addition, the different branches of the HRNet18 network produce features of different resolutions, and these branches exchange information with one another, so that high-resolution features containing multi-channel information can be obtained. Moreover, given a limited amount of training data, choosing the HRNet18 model avoids the risk of overfitting, and its compact structure speeds up the entire segmentation algorithm.
In a second aspect, the present application provides an image processing system, including: a segmentation module configured to determine a target object enhanced image of an input image using a segmentation algorithm, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and a localization image generation module configured to apply an integral map algorithm to the target object enhanced image to determine a target object localization image.
In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
In some embodiments, the segmentation module further includes: a feature extraction component configured to perform feature extraction on the input image to determine a pixel feature map and to perform feature extraction on the input image to determine a context feature map; a context component configured to determine context association information for each pixel based on the pixel feature map and the context feature map; and an enhanced image generation component configured to determine the target object enhanced image according to the context association information and the input image, wherein the pixels of the target object enhanced image include weight information, the weight information being related to whether the pixel belongs to the target object. The segmentation algorithm in this application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the correlation between the target pixel and its context. Incorporating this contextual information into the classification algorithm further improves the classification accuracy for target pixels, thereby providing a more accurate segmentation of the target object. The weight applied to each pixel that is finally classified as belonging to the target object is changed to generate the target object enhanced image, so that the target object is displayed with enhancement; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of target object localization. The weights may be user-configurable: changing the weight settings affects the enhancement effect of the target object in the target object enhanced image, so that the desired enhancement effect can be achieved through user settings.
In some embodiments, the localization image generation module is further configured to: determine an integral map from the target object enhanced image; and use the integral map to determine the target object localization image. Applying the integral map algorithm to an enhanced image in which the target object has already been enhanced can further improve the localization accuracy of the target object.
In some embodiments, the localization image generation module is further configured to apply a scaling factor to the target object enhanced image. The amount of data to be processed can be adjusted by applying a scaling factor, so that the computation can be accelerated and/or the accuracy of the integral map can be improved according to actual needs.
In some embodiments, the system further includes a loss rate module configured to: use a loss function to calculate a loss rate between the target object enhanced image and the input image; and feed the calculated loss rate back to the segmentation module. The loss rate between the target object enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target object enhanced image output by the segmentation algorithm and the original input image. The loss rate is fed back to the segmentation algorithm to perform supervised training of the segmentation algorithm; while achieving convergence of the training fit, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
In some embodiments, the segmentation module is further configured to be updated based on the loss rate, the labeled production-line images, or a combination of the two. The segmentation algorithm in this application is trained using the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in segmenting target objects in a supervised learning manner. In addition, since the training data come from real production lines, they cover actual needs, so the method can be deployed and promoted on production lines.
In a third aspect, the present application provides an image processing system, including: a memory having computer-executable instructions stored thereon; and a processor coupled to the memory, wherein the computer-executable instructions, when executed by the processor, cause the system to perform the following operations: determining a target object enhanced image of an input image using a segmentation algorithm, wherein the target object enhanced image is an image in which each pixel classified as belonging to the target object is displayed with enhancement; and applying an integral map algorithm to the target object enhanced image to determine a target object localization image.
In the technical solutions of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel of the image as belonging to the target object or not. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral map algorithm, can improve the accuracy of target object localization.
The above description is only an overview of the technical solutions of the present application. In order that the technical means of the present application may be understood more clearly and implemented according to the contents of the specification, and in order to make the above and other objects, features, and advantages of the present application more evident and comprehensible, specific embodiments of the present application are set forth below.
Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered a limitation of the present application. Throughout the drawings, the same reference numerals denote the same components. In the drawings:
FIG. 1 is a flowchart of an image processing method according to some embodiments of the present application;
FIG. 2 is a flowchart of a method for determining a target-object-enhanced image of an input image using a segmentation algorithm according to some embodiments of the present application;
FIG. 3 is a diagram showing the step-by-step effect of segmenting a target object according to some embodiments of the present application;
FIG. 4 is a diagram showing the step-by-step effect of locating a target object according to some embodiments of the present application;
FIG. 5 is a network model architecture diagram of the segmentation algorithm used to implement the image processing method of some embodiments of the present application;
FIG. 6 is a functional block diagram of an image processing system according to some embodiments of the present application;
FIG. 7 is a functional block diagram of a segmentation module according to some embodiments of the present application; and
FIG. 8 is a structural block diagram of a computer system suitable for implementing the image processing system according to some embodiments of the present application.
Detailed Description
Embodiments of the technical solution of the present application will be described in detail below with reference to the drawings. The following embodiments are only used to illustrate the technical solution of the present application more clearly; they are therefore only examples and cannot be used to limit the scope of protection of the present application.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used herein are only for the purpose of describing specific embodiments and are not intended to limit the present application. The terms "including" and "having" and any variations thereof in the specification and claims of the present application and in the above description of the drawings are intended to cover non-exclusive inclusion.
In the description of the embodiments of the present application, the technical terms "first", "second", and the like are used only to distinguish different objects and are not to be understood as indicating or implying relative importance or implicitly specifying the number, particular order, or primary-secondary relationship of the indicated technical features. In the description of the embodiments of the present application, "a plurality of" means two or more, unless otherwise explicitly and specifically defined.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
In the description of the embodiments of the present application, the term "and/or" merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may indicate three cases: A alone, both A and B, and B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it.
In the description of the embodiments of the present application, the term "a plurality of" refers to two or more (including two); similarly, "a plurality of groups" refers to two or more groups (including two groups), and "a plurality of pieces" refers to two or more pieces (including two pieces).
Computer-based image processing is widely used in various fields. Image processing can be used to improve the visual quality of an image, to extract the features of a specific target in an image, and for image storage and transmission, among others. In order to extract the features of a specific target in an image, it is desirable to identify and locate the specific target. Extraction of a specific target can be used to perform defect detection on that target. For example, for power lithium batteries, by capturing images of the lithium batteries produced on a production line and locating a target object such as a tab, detection of whether the tab has defects such as folding can be performed effectively.
In the production of power lithium batteries, defects are unavoidable due to process and equipment factors. Across all stages of the production line, detecting whether the tab of a lithium battery is folded is a crucial step, and the validity of the detection result ensures the safety of batteries leaving the factory. However, since the tab occupies only a very small percentage of the entire lithium battery, detecting whether the tab is folded places rather high requirements on image resolution and on accurate localization of the tab.
Some image processing methods include applying a difference of two Gaussians to the input image, annotating the processed image, constructing a neural network and model for training and learning, and finally performing data inference with that model. In such techniques, the first step is usually to feed the image data into the model for feature extraction. Therefore, the quality of the input image data (such as resolution and signal-to-noise ratio) directly affects the accuracy of the trained model. Where the target object is small in size, for example a lithium battery tab, the difference-of-Gaussians method cannot effectively locate a target object that is extremely small and requires extremely high resolution; the image background (non-target objects) strongly interferes with the target object, resulting in low localization accuracy for the target object and ultimately making it difficult to accurately detect its defects (for example, whether the tab is folded). Therefore, an improved technique capable of accurately locating a target object that occupies a small proportion of the image and requires high resolution is needed.
To address the above problems, the present application provides a technique capable of accurately locating a target object that occupies a small proportion of the image and requires high resolution. The solution of the present application may include segmentation of the target object and localization of the target object. In the segmentation stage, the present application uses a segmentation algorithm to determine a target-object-enhanced image of the input image, wherein the target-object-enhanced image comprises an image in which each pixel classified as the target object is displayed in an enhanced manner. In the localization stage, the present application generates an integral image from the target-object-enhanced image and uses the integral image algorithm to generate a target object localization image.
In the technical solution of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel in the image as belonging to the target object or to a non-target object. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral image algorithm, can improve the accuracy of locating the target object.
The technical solution of the embodiments of the present application is applicable to the segmentation and localization of target objects that occupy a small proportion of an image and require high resolution, including, but not limited to, defect detection of tabs in lithium batteries, identification and annotation of species observed in the wild, and detection and interpretation of human facial micro-expressions. In the case of species observation in the wild, species identification is often based on annotating specific patterns or markings on the face or on a particular body part, while the infrared cameras used for field observation often cannot provide clear, high-resolution images; improving the segmentation and localization of such specific patterns and markings through the improved segmentation and localization algorithm of the present application therefore helps in identifying and annotating the species. Similarly, face recognition through image capture has been widely applied, and on this basis there are also wide applications for interpreting the micro-expressions of a recognized face; a slight raising of the corner of the mouth, a slight frown, or a brief twitch of a particular facial muscle usually occupies a small proportion of the whole image and is difficult to recognize. Improving the recognition and localization of micro-expressions through the improved segmentation and localization algorithm of the present application can improve the accuracy with which micro-expressions are interpreted.
Referring to FIG. 1, which shows a flowchart of an image processing method according to some embodiments of the present application, the present application provides an image processing method. As shown in FIG. 1, the method includes: at step 105, using a segmentation algorithm to determine a target-object-enhanced image of an input image, wherein the target-object-enhanced image comprises an image in which each pixel classified as the target object is displayed in an enhanced manner. The method further includes: at step 110, applying an integral image algorithm to the target-object-enhanced image to determine a target object localization image.
In some examples, the target-object-enhanced image includes an image in which each pixel belonging to the target object is displayed in an enhanced manner while each pixel not belonging to the target object is not. In some examples, the target-object-enhanced image may include an image in which the pixels belonging to the target object are displayed with enhanced brightness. In some examples, the target-object-enhanced image may be converted into the form of a mask image. In some examples, applying an integral image algorithm to the target-object-enhanced image to determine the target object localization image includes computing the integral image of the target-object-enhanced image converted into mask form. The integral image is a method for quickly computing sums over rectangular regions of an image: the value of each pixel in the integral image represents the sum of all pixels above and to the left of that pixel, so that once the integral image has been computed, the sum over a rectangular region of any size in the image can be computed quickly. In some examples, the target object localization image may take the form of a mask image and may be determined based on the integral image. For example, the value of each pixel in the target object localization image may depend on whether the corresponding pixel value in the integral image is 0: if it is 0, the value of that pixel in the target object localization image is 0; if it is not 0, the value of that pixel in the target object localization image is 1, where 1 indicates that the pixel belongs to the target object and 0 indicates that the pixel belongs to the image background or to a non-target object.
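As an illustration of the mask-and-integral-image logic described above, the following minimal NumPy sketch (the function names and the toy mask are illustrative assumptions, not taken from the present application) computes an integral image by cumulative summation and derives a localization mask from its nonzero entries:

    import numpy as np

    def integral_image(mask: np.ndarray) -> np.ndarray:
        # Each entry holds the sum of all mask pixels above and to the
        # left of (and including) that position.
        return mask.cumsum(axis=0).cumsum(axis=1)

    def localization_mask(mask: np.ndarray) -> np.ndarray:
        # 1 where the running sum is nonzero (target object reached),
        # 0 where it is still zero (background only).
        return (integral_image(mask) != 0).astype(np.uint8)

    # Toy 0/1 mask standing in for a target-object-enhanced image.
    mask = np.zeros((6, 6), dtype=np.uint32)
    mask[2:4, 2:4] = 1
    print(integral_image(mask))
    print(localization_mask(mask))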
In the technical solution of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel in the image as belonging to the target object or to a non-target object. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral image algorithm, can improve the accuracy of locating the target object.
According to some embodiments of the present application, optionally, with further reference to FIG. 2 and FIG. 3 (FIG. 2 is a flowchart of a method for determining a target-object-enhanced image of an input image using a segmentation algorithm according to some embodiments of the present application, and FIG. 3 is a diagram showing the step-by-step effect of segmenting a target object according to some embodiments of the present application), step 105 in FIG. 1 may further include: step 205, performing feature extraction on the input image to determine a pixel feature map; step 210, performing feature extraction on the input image to determine a context feature map; step 215, determining context association information for each pixel based on the pixel feature map and the context feature map; and step 220, determining the target-object-enhanced image according to the context association information and the input image, wherein the target-object-enhanced image is generated by changing the weight applied to each pixel based on the classification of each pixel as belonging to the target object or to a non-target object.
In some examples, step 205 may include feeding the input image into a deep convolutional neural network for pixel-level feature extraction. In some examples, step 205 may include feeding the input image into HRNet18 to generate a feature map for each pixel of the input image. In some examples, the feature value of each pixel in the pixel feature map may represent an initial classification of that pixel as belonging to the target object or not. In some examples, where pixel feature values range from 0 to 255, each pixel whose feature value is above 128 may be considered to belong to the target object, and each pixel whose feature value is below 128 may be considered not to. In some examples, the pixel feature map may be the matrix of pixel-level features (pixel representation) obtained after the input image passes through the deep convolutional neural network; its image representation may be, for example, as shown in FIG. 3(a). In some examples, step 210 may include feeding the input image into a deep convolutional neural network for patch-level feature extraction. In some examples, step 210 may include feeding the input image into HRNet18 to generate a feature map for pixel blocks that include a center pixel. In some examples, a pixel block may be determined by choosing a suitable convolution kernel of size n×n, where n is odd. As shown in FIG. 3(b), the box in the figure represents the center pixel, and the pixels around that box together with the center pixel constitute the pixel block. In some examples, the pixel-block feature map may be the matrix of patch-level features (object region representation) obtained after the input image passes through the deep convolutional neural network with the chosen kernel. In some examples, the pixel-block feature map represents feature values extracted in units of the pixel block including the center pixel. Similarly, the feature value of a pixel block may represent the classification of that block as belonging to the target object or not. In some examples, where pixel-block feature values range from 0 to 255, each pixel block whose feature value is above 128 may be considered to belong to the target object, and each block whose feature value is below 128 may be considered not to. In some examples, the pixel-block feature value may represent the classification, or the likelihood, that the pixels around the center pixel of the block belong to the target object or not. Herein, "pixel-block feature map" and "context feature map" may be used interchangeably to denote information about the pixels surrounding the center pixel of the block and/or its context. In some examples, step 215 may include determining, based on the pixel feature map determined in step 205 and the context feature map determined in step 210, context association information for each pixel, which represents how strongly each pixel is associated with its context. In some examples, the context association information (pixel region relation) for each pixel may be obtained by matrix-multiplying the pixel feature map determined in step 205 with the context feature map determined in step 210 and applying a softmax function to the product. In some examples, when the pixel feature map indicates that a center pixel belongs to the target object (or to a non-target object) and the context feature map indicates that the pixel's context likewise belongs to the target object (or to a non-target object), the resulting context association for that pixel is strong. When the pixel feature map and the context feature map indicate opposite results (for instance, the pixel feature map indicates that the center pixel belongs to the target object while the context feature map indicates that its context pixels do not), the resulting context association for that pixel is weak. In some examples, step 220 may include determining the final classification of each pixel as target object or non-target object according to the context association information from step 215, and generating the target-object-enhanced image by enhancing each pixel belonging to the target object based on that final classification. In some examples, the context association information (pixel region relation) obtained in step 215 is matrix-multiplied with the context feature map (object region representation) determined in step 210 to obtain a weighted pixel-level feature map, and this weighted pixel-level feature map is concatenated with the pixel feature map (pixel representation) determined in step 205 to obtain the final pixel feature map. In some examples, the target-object-enhanced image is generated by changing the weight applied to each pixel based on that pixel's feature value in the final pixel feature map (which in turn reflects its classification as target object or non-target object); its image representation may be, for example, as shown in FIG. 3(c). In some examples, the target-object-enhanced image may be generated by increasing the weight applied to each pixel whose feature value is above 128.
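As a concrete illustration of the fusion just described (matrix multiplication plus softmax, followed by concatenation), a PyTorch-style sketch may help; the tensor shapes, function name, and toy sizes are illustrative assumptions and not the application's actual network:

    import torch
    import torch.nn.functional as F

    def fuse_pixel_and_context(pixel_feats: torch.Tensor,
                               context_feats: torch.Tensor) -> torch.Tensor:
        # pixel_feats:   (N, C) per-pixel features ("pixel representation")
        # context_feats: (K, C) per-region features ("object region representation")
        # Pixel-to-region affinity, softmax-normalized over regions
        # ("pixel region relation").
        relation = F.softmax(pixel_feats @ context_feats.t(), dim=-1)  # (N, K)
        # Context-weighted pixel-level features.
        weighted = relation @ context_feats                            # (N, C)
        # Concatenate with the original pixel features to obtain the
        # final pixel feature map.
        return torch.cat([pixel_feats, weighted], dim=-1)              # (N, 2C)

    # Toy sizes: 16 pixels, 4 context regions, 8 channels.
    print(fuse_pixel_and_context(torch.randn(16, 8), torch.randn(4, 8)).shape)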
The segmentation algorithm in the present application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the association between the target pixel and its context. Incorporating context information into the classification algorithm further improves the accuracy of classifying the target pixel, thereby providing more accurate segmentation of the target object. The target-object-enhanced image is generated by changing the weight applied to each pixel finally classified as the target object, so that the target object is displayed in an enhanced manner; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of locating the target object. The weights may be user-configurable. Changing the weight settings can affect the enhancement effect of the target object in the target-object-enhanced image, so that the desired enhancement effect can be achieved through user settings.
According to some embodiments of the present application, optionally, applying an integral image algorithm to the target-object-enhanced image to determine a target object localization image further includes: determining an integral image from the target-object-enhanced image; and using the integral image to determine the target object localization image.
In some examples, the integral image is computed and obtained for the target-object-enhanced image, as shown in FIG. 4(a) and FIG. 4(b). In some examples, the integral image is normalized, using the invariant moments of the image to find a set of parameters that eliminates the influence of other transformation functions on the image transformation:
img_normal = img_integral / max(img_integral(:)).
In some examples, the integral image algorithm is applied to find the top-left point and the bottom-right point as follows:
x_left, y_left = img_normal > low_thr
x_right, y_right = img_normal > high_thr.
Applying the integral image algorithm to the final classification results produced by the segmentation algorithm of the present application makes it possible to locate the target object accurately.
Applying the integral image algorithm to the target-object-enhanced image, in which the target object has already been displayed in an enhanced manner, can further improve the accuracy of locating the target object.
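Read literally, the two threshold comparisons above yield boolean maps rather than coordinates; one plausible reading, sketched below in NumPy, takes the first position at which the normalized integral exceeds each threshold. The threshold values and the coordinate-selection rule are assumptions for illustration, not the application's stated procedure:

    import numpy as np

    def locate_corners(mask, low_thr=0.01, high_thr=0.99):
        # Integral image, normalized to [0, 1] (assumes a nonempty mask).
        img_integral = mask.cumsum(axis=0).cumsum(axis=1).astype(np.float64)
        img_normal = img_integral / img_integral.max()
        # Top-left candidate: first position exceeding the low threshold.
        ys, xs = np.nonzero(img_normal > low_thr)
        x_left, y_left = xs.min(), ys.min()
        # Bottom-right candidate: first position exceeding the high threshold.
        ys, xs = np.nonzero(img_normal > high_thr)
        x_right, y_right = xs.min(), ys.min()
        return (x_left, y_left), (x_right, y_right)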
According to some embodiments of the present application, optionally, determining the integral image from the target-object-enhanced image further includes applying a scaling factor to the target-object-enhanced image.
In some examples, a scaling factor (img_scale) is applied to the target-object-enhanced image converted into mask form. In some examples, during computation of the integral image, a redundancy margin may be added in the following way to ensure localization accuracy:
y_extend = (int)((y_right - y_left) * extend_scale_y / 2)
x_extend = (int)((x_right - x_left) * extend_scale_x / 2).
In examples where a scaling factor is applied, the coordinates are mapped back to the original image based on the scaling factor img_scale according to the following equations to generate the target object localization image, as shown in FIG. 4(c):
x_top = (int)(max((x_left - x_extend), 0) / img_scale)
y_top = (int)(max((y_left - y_extend), 0) / img_scale)
x_bottom = (int)((x_right + x_extend) / img_scale)
y_bottom = (int)((y_right + y_extend) / img_scale).
Applying a scaling factor makes it possible to adjust the amount of data to be processed, thereby accelerating the computation and/or improving the accuracy of the integral image according to actual needs.
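The margin-and-rescale step described above can be sketched as follows; the parameter values are illustrative assumptions, and the bottom-right formulas are read symmetrically to the top-left ones:

    def map_back(x_left, y_left, x_right, y_right,
                 img_scale=0.5, extend_scale_x=0.1, extend_scale_y=0.1):
        # Pad the detected box by a redundancy margin, then undo the
        # scaling factor to return to original-image coordinates.
        y_extend = int((y_right - y_left) * extend_scale_y / 2)
        x_extend = int((x_right - x_left) * extend_scale_x / 2)
        x_top = int(max(x_left - x_extend, 0) / img_scale)
        y_top = int(max(y_left - y_extend, 0) / img_scale)
        x_bottom = int((x_right + x_extend) / img_scale)
        y_bottom = int((y_right + y_extend) / img_scale)
        return (x_top, y_top), (x_bottom, y_bottom)

    print(map_back(20, 10, 60, 40))  # ((36, 18), (124, 82))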
According to some embodiments of the present application, optionally, the method further includes: using a loss function to calculate a loss rate between the target-object-enhanced image and the input image; and feeding the calculated loss rate back to the segmentation algorithm.
In some examples, a cross entropy loss function may be used to calculate the loss rate between the target-object-enhanced image generated in step 220 and the input image. In some examples, the calculated loss rate represents the similarity between the target-object-enhanced image and the original input image.
The loss rate between the target-object-enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target-object-enhanced image output by the segmentation algorithm and the original input image. Feeding this loss rate back to the segmentation algorithm enables supervised training of the segmentation algorithm; while achieving a regression fit in training, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
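A minimal PyTorch sketch of this supervised feedback computation, with the tensor shapes and the random labeled mask as placeholder assumptions, might look like:

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()
    # Per-pixel two-class logits (target object vs. background); the
    # requires_grad tensor stands in for segmentation-network outputs.
    logits = torch.randn(1, 2, 64, 64, requires_grad=True)
    # Labeled production-line mask (ground truth), one class index per pixel.
    labels = torch.randint(0, 2, (1, 64, 64))
    loss = criterion(logits, labels)
    loss.backward()  # gradients feed back into the segmentation network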
According to some embodiments of the present application, optionally, the method further includes: updating the segmentation algorithm based on the loss rate, labeled production-line images, or a combination of the two.
The segmentation algorithm in the present application is trained with the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in target object segmentation in a supervised learning manner. Moreover, since the training data all come from real production lines, the solution can cover actual needs and be genuinely deployed and promoted on production lines.
According to some embodiments of the present application, optionally, with further reference to FIG. 5, which is a network model architecture diagram of the segmentation algorithm used to implement the image processing method of some embodiments of the present application, the segmentation algorithm is implemented by the deep convolutional neural network HRNet18.
In some examples, HRNet is a high-resolution network that maintains high-resolution representations throughout the whole process. Starting from a high-resolution subnetwork as the first stage, subnetworks from high to low resolution are gradually added to form further stages, and the multi-resolution subnetworks are connected in parallel. Throughout the process, repeated multi-scale fusion is performed by repeatedly exchanging information across the parallel multi-resolution subnetworks. Keypoints are estimated from the high-resolution representations output by the network; the network architecture is shown in FIG. 5. In some examples, in view of whether segmentation of the target object depends on very high-level semantic information and of the limited amount of real training data, the smaller model in the HRNet family, HRNet18, is chosen to implement the segmentation algorithm of the present application.
HRNet18 keeps the features at high resolution throughout the segmentation process, which helps in accurately segmenting the target object. In addition, the different branches of the HRNet18 network produce features of different resolutions, and these features exchange information with one another, yielding high-resolution features that contain multi-channel information. Furthermore, where the amount of training data is limited, choosing the HRNet18 model avoids the risk of overfitting, while its relatively small structure speeds up the computation of the entire segmentation algorithm.
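For reference, HRNet-W18 backbones are available in common model libraries; a minimal sketch using the timm package (our choice of library, not one named in the present application) to expose the parallel multi-resolution feature maps might look like:

    import timm
    import torch

    # HRNet-W18 backbone returning one feature map per resolution branch.
    model = timm.create_model('hrnet_w18', pretrained=False, features_only=True)
    for feat in model(torch.randn(1, 3, 224, 224)):
        print(feat.shape)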
According to some embodiments of the present application, with reference to FIGS. 1-5, the present application provides an image processing method, including: performing feature extraction on the input image to determine a pixel feature map; performing feature extraction on the input image to determine a context feature map; determining context association information for each pixel based on the pixel feature map and the context feature map; determining a tab-enhanced image according to the context association information and the input image, wherein the tab-enhanced image is generated by changing the weight applied to each pixel based on the classification of each pixel as belonging or not belonging to the tab; determining an integral image from the tab-enhanced image, wherein a scaling factor is applied to the tab-enhanced image; and using the integral image to determine a tab localization image, wherein the segmentation algorithm is implemented by HRNet18.
Referring to FIG. 6, which is a functional block diagram of an image processing system according to some embodiments of the present application, the present application provides an image processing system. As shown in FIG. 6, the system includes: a segmentation module 605 configured to use a segmentation algorithm to determine a target-object-enhanced image of an input image, wherein the target-object-enhanced image comprises an image in which each pixel classified as the target object is displayed in an enhanced manner; and a localization image generation module 610 configured to apply an integral image algorithm to the target-object-enhanced image to determine a target object localization image.
In the technical solution of the embodiments of the present application, target object localization is performed on the input image on the basis of using a segmentation algorithm to classify each pixel in the image as belonging to the target object or to a non-target object. Combining the segmentation of the target object with its localization, and combining the segmentation algorithm with the integral image algorithm, can improve the accuracy of locating the target object.
According to some embodiments of the present application, optionally, with further reference to FIG. 7, which is a functional block diagram of a segmentation module according to some embodiments of the present application, the segmentation module 605 further includes: a feature extraction component 705 configured to perform feature extraction on the input image to determine a pixel feature map and to perform feature extraction on the input image to determine a context feature map; a context component 710 configured to determine context association information for each pixel based on the pixel feature map and the context feature map; and an enhanced image generation component 715 configured to determine the target-object-enhanced image according to the context association information and the input image, wherein the target-object-enhanced image is generated by changing the weight applied to each pixel based on the classification of each pixel as belonging to the target object or to a non-target object.
The segmentation algorithm in the present application considers not only pixel-level classification information but also the classification information of the context around the target pixel, and determines the final classification result of the target pixel based on the association between the target pixel and its context. Incorporating context information into the classification algorithm further improves the accuracy of classifying the target pixel, thereby providing more accurate segmentation of the target object. The target-object-enhanced image is generated by changing the weight applied to each pixel finally classified as the target object, so that the target object is displayed in an enhanced manner; this provides a more accurate basis for subsequent localization processing and can further improve the accuracy of locating the target object. The weights may be user-configurable. Changing the weight settings can affect the enhancement effect of the target object in the target-object-enhanced image, so that the desired enhancement effect can be achieved through user settings.
According to some embodiments of the present application, optionally, with continued reference to FIG. 6, the localization image generation module 610 is further configured to: determine an integral image from the target-object-enhanced image; and use the integral image to determine the target object localization image.
Applying the integral image algorithm to the target-object-enhanced image, in which the target object has already been displayed in an enhanced manner, can further improve the accuracy of locating the target object.
According to some embodiments of the present application, optionally, with continued reference to FIG. 6, the localization image generation module 610 is further configured to apply a scaling factor to the target-object-enhanced image.
Applying a scaling factor makes it possible to adjust the amount of data to be processed, thereby accelerating the computation and/or improving the accuracy of the integral image according to actual needs.
According to some embodiments of the present application, optionally, with continued reference to FIG. 6, the system further includes a loss rate module 615 configured to: use a loss function to calculate a loss rate between the target-object-enhanced image and the input image; and feed the calculated loss rate back to the segmentation algorithm to update the segmentation module.
The loss rate between the target-object-enhanced image output by the segmentation algorithm and the labeled production-line image reflects the similarity between the target-object-enhanced image output by the segmentation algorithm and the original input image. Feeding this loss rate back to the segmentation algorithm enables supervised training of the segmentation algorithm; while achieving a regression fit in training, the accuracy of the segmentation algorithm can be improved through continuous training and learning.
According to some embodiments of the present application, optionally, with continued reference to FIG. 6, the segmentation module 605 is further configured to update the segmentation module based on the loss rate, labeled production-line images, or a combination of the two.
The segmentation algorithm in the present application is trained with the calculated loss rate, labeled production-line images, or a combination of the two as training data, and can continuously improve its accuracy in target object segmentation in a supervised learning manner. Moreover, since the training data all come from real production lines, the solution can cover actual needs and be genuinely deployed and promoted on production lines.
According to some embodiments of the present application, with reference to FIG. 6 and FIG. 7, the present application provides an image processing system, including:
a segmentation module 605, which includes:
a feature extraction component 705 configured to perform feature extraction on the input image to determine a pixel feature map and to perform feature extraction on the input image to determine a context feature map;
a context component 710 configured to determine context association information for each pixel based on the pixel feature map and the context feature map;
an enhanced image generation component 715 configured to determine a tab-enhanced image according to the context association information and the input image, wherein the tab-enhanced image is generated by changing the weight applied to each pixel based on the classification of each pixel as belonging or not belonging to the tab; and
a localization image generation module 610 configured to: determine an integral image from the tab-enhanced image; and use the integral image to determine a tab localization image, wherein a scaling factor is applied to the tab-enhanced image.
Referring to FIG. 8, which is a structural block diagram of a computer system suitable for implementing the image processing system according to some embodiments of the present application: as shown in FIG. 8, the system includes a memory 028 having computer-executable instructions stored thereon, and a processor 016 coupled to the memory 028, wherein the computer-executable instructions, when executed by the processor, cause the system to perform the following operations: using a segmentation algorithm to determine a target-object-enhanced image of an input image, wherein the target-object-enhanced image comprises an image in which each pixel classified as the target object is displayed in an enhanced manner; and applying an integral image algorithm to the target-object-enhanced image to determine a target object localization image.
In some examples, FIG. 8 shows a structural block diagram of a computer system 012 suitable for implementing the system for image processing according to some embodiments of the present application. The computer system 012 shown in FIG. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in FIG. 8, the computer system 012 takes the form of a general-purpose computing device. The components of the computer system 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 connecting the different system components (including the system memory 028 and the processing unit 016).
The bus 018 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor bus or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer system 012 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer system 012, including volatile and non-volatile media and removable and non-removable media.
The system memory 028 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 030 and/or a cache memory 032. The computer system 012 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 034 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 8, commonly referred to as a "hard disk drive"). Although not shown in FIG. 8, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 018 through one or more data media interfaces. The memory 028 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the present invention.
A program/utility 040 having a set (at least one) of program modules 042 may be stored, for example, in the memory 028. Such program modules 042 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 042 generally carry out the functions and/or methods of the embodiments described in the present invention.
The computer system 012 may also communicate with one or more external devices 014 (e.g., a keyboard, a pointing device, a display 024, etc.); in the present invention, the computer system 012 communicates with an external radar device. It may also communicate with one or more devices that enable a user to interact with the computer system 012, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer system 012 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 022. Moreover, the computer system 012 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 020. As shown in the figure, the network adapter 020 communicates with the other modules of the computer system 012 through the bus 018. It should be understood that, although not shown in FIG. 8, other hardware and/or software modules may be used in conjunction with the computer system 012, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 016 executes various functional applications and data processing by running programs stored in the system memory 028, for example, implementing the method flows provided by the embodiments of the present invention.
The above computer program may be provided in a computer storage medium; that is, the computer storage medium is encoded with a computer program which, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above embodiments of the present invention. For example, the method flows provided by the embodiments of the present invention may be executed by the one or more processors described above.
With the development of time and technology, the meaning of "medium" has become increasingly broad: the propagation path of a computer program is no longer limited to tangible media, and it may also be downloaded directly from a network, for example. Any combination of one or more computer-readable media may be used.
The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including, but not limited to, wireless, wireline, optical cable, RF, and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present application and not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application, and they should all be covered by the scope of the claims and the specification of the present application. In particular, as long as there is no structural conflict, the technical features mentioned in the various embodiments can be combined in any manner. The present application is not limited to the specific embodiments disclosed herein, but includes all technical solutions falling within the scope of the claims.

Claims (12)

  1. An image processing method, comprising:
    using a segmentation algorithm to determine a target-object-enhanced image of an input image, wherein the target-object-enhanced image comprises an image in which each pixel classified as a target object is displayed in an enhanced manner; and
    applying an integral image algorithm to the target-object-enhanced image to determine a target object localization image.
  2. The method of claim 1, wherein using a segmentation algorithm to determine a target-object-enhanced image of the input image further comprises:
    performing feature extraction on the input image to determine a pixel feature map;
    performing feature extraction on the input image to determine a context feature map;
    determining context association information for each pixel based on the pixel feature map and the context feature map; and
    determining the target-object-enhanced image according to the context association information and the input image, wherein the pixels of the target-object-enhanced image include weight information, the weight information being related to whether the pixel belongs to the target object.
  3. The method of any one of claims 1-2, wherein applying an integral image algorithm to the target-object-enhanced image to determine a target object localization image further comprises:
    determining an integral image from the target-object-enhanced image; and
    using the integral image to determine the target object localization image.
  4. The method of claim 3, wherein determining an integral image from the target-object-enhanced image further comprises applying a scaling factor to the target-object-enhanced image.
  5. The method of any one of claims 1-4, further comprising:
    using a loss function to calculate a loss rate between the target-object-enhanced image and the input image; and
    feeding the calculated loss rate back to the segmentation algorithm to update the segmentation algorithm.
  6. The method of any one of claims 1-5, wherein the segmentation algorithm is implemented by the deep convolutional neural network HRNet18.
  7. An image processing system, comprising:
    a segmentation module configured to use a segmentation algorithm to determine a target-object-enhanced image of an input image, wherein the target-object-enhanced image comprises an image in which each pixel classified as a target object is displayed in an enhanced manner; and
    a localization image generation module configured to apply an integral image algorithm to the target-object-enhanced image to determine a target object localization image.
  8. The system of claim 7, wherein the segmentation module further comprises:
    a feature extraction component configured to perform feature extraction on the input image to determine a pixel feature map and to perform feature extraction on the input image to determine a context feature map;
    a context component configured to determine context association information for each pixel based on the pixel feature map and the context feature map; and
    an enhanced image generation component configured to determine the target-object-enhanced image according to the context association information and the input image, wherein the pixels of the target-object-enhanced image include weight information, the weight information being related to whether the pixel belongs to the target object.
  9. The system of any one of claims 7-8, wherein the localization image generation module is further configured to:
    determine an integral image from the target-object-enhanced image; and
    use the integral image to determine the target object localization image.
  10. The system of claim 9, wherein the localization image generation module is further configured to apply a scaling factor to the target-object-enhanced image.
  11. The system of any one of claims 7-9, further comprising a loss rate module configured to:
    use a loss function to calculate a loss rate between the target-object-enhanced image and the input image; and
    feed the calculated loss rate back to the segmentation algorithm to update the segmentation algorithm.
  12. An image processing system, comprising:
    a memory having computer-executable instructions stored thereon; and
    a processor coupled to the memory, wherein the computer-executable instructions, when executed by the processor, cause the system to perform the following operations:
    using a segmentation algorithm to determine a target-object-enhanced image of an input image, wherein the target-object-enhanced image comprises an image in which each pixel classified as a target object is displayed in an enhanced manner; and
    applying an integral image algorithm to the target-object-enhanced image to determine a target object localization image.
PCT/CN2021/136052 2021-12-07 2021-12-07 Image processing method and system WO2023102723A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202180078471.2A CN116802683A (zh) 2021-12-07 2021-12-07 Image processing method and system
EP21960095.4A EP4220552A4 (en) 2021-12-07 2021-12-07 IMAGE PROCESSING METHOD AND SYSTEM
PCT/CN2021/136052 WO2023102723A1 (zh) 2021-12-07 2021-12-07 Image processing method and system
US18/295,513 US11967125B2 (en) 2021-12-07 2023-04-04 Image processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/136052 WO2023102723A1 (zh) 2021-12-07 2021-12-07 Image processing method and system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/295,513 Continuation US11967125B2 (en) 2021-12-07 2023-04-04 Image processing method and system

Publications (1)

Publication Number Publication Date
WO2023102723A1 true WO2023102723A1 (zh) 2023-06-15

Family

ID=86729503

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/136052 WO2023102723A1 (zh) 2021-12-07 2021-12-07 Image processing method and system

Country Status (4)

Country Link
US (1) US11967125B2 (zh)
EP (1) EP4220552A4 (zh)
CN (1) CN116802683A (zh)
WO (1) WO2023102723A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350993A (zh) * 2023-11-02 2024-01-05 上海贝特威自动化科技有限公司 Tab layer count detection method based on image recognition

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024044947A1 (zh) * 2022-08-30 2024-03-07 宁德时代新能源科技股份有限公司 Defect detection method and apparatus, and computer-readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190244346A1 (en) * 2018-02-07 2019-08-08 Analogic Corporation Visual augmentation of regions within images
CN110648334A (zh) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature recurrent convolution salient object detection method based on an attention mechanism
CN111445493A (zh) * 2020-03-27 2020-07-24 北京市商汤科技开发有限公司 Image processing method and apparatus, electronic device, and storage medium
CN112508939A (zh) * 2020-12-22 2021-03-16 郑州金惠计算机系统工程有限公司 Flange surface defect detection method, system, and device
CN113065467A (zh) * 2021-04-01 2021-07-02 中科星图空间技术有限公司 Deep-learning-based method and apparatus for identifying low-coherence regions in satellite images
US20210295108A1 (en) * 2018-07-29 2021-09-23 Zebra Medical Vision Ltd. Systems and methods for automated detection of visual objects in medical images

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8824797B2 (en) * 2011-10-03 2014-09-02 Xerox Corporation Graph-based segmentation integrating visible and NIR information
AU2013222016A1 (en) * 2013-08-30 2015-03-19 Canon Kabushiki Kaisha Method, system and apparatus for determining a property of an image
US10460214B2 (en) * 2017-10-31 2019-10-29 Adobe Inc. Deep salient content neural networks for efficient digital object segmentation
CN110889410B (zh) * 2018-09-11 2023-10-03 苹果公司 Robust use of semantic segmentation in shallow depth-of-field rendering
EP3956711A4 (en) * 2019-04-18 2023-01-11 The Administrators of The Tulane Educational Fund SAMPLE POSITIONING SYSTEMS AND METHODS TO FACILITATE MICROSCOPY
CN111080615B (zh) * 2019-12-12 2023-06-16 创新奇智(重庆)科技有限公司 PCB defect detection system and detection method based on a convolutional neural network
US11875510B2 (en) * 2021-03-12 2024-01-16 Adobe Inc. Generating refined segmentations masks via meticulous object segmentation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190244346A1 (en) * 2018-02-07 2019-08-08 Analogic Corporation Visual augmentation of regions within images
US20210295108A1 (en) * 2018-07-29 2021-09-23 Zebra Medical Vision Ltd. Systems and methods for automated detection of visual objects in medical images
CN110648334A (zh) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature recurrent convolution salient object detection method based on an attention mechanism
CN111445493A (zh) * 2020-03-27 2020-07-24 北京市商汤科技开发有限公司 Image processing method and apparatus, electronic device, and storage medium
CN112508939A (zh) * 2020-12-22 2021-03-16 郑州金惠计算机系统工程有限公司 Flange surface defect detection method, system, and device
CN113065467A (zh) * 2021-04-01 2021-07-02 中科星图空间技术有限公司 Deep-learning-based method and apparatus for identifying low-coherence regions in satellite images

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4220552A4 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350993A (zh) * 2023-11-02 2024-01-05 上海贝特威自动化科技有限公司 Tab layer count detection method based on image recognition

Also Published As

Publication number Publication date
US20230237763A1 (en) 2023-07-27
EP4220552A4 (en) 2023-12-27
CN116802683A (zh) 2023-09-22
EP4220552A1 (en) 2023-08-02
US11967125B2 (en) 2024-04-23

Similar Documents

Publication Publication Date Title
CN109858555B Image-based data processing method, apparatus, and device, and readable storage medium
US11436739B2 (en) Method, apparatus, and storage medium for processing video image
WO2020006961A1 Method and apparatus for extracting an image
US11967125B2 (en) Image processing method and system
WO2020024484A1 Method and apparatus for outputting data
CN112560874B Training method, apparatus, device, and medium for an image recognition model
CN113343826B Training method for a face liveness detection model, and face liveness detection method and apparatus
CN112598643A Deepfake image detection and model training method, apparatus, device, and medium
CN111539916B Adversarially robust image saliency detection method and system
CN113205041B Structured information extraction method, apparatus, device, and storage medium
CN115861462B Training method and apparatus for an image generation model, electronic device, and storage medium
CN113177449B Face recognition method and apparatus, computer device, and storage medium
US11756288B2 Image processing method and apparatus, electronic device and storage medium
WO2023001059A1 Detection method and apparatus, electronic device, and storage medium
JP2022185144A Object detection method, and training method and apparatus for an object detection model
CN115565186B Training method and apparatus for a character recognition model, electronic device, and storage medium
CN115457329B Training method for an image classification model, and image classification method and apparatus
CN114820885B Image editing method and model training method, apparatus, device, and medium therefor
CN116052288A Liveness detection model training method, liveness detection method, apparatus, and electronic device
CN115937993A Liveness detection model training method, liveness detection method, apparatus, and electronic device
CN115393488A Method and apparatus for driving virtual character expressions, electronic device, and storage medium
CN114093006A Training method, apparatus, device, and storage medium for a live face detection model
CN116071628B Image processing method, apparatus, electronic device, and storage medium
CN116310657B Feature point detection model training method, and image feature matching method and apparatus
US20240020941A1 Multi-camera domain adaptive object detection system and detection method thereof

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021960095

Country of ref document: EP

Effective date: 20230427

WWE Wipo information: entry into national phase

Ref document number: 202180078471.2

Country of ref document: CN