WO2022151755A1 - Target detection method and apparatus, electronic device, storage medium, computer program product and computer program - Google Patents

Target detection method and apparatus, electronic device, storage medium, computer program product and computer program

Info

Publication number
WO2022151755A1
WO2022151755A1 · PCT/CN2021/119982 · CN2021119982W
Authority
WO
WIPO (PCT)
Prior art keywords
network
image
bounding box
training
positive
Prior art date
Application number
PCT/CN2021/119982
Other languages
English (en)
Chinese (zh)
Inventor
王娜
宋涛
刘星龙
黄宁
张少霆
Original Assignee
上海商汤智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2022151755A1

Classifications

    • G06T 7/0012 Biomedical image inspection (G06T 7/00 Image analysis; G06T 7/0002 Inspection of images, e.g. flaw detection)
    • G06F 18/24 Classification techniques (G06F 18/00 Pattern recognition; G06F 18/20 Analysing)
    • G06T 7/12 Edge-based segmentation (G06T 7/10 Segmentation; edge detection)
    • G06T 7/62 Analysis of geometric attributes of area, perimeter, diameter or volume (G06T 7/60 Analysis of geometric attributes)
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods (G06T 7/70)
    • G06T 2207/10081 Computed x-ray tomography [CT] (G06T 2207/10 Image acquisition modality; G06T 2207/10072 Tomographic images)
    • G06T 2207/20081 Training; Learning (G06T 2207/20 Special algorithmic details)
    • G06T 2207/20084 Artificial neural networks [ANN] (G06T 2207/20 Special algorithmic details)
    • G06T 2207/30064 Lung nodule (G06T 2207/30 Subject of image; G06T 2207/30004 Biomedical image processing; G06T 2207/30061 Lung)
    • G06V 2201/07 Target detection (G06V 2201/00 Indexing scheme relating to image or video recognition or understanding)

Definitions

  • the present disclosure relates to, but is not limited to, the field of computer technology, and in particular, to a target detection method and apparatus, electronic equipment, storage medium, computer program product, and computer program.
  • Pulmonary nodules are a common lesion, and the characteristics of nodules often indicate the nature of lung disease.
  • the detection of pulmonary nodules is of great significance to determine whether the lesion is lung cancer.
  • The early detection, diagnosis and treatment of pulmonary nodules are beneficial to the early diagnosis and treatment of lung cancer, and are key to reducing lung cancer mortality.
  • Pulmonary nodules can be detected based on Computed Tomography (CT) images.
  • the embodiments of the present disclosure provide a target detection method and apparatus, electronic equipment, storage medium, computer program product and computer program, which not only improve the sensitivity of target detection, but also improve the accuracy of target detection.
  • An embodiment of the present disclosure provides a target detection method, including: performing feature extraction on a first image to be detected to obtain first feature maps of multiple scales of the first image; and processing the first feature maps of multiple scales of the first image through a trained target detection network to obtain the position of a first object of a target category existing in the first image; wherein the target detection network is trained in a recursive manner; the target detection network includes a classification sub-network, a regression sub-network and a segmentation sub-network, the classification sub-network is used to determine whether the first object exists in the first image, the regression sub-network is used to determine the bounding box of the first object existing in the first image, and the segmentation sub-network is used to determine the outline of the first object existing in the first image.
  • the training of the target detection network is performed based on the multi-task learning of classification, regression and segmentation, and the correlation between tasks is used to improve the recognition ability of objects of the target category;
  • the recursive phased training strategy is used to train the target detection network, which not only improves the sensitivity of target detection, but also improves the accuracy of target detection.
  • the method further includes: training the target detection network according to a first training set to obtain a target detection network in a first state, where the first training set includes a plurality of sample images and first annotation information of the sample images, and the first annotation information includes the real position of a second object in the sample image; processing the sample image through the target detection network in the first state to obtain the predicted position of the second object in the sample image; determining, according to the predicted position and the real position of the second object, the false positive area, the false negative area and the true positive area in the sample image; and training, according to a second training set, the target detection network in the first state to obtain the trained target detection network.
  • the second training set includes a plurality of sample images and second annotation information of the sample images, and the second annotation information includes the false positive regions, false negative regions and true positive regions in the sample images.
  • the training process of the target detection network is divided into two stages.
  • In the first stage, the focus is on sensitivity, so that the target detection network can obtain as many suspected first objects as possible; in the second stage, the focus is on accuracy, so that the target detection network can obtain relatively high accuracy on the basis of high sensitivity.
  • the plurality of sample images include positive sample images and negative sample images
  • the method further includes: cropping the annotated second image to obtain positive sample images and negative sample images of a preset size, where the positive sample image includes at least one second object, and the negative sample image does not include the second object.
  • the real position of the second object includes a bounding box of the second object
  • the target detection network is trained according to a first training set to obtain a target detection network in a first state
  • the method includes: performing feature extraction on the sample image to obtain second feature maps of multiple scales of the sample image; determining a plurality of first reference frames in the sample image according to the second feature maps of multiple scales and a plurality of preset anchor frames; determining a preset number of training samples from the plurality of first reference frames according to the bounding box of the second object in the sample image, where the training samples include positive samples whose label information is the target category and negative samples whose label information is a non-target category; and training the classification sub-network according to the training samples.
  • determining a preset number of training samples from the plurality of first reference frames according to the bounding box of the second object in the sample image includes: dividing the bounding boxes in the sample image into multiple bounding box sets, where the size of the bounding boxes in each bounding box set is within a preset size interval; for any bounding box set, removing the first reference frames that have already been determined as training samples from the plurality of first reference frames to obtain a reference frame set corresponding to the bounding box set; for any bounding box in the bounding box set, determining the positive samples and negative samples corresponding to the bounding box according to the intersection-over-union between the bounding box and each first reference frame in the corresponding reference frame set, where the number of positive samples is negatively correlated with the size interval of the bounding box set; and processing each bounding box set in order of size interval from small to large to obtain the preset number of training samples.
  • In this way, both second objects with larger sizes and second objects with smaller sizes can be taken into consideration.
  • the training of the classification sub-network according to the training samples includes: cropping the second feature map to obtain a third feature map corresponding to the training sample; inputting the third feature map into the classification sub-network to obtain a first probability that the training sample belongs to the target category; determining a first loss of the classification sub-network according to the first probability that the training sample belongs to the target category and the label information of the training sample; and adjusting the network parameters of the classification sub-network according to the first loss.
  • the real position of the second object includes a bounding box of the second object
  • the target detection network is trained according to a first training set to obtain a target detection network in a first state
  • the method includes: performing feature extraction on the positive sample image to obtain fourth feature maps of multiple scales of the positive sample image; determining a plurality of second reference frames in the positive sample image according to the fourth feature maps of multiple scales and a plurality of preset anchor frames; for any bounding box of the second object in the sample image: determining the intersection-over-union between the bounding box and the plurality of second reference frames, and determining the second reference frame with the largest intersection-over-union as the matching box corresponding to the bounding box; inputting the fifth feature map corresponding to the matching box into the regression sub-network to obtain the prediction box of the matching box; determining a second loss of the regression sub-network according to the difference between the bounding box and the prediction box; and adjusting the network parameters of the regression sub-network according to the second loss.
  • the determining the second loss of the regression sub-network according to the difference between the bounding box and the prediction box includes: determining the first regression loss of the matching box according to the coordinate offset and the intersection-over-union between the bounding box and the prediction box; determining the second regression loss of the matching box according to the intersection, union and minimum closed area between the bounding box and the prediction box; and determining the second loss of the regression sub-network according to the first regression loss and the second regression loss.
  • the real position of the second object includes the outline of the second object
  • the target detection network is trained according to the first training set to obtain the target detection network in the first state, including: performing feature extraction on the positive sample image to obtain fourth feature maps of multiple scales of the positive sample image; inputting the fourth feature maps of multiple scales into the segmentation sub-network to obtain the second probability that each pixel of the positive sample image belongs to the target category; determining the third loss of the segmentation sub-network according to the number of pixels in the positive sample image, the contour of the second object in the positive sample image, and the second probability that each pixel belongs to the target category; and adjusting the network parameters of the segmentation sub-network according to the third loss.
  • the training of the target detection network in the first state according to the second training set to obtain the trained target detection network includes: cropping, according to the second annotation information, the second feature maps of multiple scales of the sample image to obtain the feature maps corresponding to the false positive area, the false negative area and the true positive area; inputting these feature maps into the classification sub-network to obtain the third probability that the false positive area, the false negative area and the true positive area belong to the target category; determining the fourth loss of the classification sub-network according to the third probability that the false positive area, the false negative area and the true positive area belong to the target category, and the true categories of the false positive area, the false negative area and the true positive area; and adjusting the network parameters of the classification sub-network according to the fourth loss.
  • the training of the target detection network in the first state according to the second training set to obtain the trained target detection network includes: cropping, according to the second annotation information, the second feature maps of multiple scales of the sample image to obtain the sixth feature maps corresponding to the true positive area and the false negative area; determining the bounding boxes matching the true positive area and the false negative area; inputting the sixth feature maps into the regression sub-network to obtain the prediction boxes of the true positive area and the false negative area; determining the fifth loss of the regression sub-network according to the difference between the prediction boxes of the true positive area and the false negative area and the corresponding bounding boxes; and adjusting the network parameters of the regression sub-network according to the fifth loss.
  • the training of the target detection network in the first state according to the second training set to obtain the trained target detection network includes: inputting the sixth feature maps corresponding to the true positive area and the false negative area into the segmentation sub-network to obtain the fourth probability that each pixel in the true positive area and the false negative area belongs to the target category; determining the sixth loss of the segmentation sub-network according to the number of pixels in the true positive area and the false negative area, the contour of the second object in the true positive area and the false negative area, and the fourth probability that each pixel belongs to the target category; and adjusting the network parameters of the segmentation sub-network according to the sixth loss.
  • the first image includes a 2D medical image and/or a 3D medical image
  • the target category includes a nodule and/or a cyst.
  • An embodiment of the present disclosure provides a target detection device, comprising: an extraction part, configured to perform feature extraction on a first image to be detected to obtain first feature maps of multiple scales of the first image; and a first processing part, configured to process the first feature maps of multiple scales of the first image through the trained target detection network to obtain the position of the first object of the target category existing in the first image; wherein the target detection network is trained in a recursive manner; the target detection network includes a classification sub-network, a regression sub-network and a segmentation sub-network, the classification sub-network is used to determine whether the first object exists in the first image, the regression sub-network is used to determine the bounding box of the first object existing in the first image, and the segmentation sub-network is used to determine the outline of the first object existing in the first image.
  • An embodiment of the present disclosure provides an electronic device, including: a processor; a memory configured to store instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
  • Embodiments of the present disclosure provide a computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing method is implemented.
  • Embodiments of the present disclosure provide a computer program product, including computer-readable codes; when the computer-readable codes run on a device, a processor in the device executes some or all of the steps of the target detection method of any embodiment of the present disclosure.
  • An embodiment of the present disclosure provides a computer program, including computer-readable instructions which, when executed, cause a computer to execute some or all of the steps of the target detection method in any embodiment of the present disclosure.
  • FIG. 1 is a schematic diagram of an implementation flowchart of a target detection method provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of the composition and structure of a residual attention network according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of the composition and structure of a feature pyramid network provided by an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of the composition structure of a target detection architecture provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a prediction frame of a lung nodule when the target detection network shown in FIG. 4 is the target detection network in the first state;
  • FIG. 6 is a schematic diagram of a prediction frame of a lung nodule when the target detection network shown in FIG. 4 is a trained target detection network;
  • FIG. 7 is a schematic diagram of the composition and structure of a target detection device according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of the composition and structure of an electronic device according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • FIG. 1 is a schematic diagram of an implementation flowchart of a target detection method provided by an embodiment of the present disclosure. As shown in FIG. 1 , the method may include:
  • Step S11: perform feature extraction on the first image to be detected to obtain first feature maps of multiple scales of the first image.
  • Step S12: process the first feature maps of multiple scales of the first image through the trained target detection network to obtain the position of the first object of the target category in the first image.
  • the target detection network is trained in a recursive manner; the target detection network includes a classification sub-network, a regression sub-network and a segmentation sub-network, where the classification sub-network is used to determine whether the first object exists in the first image, the regression sub-network is used to determine the bounding box of the first object existing in the first image, and the segmentation sub-network is used to determine the outline of the first object existing in the first image.
  • the training of the target detection network is performed based on the multi-task learning of classification, regression and segmentation, and the correlation between tasks is used to improve the recognition ability of objects of the target category;
  • the recursive phased training strategy is used to train the target detection network, which not only improves the sensitivity of target detection, but also improves the accuracy of target detection.
  • the target detection method may be performed by an electronic device such as a terminal device or a server
  • the terminal device may be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.
  • the method can be implemented by the processor calling the computer-readable instructions stored in the memory.
  • the method may be performed by a server.
  • the first object may represent an object of a target category.
  • the target categories may include nodules (eg, lung nodules, breast nodules, etc.), cysts, and the like.
  • the first image may represent an image to be subjected to the first object detection.
  • the first image may include 2D medical images (eg, X-ray films, etc.) and/or 3D medical images (eg, CT images and MRI images, etc.). This embodiment of the present disclosure does not limit the first image and the target category.
  • Through the target detection method provided by the embodiment of the present disclosure, whether a first object exists in the first image can be detected, and the position of the first object in the first image can be obtained.
  • the network parameters of the target detection network can be initialized by using the public lung nodule data set LUNA to reduce problems such as long network training time and vanishing gradients.
  • In step S11, it is considered that the size difference between different first objects may be large (e.g., the diameter of lung nodules is distributed between 3 millimeters (mm) and 30 mm).
  • Detecting a first object with a smaller size mainly relies on low-level feature information at high resolution, while detecting a first object with a larger size mainly relies on high-level feature information under a large receptive field. Therefore, in order to take into account first objects of different sizes and improve the accuracy of target detection, in this step, first feature maps of multiple scales may be extracted from the first image.
  • the first feature map may be used to represent a feature map obtained by performing feature extraction on the first image.
  • the scales of the extracted first feature maps of multiple scales may include 48*48*48, 24*24*24, 12*12*12, 6*6*6, etc.
  • the scales of the extracted first feature maps of multiple scales may include 48*48, 24*24, 12*12, 6*6, and so on.
  • a three-dimensional first image is used as an example for description, and the processing process of the two-dimensional first image may refer to the three-dimensional first image.
  • feature extraction may be performed on the first image through a feature extraction network to obtain first feature maps of multiple scales of the first image.
  • the feature extraction network can be any network capable of multi-scale feature extraction.
  • the feature extraction network can be trained on a large number of images in the visualization database ImageNet.
  • the feature extraction network in the embodiment of the present disclosure may include a basic network and a feature pyramid network (Feature Pyramid Networks, FPN).
  • the basic network can be used to extract the basic feature map of the first image.
  • the base network may include a residual network (Residual Network, ResNet), such as ResNet18.
  • the convolution parameters of each layer in the backbone of the residual network may be set as follows: the convolution kernel size K is 3*3*3, the stride S is 1, and the padding P is 1; each convolution layer is followed by a Batch Normalization (BN) layer and a Rectified Linear Unit (ReLU).
  • the basic network may include a Residual Attention Network (Residual Attention Network) formed by combining a residual network and an attention model (Attention Model).
  • the residual network usually extracts features on the entire image range
  • the local features of the first object are more valuable than the regional features far away from the first object. Therefore, the introduction of an attention model into the basic network can enable the basic network to focus on extracting and learning feature information with more reference value (ie, local features of the first object). That is to say, using the residual attention network as the basic network to extract the basic feature map can make the extracted basic feature map more representative of the local features of the first object, thereby improving the accuracy of target detection.
  • FIG. 2 is a schematic diagram of the composition and structure of a residual attention network provided by an embodiment of the present disclosure.
  • the residual attention network includes: a residual network 10 and an attention model 20 .
  • the backbone feature map of the first image 31 can be obtained through the residual network, and the attention feature map of the first image can be obtained through the attention model (it should be noted that the scale of the attention feature map is the same as the scale of the backbone feature map),
  • the basic feature map 32 of the first image can be obtained by combining the backbone feature map and the attention feature map.
  • the basic feature map of the first image = (1 + attention feature map) × backbone feature map.
  • the attention model may include a global mean pooling unit 21, a fully connected rectified linear unit 22 and a fully connected activation unit 23.
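  • The following is an illustrative sketch (not part of the disclosure) of the residual attention combination described above; the channel count and bottleneck ratio are assumptions, and the fully connected activation unit is assumed to end in a sigmoid, which the text does not specify. The 3*3*3 convolution with stride 1, padding 1, BN and ReLU follows the backbone settings given earlier.

    import torch
    import torch.nn as nn

    class ResidualAttentionBlock(nn.Module):
        """Sketch: backbone 3D conv features reweighted by an attention branch,
        combined as base = (1 + attention) * backbone, per the text above."""
        def __init__(self, channels=32):
            super().__init__()
            # Backbone path: 3*3*3 convolution, stride 1, padding 1, BN + ReLU.
            self.backbone = nn.Sequential(
                nn.Conv3d(channels, channels, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm3d(channels),
                nn.ReLU(inplace=True),
            )
            # Attention path: global average pooling, FC + ReLU, FC + activation
            # (sigmoid assumed here).
            self.pool = nn.AdaptiveAvgPool3d(1)
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // 4),
                nn.ReLU(inplace=True),
                nn.Linear(channels // 4, channels),
                nn.Sigmoid(),
            )

        def forward(self, x):
            feat = self.backbone(x)
            w = self.fc(self.pool(feat).flatten(1)).view(x.size(0), -1, 1, 1, 1)
            return (1 + w) * feat   # basic feature map = (1 + attention) * backbone
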
  • FPN includes downsampling processing and upsampling processing.
  • the downsampling process can reduce the scale of the feature map and expand the receptive field, but it will lose the feature information of the first object with a small size
  • the upsampling process can increase the scale of the feature map and retain the feature information of the first object with a small size, but narrows the receptive field.
  • FIG. 3 is a schematic diagram of the composition and structure of an FPN provided by an embodiment of the present disclosure.
  • C1 may be used to represent a basic feature map of a first image acquired through a basic network. Since the first feature maps of four scales are finally required, in the embodiment of the present disclosure, C1 is sequentially downsampled four times to obtain C2, C3, C4, and C5, respectively.
  • the basic feature map extracted by the basic network is converted into multi-scale feature maps through the FPN, so that first objects of various sizes can be detected; by changing only simple network connections, the performance of detecting first objects of small size can be effectively improved with essentially no increase in the amount of calculation.
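  • As an illustrative sketch only (the channel counts, 1*1*1 lateral convolutions and nearest-neighbour upsampling are assumptions; the disclosure only states that the FPN combines downsampling and upsampling to produce multi-scale feature maps), a 3D feature pyramid over C2 to C5 could look like:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FPN3D(nn.Module):
        """Sketch of a 3D feature pyramid: lateral 1x1x1 convs plus a top-down
        pathway, producing feature maps at several scales (e.g. 48^3 ... 6^3)."""
        def __init__(self, in_channels=(32, 64, 128, 256), out_ch=64):
            super().__init__()
            self.lateral = nn.ModuleList(
                nn.Conv3d(c, out_ch, kernel_size=1) for c in in_channels)
            self.smooth = nn.ModuleList(
                nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1) for _ in in_channels)

        def forward(self, c2, c3, c4, c5):
            feats = [lat(c) for lat, c in zip(self.lateral, (c2, c3, c4, c5))]
            # Top-down pathway: upsample the coarser map and add the lateral map.
            for i in range(len(feats) - 2, -1, -1):
                feats[i] = feats[i] + F.interpolate(
                    feats[i + 1], size=feats[i].shape[2:], mode="nearest")
            return [s(f) for s, f in zip(self.smooth, feats)]
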
  • step S12 the first feature maps of multiple scales of the first image may be processed by the trained target detection network, so as to obtain the position of the first object existing in the first image.
  • the position of the first object may be represented by the bounding box of the first object and the outline of the first object.
  • the target detection network includes a classification sub-network, a regression sub-network and a segmentation sub-network, where the classification sub-network can be used to determine whether the first object exists in the first image, the regression sub-network can be used to determine the bounding box of the first object existing in the first image, and the segmentation sub-network can be used to determine the outline of the first object existing in the first image. The target detection network is obtained through the joint training of the classification, regression and segmentation tasks, and the correlation between the tasks can be used to improve the ability to recognize the first object.
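  • A minimal sketch of how the three sub-networks can sit on top of a shared multi-scale feature map is given below; the channel count, anchor count and output parameterization are illustrative assumptions rather than details of the disclosure.

    import torch
    import torch.nn as nn

    class DetectionHeads(nn.Module):
        """Illustrative multi-task heads: classification, regression, segmentation."""
        def __init__(self, in_ch=64, num_anchors=3):
            super().__init__()
            # Classification: per-anchor probability that a first object is present.
            self.cls_head = nn.Conv3d(in_ch, num_anchors, kernel_size=1)
            # Regression: per-anchor bounding-box offsets (assumed 4 values per anchor).
            self.reg_head = nn.Conv3d(in_ch, num_anchors * 4, kernel_size=1)
            # Segmentation: per-voxel probability of belonging to the target category.
            self.seg_head = nn.Conv3d(in_ch, 1, kernel_size=1)

        def forward(self, feat):
            cls = torch.sigmoid(self.cls_head(feat))
            reg = self.reg_head(feat)
            seg = torch.sigmoid(self.seg_head(feat))
            return cls, reg, seg
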
  • the above-mentioned target detection network including the classification sub-network, the regression sub-network and the segmentation sub-network is trained in a recursive manner, so that the accuracy of target detection can be improved on the basis of improving the sensitivity of target detection.
  • a trained target detection network is obtained based on multi-task learning and recursive training.
  • For the target detection network, in the case of maintaining high sensitivity, there is a problem of low accuracy (that is, a large number of objects are misclassified); in the case of maintaining high accuracy, there is a problem of low sensitivity (that is, a large number of objects of the target category are not detected). For example, when the sensitivity reaches more than 95%, there are a large number of false positive sample images (about 32%); when the false positive sample images are controlled below 3%, the sensitivity is low (about 20% of the objects are not detected).
  • the training process of the target detection network is divided into two stages.
  • In the first stage, the focus is on sensitivity, so that the target detection network can obtain as many suspected first objects as possible; in the second stage, the focus is on accuracy, so that the target detection network can obtain relatively high accuracy on the basis of high sensitivity.
  • the method further includes: training the target detection network according to the first training set to obtain the target detection network in the first state; and training the target detection network in the first state according to the second training set to obtain the trained target detection network.
  • the training process of the target detection network is divided into two stages: in the first-stage training, the target detection network is trained according to the first training set to obtain the target detection network in the first state; in the second-stage training, the target detection network in the first state is trained according to the second training set to obtain the trained target detection network.
  • the first training set is used to train the target detection network.
  • the first training set includes a plurality of sample images and first annotation information of the sample images, where the first annotation information includes the real position of the second object in the sample image.
  • the plurality of sample images include positive sample images and negative sample images.
  • the positive sample image includes at least one second object, and the negative sample image does not include the second object.
  • the second object may represent an object of the target category existing in the training sample image; for the second object, reference may be made to the description of the first object, which will not be repeated here.
  • the method further includes: cropping the marked second image to obtain a positive sample image and a negative sample image of a preset size.
  • the second image may be used to represent the annotated image.
  • the second image may be an annotated medical image.
  • the annotation information of the second image may be used to indicate the real position (including the bounding box and outline) of each second object in the second image.
  • the bounding box of the second object may be represented by a binarized cuboid.
  • the bounding box of the second object may be represented by a binarized sphere. It can be understood that the center point of the binarized sphere is the same as the center point of the second object, and the radius of the binarized sphere is a radius set as required.
  • the contour of the second object may be represented by whether each pixel in the second image belongs to the target category.
  • the preset size can be set as required; for example, the preset size can be 96*96*96 (unit: pixel*pixel*pixel).
  • a positive sample image and a negative sample image of a preset size may be acquired from the second image according to the label information of the second image.
  • the position (center point, bounding box, etc.) of each second object in the second image may be determined according to the label information of the second image. Then, according to the position of the second object (eg, centered on the second object), an image block with a size of a preset size and including the second object is cropped from the second image, and an image block with a size of a preset size and not including the second object is cropped from the second image.
  • the cropped image block including the second object may be used as a positive sample image, and the cropped image block not including the second object may be used as a negative sample image.
  • data augmentation may be performed on the cropped image blocks that include the second object and on the image blocks that do not include the second object, through operations such as rotation, translation, mirroring and scaling, so as to expand the data and increase the number of image blocks that include the second object and the number of image blocks that do not include the second object.
  • These image blocks including the second object obtained through data augmentation can also be used as positive sample images, and these image blocks obtained through data augmentation without including the second object can also be used as negative sample images.
  • the same number of positive sample images and negative sample images are acquired.
  • the positive and negative sample images can be effectively balanced, thereby reducing overfitting.
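  • By way of example only, the cropping and augmentation described above might be sketched as follows with NumPy; the patch size of 96, the placeholder volume and the chosen centres are assumptions for illustration.

    import numpy as np

    def crop_patch(volume, center, size=96):
        """Crop a size^3 patch centred (as far as the borders allow) on `center`."""
        starts = [int(np.clip(c - size // 2, 0, s - size))
                  for c, s in zip(center, volume.shape)]
        z, y, x = starts
        return volume[z:z + size, y:y + size, x:x + size]

    def augment(patch, rng):
        """Simple mirroring / 90-degree rotation augmentation for a 3D patch."""
        if rng.random() < 0.5:
            patch = np.flip(patch, axis=rng.integers(0, 3))
        if rng.random() < 0.5:
            patch = np.rot90(patch, k=rng.integers(1, 4), axes=(1, 2))
        return np.ascontiguousarray(patch)

    rng = np.random.default_rng(0)
    ct = np.zeros((256, 512, 512), dtype=np.float32)               # placeholder volume
    pos = augment(crop_patch(ct, center=(120, 250, 300)), rng)     # block containing a nodule
    neg = augment(crop_patch(ct, center=(60, 100, 400)), rng)      # nodule-free block
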
  • a positive sample image and a negative sample image of a preset size may be obtained by first preprocessing the marked second image, and then cropping the preprocessed second image.
  • Preprocessing of the second image may include one or more of resampling, cropping, normalization, and the like.
  • the preprocessing process of the second image will be described.
  • lung CT images are 3D images
  • the thickness of CT images obtained by different CT instruments may be different (for example, the thickness of lung CT images may be 4 mm, 2.5 mm, 1.25 mm, 1 mm, and 0.7 mm, etc.).
  • Through resampling, the thickness difference between lung CT images can be effectively eliminated.
  • the area where the lung parenchyma is located can be cropped out. In this way, both the positive sample image and the negative sample image consist of tissue in the lung area, which can reduce the interference of other organs on the training of the target detection network.
  • the value of each pixel (also called voxel) in the cropped area can be normalized to a value range of 0-1 to obtain the preprocessed lung CT image. This can effectively reduce the amount of subsequent calculations.
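  • A minimal preprocessing sketch consistent with the above (resample to a common voxel spacing, then scale values to 0–1) is shown below; the target spacing and HU window are assumptions, and the lung-parenchyma cropping step is omitted.

    import numpy as np
    from scipy.ndimage import zoom

    def preprocess_ct(volume, spacing, target_spacing=(1.0, 1.0, 1.0),
                      hu_window=(-1000.0, 400.0)):
        """Resample to a common slice thickness, then normalize voxels to [0, 1].

        The HU window and target spacing are illustrative assumptions; the text
        only states that thickness differences are removed and values are scaled
        to the 0-1 range. Cropping to the lung parenchyma would happen between
        these two steps and is omitted here.
        """
        factors = [s / t for s, t in zip(spacing, target_spacing)]
        resampled = zoom(volume.astype(np.float32), factors, order=1)
        lo, hi = hu_window
        return np.clip((resampled - lo) / (hi - lo), 0.0, 1.0)
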
  • the method of cropping the positive sample image and the negative sample image of the preset size from the preprocessed second image may refer to the method of directly cropping the positive sample image and the negative sample image of the preset size from the second image.
  • the position of each second object in the second image can be determined. Therefore, according to the annotation information of the second image, the annotation information of each positive sample image and the annotation information of each negative sample image can be determined, that is, the first annotation information of each sample image in the first training set is determined.
  • the sample images in the first training set are obtained, and the first label information of each sample image is determined. That is, the acquisition of the first training set is completed.
  • the following describes the process of using the first training set to train the target detection network to obtain the target detection network in the first state.
  • the training of the target detection network according to the first training set to obtain the target detection network in the first state includes: training the classification sub-network, the regression sub-network and the segmentation sub-network of the target detection network respectively according to the first training set.
  • the training of the target detection network is performed based on the multi-task learning of classification, regression and segmentation, and the ability to recognize objects of the target category is improved by utilizing the correlation between the tasks.
  • the sample images to be used include: the positive sample image and the negative sample image
  • the label information to be used includes: the bounding box of the second object.
  • the training of the classification sub-network of the target detection network according to the first training set may include steps S21 to S24.
  • step S21 feature extraction is performed on the sample image to obtain second feature maps of multiple scales of the sample image.
  • the second feature map may represent a feature map extracted from the sample image.
  • the process of performing feature extraction on the sample image may refer to the process of performing feature extraction on the first image.
  • the scale of the second feature map may include 6*6*6, 12*12*12, 24*24*24, 48*48*48, and so on.
  • step S22 a plurality of first reference frames in the sample image are determined according to the second feature maps of the plurality of scales and a plurality of preset anchor frames.
  • the preset anchor frame may be used to indicate the size of the first reference frame.
  • the preset anchor boxes can be preset as needed.
  • the size of the lung nodule is 3 mm to 30 mm, so the area of the preset anchor frame can be set to 4, 8, 16, and 32 (unit: pixel*pixel), etc.
  • the shape of the preset anchor frame may include: 1*4, 2*2 and 4*1 (unit: pixel*pixel).
  • the shapes of the preset anchor frame may include: 1*8, 2*4, 4*2 and 8*1.
  • the area and shape of the preset anchor frame can be set in advance as required, and the embodiment of the present disclosure does not limit the area and shape of the preset anchor frame.
  • the center points of a plurality of first reference frames may be determined in the sample image. For example, assuming that the feature map of a certain scale of the sample image has a scale of 3*3*3, the sample image is evenly divided into areas corresponding to the cells of that feature map, and the center point of each area is the center point of a first reference frame. Given a center point and a preset anchor frame, a first reference frame can be determined.
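  • The following sketch illustrates how first reference frames could be enumerated from the centre points of a feature-map grid and the preset anchor sizes; it simplifies the anchors to cubes, whereas the text also allows shapes such as 1*4, 2*2 and 4*1.

    import numpy as np

    def reference_frames(image_size, fmap_size, anchor_sizes=(4, 8, 16, 32)):
        """Sketch: one reference frame per feature-map cell centre and anchor size.

        Boxes are (z, y, x, side) cubes for simplicity.
        """
        stride = image_size / fmap_size
        centres = (np.arange(fmap_size) + 0.5) * stride
        frames = []
        for z in centres:
            for y in centres:
                for x in centres:
                    for s in anchor_sizes:
                        frames.append((z, y, x, float(s)))
        return np.array(frames, dtype=np.float32)

    boxes = reference_frames(image_size=96, fmap_size=6)   # 6*6*6 cells * 4 anchors = 864 frames
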
  • step S23 a preset number of training samples are determined from the plurality of first reference frames according to the bounding frame of the second object in the sample image.
  • the training samples include positive samples and negative samples, the label information of positive samples belongs to the target category, and the label information of negative samples does not belong to the target category.
  • For each first reference frame, the gap between the first reference frame and the bounding box of the second object can be determined, so as to determine whether the label of the first reference frame is the target category or a non-target category. When the gap is small, the label of the first reference frame may be the target category, and the first reference frame can be used as a positive sample for the classification sub-network; when the gap is large, the label of the first reference frame may be a non-target category, and the first reference frame can be used as a negative sample for the classification sub-network.
  • step S23 may include: dividing the bounding boxes in the sample image into multiple bounding box sets, where the size of the bounding boxes in each bounding box set is within a preset size interval; for any bounding box set, removing the first reference frames that have already been determined as training samples from the plurality of first reference frames to obtain a reference frame set corresponding to the bounding box set; for any bounding box in the bounding box set, determining the positive samples and negative samples corresponding to the bounding box according to the intersection-over-union between the bounding box and each first reference frame in the corresponding reference frame set, where the number of positive samples is negatively correlated with the size interval of the bounding box set; and processing each bounding box set in order of size interval from small to large to obtain the preset number of training samples.
  • the bounding boxes in the sample image may be divided into multiple bounding box sets according to size, and then each bounding box set is processed in turn.
  • a size interval can be preset for each bounding box set.
  • If the size of a bounding box is within the size interval preset for a bounding box set, the bounding box can be divided into that bounding box set. In this way, the size of the bounding boxes in each bounding box set is within the size interval preset for that bounding box set.
  • the size interval preset for the bounding box set may be set as required (for example, according to the size of the second object), and the embodiment of the present disclosure does not limit the size interval.
  • a pulmonary nodule is taken as an example of the second object for description.
  • the size of pulmonary nodules is between 3 mm and 30 mm. Among them, nodules with a size less than or equal to 6 mm can be called small nodules, those with a size greater than 6 mm and less than 12 mm are called middle nodules, and those with a size greater than or equal to 12 mm are called large nodules. Therefore, three bounding box sets may be set, and a size interval is set for each bounding box set.
  • each bounding box set may be processed in sequence according to the order of size intervals from small to large.
  • the first bounding box set may represent any one of the divided bounding box sets.
  • the process of processing the first bounding box set includes: removing the first reference frames already determined as training samples from the plurality of first reference frames to obtain a reference frame set corresponding to the first bounding box set; and, for any bounding box in the first bounding box set, determining the positive samples and negative samples corresponding to the bounding box according to the intersection-over-union between the bounding box and each first reference frame in the reference frame set corresponding to the first bounding box set.
  • the reference frame set includes a plurality of first reference frames, and the reference frame set can limit the range of selecting positive samples and negative samples. If the first bounding box set is the first processed bounding box set after sorting, it indicates that there is currently no first reference box determined as a training sample (including positive samples and negative samples). In this case, for any sample image, all the first reference frames in the sample image may be used to form a reference frame set corresponding to the first bounding frame set. If the first bounding box set is not the first processed bounding box set after sorting, it indicates that some of the first reference boxes may have been determined as training samples.
  • In this case, the first reference frames of the sample image that have already been determined as training samples can be removed, and the remaining first reference frames can be used to form the reference frame set corresponding to the first bounding box set. In this way, the number of intersection-over-union computations can be reduced, reducing the amount of computation and the workload.
  • the number of positive samples corresponding to a bounding box is negatively correlated with the size interval of the bounding box set of the bounding box. That is to say, when the size interval of the bounding box set to which a bounding box belongs is large, the number of positive samples corresponding to the bounding box is small; when the size interval of the bounding box set to which a bounding box belongs is small, The number of positive samples corresponding to the bounding box is large.
  • For example, the number of positive samples corresponding to each bounding box in the set representing small nodules can be 6, the number for each bounding box representing middle nodules can be 4, and the number for each bounding box representing large nodules can be 2. Since the learning difficulty of the second object with a smaller size is higher, and the learning difficulty of the second object with a larger size is lower, determining more positive samples for second objects with smaller sizes and fewer positive samples for second objects with larger sizes can balance the difficulty of learning second objects of different sizes, thereby ensuring that second objects of various sizes have sufficient sensitivity.
  • The first reference frames in the reference frame set may be sorted in descending order of intersection-over-union with the bounding box, and the first to Nth first reference frames are determined as the positive samples corresponding to the bounding box, where N can be set as required; a first reference frame whose intersection-over-union with the bounding box is below a threshold (which can be set as required, for example, greater than 0.02 and less than 0.2) is determined as a negative sample corresponding to the bounding box.
  • the number of positive samples corresponding to a bounding box can be the same or similar to the number of negative samples.
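  • A simplified sketch of this per-size-interval assignment is given below; the size intervals and positive-sample counts (6/4/2), the negative-IoU threshold and the box representation (centre plus size) are illustrative assumptions consistent with the description above, and `iou_fn` is a placeholder for a 3D intersection-over-union function.

    import numpy as np

    SIZE_SETS = [                  # (low, high, positives per bounding box)
        (3, 6, 6),                 # small nodules  -> more positive samples
        (6, 12, 4),                # middle nodules
        (12, 31, 2),               # large nodules  -> fewer positive samples
    ]

    def assign_samples(gt_boxes, ref_boxes, iou_fn, neg_iou=0.1):
        """Per-size-interval positive/negative assignment (simplified sketch).

        gt_boxes: list of (z, y, x, size) ground-truth bounding boxes
        ref_boxes: array of first reference frames in the same format
        neg_iou: negative threshold, chosen inside the (0.02, 0.2) range from the text
        """
        used = np.zeros(len(ref_boxes), dtype=bool)
        positives, negatives = [], []
        for lo, hi, n_pos in SIZE_SETS:                     # smallest interval first
            for gt in [b for b in gt_boxes if lo <= b[3] < hi]:
                avail = np.where(~used)[0]                  # drop frames already taken
                ious = np.array([iou_fn(gt, ref_boxes[i]) for i in avail])
                ranked = avail[np.argsort(-ious)]
                pos = ranked[:n_pos]                        # top-N by IoU -> positives
                neg = avail[ious < neg_iou][:n_pos]         # low-IoU frames -> negatives
                used[pos] = True
                used[neg] = True
                positives.extend(pos.tolist())
                negatives.extend(neg.tolist())
        return positives, negatives
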
  • step S24 the classification sub-network is trained according to the training samples.
  • step S24 may include: cropping the second feature map to obtain a third feature map corresponding to the training sample; inputting the third feature map into the classification sub-network to obtain the first probability that the training sample belongs to the target category; determining the first loss of the classification sub-network according to the first probability that the training sample belongs to the target category and the label information of the training sample; and adjusting the network parameters of the classification sub-network according to the first loss.
  • the position, in the second feature map corresponding to the sample image, of the third feature map corresponding to the training sample can be determined, and the second feature map is cropped according to this position to obtain the third feature map corresponding to the training sample. It is understandable that the second feature map has multiple scales, and the cropped third feature map also has multiple scales.
  • the third feature map of the training sample is input into the classification sub-network of the target detection network, and the first probability that the training sample belongs to the target category is output. Then, through formula 1, the first loss of the classification sub-network can be determined according to the first probability and the label information of the training sample.
  • L_ft represents the first loss
  • y represents the label information of the training sample
  • y' represents the first probability of the output of the classification sub-network.
  • γ and α are hyperparameters. Among them, γ is mainly used to reduce the weight of the easy-to-classify training samples, so that the classification sub-network of the target detection network pays more attention to the hard-to-classify training samples.
  • the value of γ may be 2.
  • α is mainly used to balance the ratio of positive samples to negative samples in the training samples, effectively reducing the problem of serious imbalance between positive and negative samples in target detection. In some embodiments, the value of α may be 0.25.
  • the first threshold and the second threshold can be set as required.
  • the first threshold may be set to a value closer to 1, for example, may be set to 0.9 or 0.95, etc.
  • the second threshold may be set to a value closer to 0, for example, may be set to 0.05 or 0.1. This embodiment of the present disclosure does not limit the settings of the first threshold and the second threshold.
  • the L_ft obtained for the easily classified training samples is relatively small. That is to say, the first loss caused by the easy-to-classify training samples is relatively small, and the impact on the network parameters of the classification sub-network is relatively small. This is equivalent to reducing the weight of easily classified training samples.
  • the third threshold and the fourth threshold can be set as required.
  • the third and fourth thresholds may be set to values close to 0.5.
  • the third threshold may be set to 0.55 or 0.6, etc.
  • the fourth threshold may be set to 0.4 or 0.45, etc. This embodiment of the present disclosure does not limit the settings of the third threshold and the fourth threshold.
  • the L_ft obtained for the hard-to-classify training samples is relatively large. That is to say, the first loss brought by the hard-to-classify samples is relatively large, and the impact on the network parameters of the classification sub-network is relatively large, which is equivalent to increasing the weight of the hard-to-classify training samples, making the classification sub-network pay more attention to the hard-to-classify training samples.
  • a smoothing operation can be performed on the label information of the training samples; for example, the value of y can be softened from 0 and 1 to 0.1 and 0.9, so as to enhance the generalization performance of the target detection network.
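  • Formula 1 itself is not reproduced in this text; since the description of γ (down-weighting easy samples) and α (balancing positive and negative samples) matches the standard focal loss, the sketch below assumes that form, with γ = 2, α = 0.25 and the 0.1/0.9 label smoothing mentioned above.

    import torch

    def focal_loss(p, y, gamma=2.0, alpha=0.25, smooth=0.1):
        """Binary focal loss with label smoothing (assumed form of formula 1).

        p: predicted first probability that a training sample is the target category
        y: hard labels (1 = positive sample, 0 = negative sample)
        """
        y = y.float() * (1 - 2 * smooth) + smooth      # soften 0/1 labels to 0.1/0.9
        p = p.clamp(1e-6, 1 - 1e-6)
        loss_pos = -alpha * (1 - p) ** gamma * y * torch.log(p)
        loss_neg = -(1 - alpha) * p ** gamma * (1 - y) * torch.log(1 - p)
        return (loss_pos + loss_neg).mean()
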
  • the sample images to be used include: positive sample images
  • the label information to be used includes: the bounding box of the second object.
  • training the regression sub-network of the target detection network may include steps S31 to S36.
  • step S31 feature extraction is performed on the positive sample image to obtain fourth feature maps of multiple scales of the positive sample image.
  • the fourth feature map may represent a feature map of the positive sample image.
  • Step S31 may refer to step S21.
  • step S32 a plurality of second reference frames in the positive sample image are determined according to the fourth feature maps of the plurality of scales and a plurality of preset anchor frames.
  • Step S32 may refer to step S22.
  • In step S33, for any bounding box of the second object in the sample image, the intersection-over-union between the bounding box and the plurality of second reference frames is determined, and the second reference frame with the largest intersection-over-union is determined as the matching box corresponding to the bounding box.
  • step S34 for any bounding box of the second object in the sample image, the fifth feature map corresponding to the matching box is input into the regression sub-network to obtain a prediction box of the matching box.
  • the fifth feature map may represent the feature map corresponding to the matching frame.
  • reference may be made to the manner of obtaining the third feature map corresponding to the training sample in step S24.
  • step S35 for any bounding box of the second object in the sample image, the second loss of the regression sub-network is determined according to the difference between the bounding box and the predicted box of the corresponding matching box.
  • step S35 may include: determining the first regression loss of the matching box according to the coordinate offset and the intersection-over-union between the bounding box and the prediction box; determining the second regression loss of the matching box according to the intersection, union and minimum closed area between the bounding box and the prediction box; and determining the second loss of the regression sub-network according to the first regression loss and the second regression loss.
  • the first regression loss can be determined by formula two:
  • W_iou represents the weight of the prediction box
  • W_iou = e^(-iou) + 0.4
  • iou represents the intersection ratio between the prediction box and the corresponding bounding box
  • x represents the coordinate offset of the prediction box relative to the corresponding bounding box.
  • Through the weight W_iou, a prediction box with a smaller intersection-over-union is given a larger loss value, so that when the regression sub-network is trained using the matching box corresponding to that prediction box, the parameters of the regression sub-network are updated more strongly.
  • a second regression loss is introduced in the embodiment of the present disclosure to make the positioning of the second object more accurate.
  • the second regression loss can be determined by formula three;
  • L_GIoU represents the second regression loss
  • A and B represent the prediction box and the corresponding bounding box, respectively
  • C represents the minimum closed area of A and B
  • A ∪ B represents the union of the prediction box and the corresponding bounding box
  • A ∩ B represents the intersection of the prediction box and the corresponding bounding box.
  • the overlapping area and the non-overlapping area between the prediction box and the corresponding bounding box are optimized, so as to more accurately locate the area where the second object is located.
  • the weighted summation of the first regression loss and the second regression loss may be performed to obtain the second loss of the regression sub-network.
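  • Formulas 2 and 3 are only described through their variables here. The sketch below therefore assumes a coordinate-offset term weighted by W_iou = e^(-iou) + 0.4 for the first regression loss and the standard GIoU loss for the second, combined by a weighted sum; axis-aligned corner-format 3D boxes and the smooth-L1 offset term are assumptions.

    import torch
    import torch.nn.functional as F

    def iou_giou_3d(a, b):
        """IoU and GIoU for axis-aligned 3D boxes (z1, y1, x1, z2, y2, x2)."""
        lo = torch.max(a[..., :3], b[..., :3])
        hi = torch.min(a[..., 3:], b[..., 3:])
        inter = (hi - lo).clamp(min=0).prod(-1)
        vol = lambda box: (box[..., 3:] - box[..., :3]).clamp(min=0).prod(-1)
        union = vol(a) + vol(b) - inter
        iou = inter / union.clamp(min=1e-6)
        # Minimum closed (enclosing) region C of A and B.
        c = (torch.max(a[..., 3:], b[..., 3:]) - torch.min(a[..., :3], b[..., :3])).prod(-1)
        giou = iou - (c - union) / c.clamp(min=1e-6)
        return iou, giou

    def regression_loss(pred, gt, w_giou=1.0):
        """Sketch of the two-part regression loss described above."""
        iou, giou = iou_giou_3d(pred, gt)
        w_iou = torch.exp(-iou) + 0.4                      # W_iou = e^(-iou) + 0.4
        offset = F.smooth_l1_loss(pred, gt, reduction="none").sum(-1)
        loss1 = (w_iou * offset).mean()                    # assumed form of formula 2
        loss2 = (1.0 - giou).mean()                        # standard GIoU loss (formula 3)
        return loss1 + w_giou * loss2
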
  • step S36 the network parameters of the regression sub-network are adjusted according to the second loss.
  • the sample images to be used include: positive sample images
  • the label information to be used includes: the outline of the second object.
  • training the segmentation sub-network of the target detection network may include steps S41 to S44.
  • step S41 feature extraction is performed on the positive sample image to obtain fourth feature maps of multiple scales of the positive sample image.
  • Step S41 may refer to step S31.
  • step S42 the fourth feature maps of the multiple scales are input into the segmentation sub-network to obtain the second probability that each pixel of the positive sample image belongs to the target category.
  • step S43 the third loss of the segmentation sub-network is determined according to the number of pixels in the positive sample image, the contour of the second object in the positive sample image, and the second probability that each pixel belongs to the target category.
  • the third loss of the segmentation sub-network can be determined by Equation 4:
  • L_dice represents the third loss
  • N is the number of pixels in the positive sample image
  • i represents the ith pixel in the positive sample image
  • p_i represents the second probability, output by the segmentation sub-network, that the i-th pixel in the positive sample image belongs to the target category; g_i represents the true category of the i-th pixel in the positive sample image, and the value of g_i is 0 or 1, where a value of 0 indicates that the i-th pixel belongs to a non-target category, and a value of 1 indicates that the i-th pixel belongs to the target category.
  • g_i can be determined according to the contour of each second object in the positive sample image.
  • the third loss is used in the embodiment of the present disclosure to optimize the segmentation task, which is beneficial to balancing the positive and negative sample images, thereby improving the ability to segment second objects with smaller sizes.
  • step S44 the network parameters of the segmentation sub-network are adjusted according to the third loss.
  • the first stage of training is also completed, and the target detection network in the first state is obtained.
  • the second stage is entered.
  • the target detection network in the first state can be trained according to the second training set to obtain a trained target detection network.
  • the process of training the target detection network in the first state may be a fine-tuning process.
  • the second training set includes a plurality of sample images and second label information of the sample images, where the second label information includes false positive areas, false negative areas and true positive areas in the sample images.
  • the method further includes: processing the sample image through the target detection network in the first state to obtain a predicted position of the second object in the sample image; and determining the false positive area, false negative area and true positive area in the sample image according to the predicted position and the real position of the second object.
  • The false positive (FP) area indicates a region where the first annotation information of the sample image shows that it is not the second object, but the output of the classification sub-network in the first state shows that it is the second object; the true positive (TP) area indicates a region where the first annotation information of the sample image shows that it is the second object, and the output of the classification sub-network in the first state also shows that it is the second object.
  • The false negative (FN) area indicates a region where the first annotation information of the sample image shows that it is the second object, but the output of the classification sub-network in the first state shows that it is not the second object; the true negative (TN) area indicates a region where the first annotation information of the sample image shows that it is not the second object, and the output of the classification sub-network in the first state also shows that it is not the second object.
  • the negative sample images in the second training set can be determined according to the false positive regions.
  • the positive sample images in the second training set can be determined according to the true positive regions and the false negative regions.
  • all false positive regions may be used as negative sample images in the second training set; false negative regions may be triple-enhanced, and a portion (e.g., 2/3) of true positive regions may be selected, together forming the positive sample images in the second training set, as illustrated in the sketch below.
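  • The construction of the second training set from the FP/FN/TP regions can be sketched as follows; the 3x enhancement of false negative regions and the 2/3 sampling of true positive regions follow the example above, while the augmentation function and region containers are placeholders.

```python
import random

def build_second_training_set(fp_regions, fn_regions, tp_regions, augment):
    """Assemble positive/negative samples for the fine-tuning (second) stage.

    fp_regions: false positive regions -> negative sample images.
    fn_regions: false negative regions -> triple-enhanced positives.
    tp_regions: true positive regions  -> a portion (e.g. 2/3) used as positives.
    augment(region, times): user-supplied augmentation function (placeholder).
    """
    negatives = list(fp_regions)

    positives = []
    for region in fn_regions:
        positives.extend(augment(region, times=3))   # 3x enhancement of FN regions

    k = int(len(tp_regions) * 2 / 3)                 # keep roughly 2/3 of TP regions
    positives.extend(random.sample(list(tp_regions), k))

    return positives, negatives
```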
  • the following describes the process of training the target detection network in the first state according to the second training set.
  • the training of the target detection network in the first state according to the second training set to obtain the trained target detection network includes: according to the second training set, respectively classifying the classification sub-network, The regression sub-network and the segmentation sub-network are trained.
  • the training of the target detection network in the first state is performed based on multi-task learning of classification, regression and segmentation, and the ability to recognize objects of the target category is improved by utilizing the correlation between tasks.
  • the sample images used include: false positive area, false negative area and true positive area
  • the labeling information to be used includes: the bounding box of the second object.
  • The training of the classification sub-network of the target detection network in the first state according to the second training set may include: according to the second label information, cropping the second feature maps of multiple scales of the sample image to determine the fifth feature maps corresponding to the false positive area, the false negative area and the true positive area; inputting the fifth feature maps into the classification sub-network to obtain the third probability that the false positive area, the false negative area and the true positive area belong to the target category; determining the fourth loss of the classification sub-network according to the third probability that the false positive area, the false negative area and the true positive area belong to the target category, and the true categories of the false positive area, the false negative area and the true positive area; and adjusting the network parameters of the classification sub-network according to the fourth loss.
  • the above process may refer to steps S21 to S24.
  • the sample images used include true positive regions and false negative regions
  • the annotation information to be used includes: the bounding box of the second object.
  • The training of the regression sub-network of the target detection network in the first state according to the second training set may include: determining the bounding boxes matching the true positive regions and false negative regions; inputting the sixth feature maps corresponding to the true positive area and the false negative area into the regression sub-network to obtain the prediction boxes of the true positive area and the false negative area; determining the fifth loss of the regression sub-network according to the difference between the prediction boxes of the true positive area and the false negative area and the corresponding bounding boxes; and adjusting the network parameters of the regression sub-network according to the fifth loss.
  • the above process may refer to steps S31 to S36.
  • the sample images used include true positive regions and false negative regions
  • the annotation information to be used includes: the outline of the second object.
  • Training the segmentation sub-network of the target detection network in the first state may include: inputting the sixth feature maps corresponding to the true positive area and the false negative area into the segmentation sub-network to obtain the fourth probability that each pixel in the true positive area and the false negative area belongs to the target category; determining the sixth loss of the segmentation sub-network according to the number of pixels in the true positive area and the false negative area, the contour of the second object in the true positive area and the false negative area, and the fourth probability that each pixel belongs to the target category; and adjusting the network parameters of the segmentation sub-network according to the sixth loss.
  • the above process may refer to steps S41 to S44.
  • The third probability of the false positive region may be used as the coefficient of its corresponding loss (including the fourth loss), and the third probabilities of the false negative region and the true positive region may be used as the coefficients of their corresponding losses (including the fourth loss, the fifth loss and the sixth loss). In this way, convergence can be accelerated and training time can be saved.
  • An online hard example mining method may be used (for example, each iteration focuses on optimizing the 10 regions with the largest loss values) to train the target detection network in the first state into the trained target detection network. In this way, convergence can be accelerated and training time can be saved.
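  • A minimal sketch of the online hard example mining step mentioned above: per-region losses are computed and only the k regions with the largest loss values (here 10) contribute to the parameter update; the surrounding training loop and the loss computation itself are assumed.

```python
import torch

def ohem_loss(per_region_losses, top_k=10):
    """Online hard example mining: keep only the hardest regions.

    per_region_losses: (M,) loss value of each candidate region in this iteration.
    Only the top_k regions with the largest losses are averaged, so optimization
    focuses on the hardest examples and convergence is accelerated.
    """
    k = min(top_k, per_region_losses.numel())
    hardest, _ = torch.topk(per_region_losses, k)
    return hardest.mean()
```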
  • FIG. 4 is a schematic structural diagram of the composition of a target detection architecture provided by an embodiment of the present disclosure.
  • the target detection architecture includes a feature extraction network 40 and a target detection network 50 .
  • the feature extraction network 40 includes a basic network and FPN
  • the target detection network 50 includes a classification sub-network 51 , a regression sub-network 52 and a segmentation sub-network 53 .
  • the process of detecting lung nodules from the lung CT image with the target detection network shown in FIG. 4 may include: firstly, the lung CT image may be divided into image blocks of a specified size, each image block being a first image; then, each first image is respectively input into the target detection network shown in FIG. 4 to obtain the bounding box and outline of the lung nodules in each first image; finally, according to the bounding box and contour of the lung nodules in each first image, the bounding box and contour of the lung nodules in the lung CT image can be determined.
  • the first image is input into the feature extraction network shown in FIG. 4 for processing, and first feature maps of multiple scales of the first image are obtained.
  • the first feature maps of multiple scales of the first image are respectively input into the classification sub-network, the regression sub-network and the segmentation sub-network of the trained target detection network to obtain whether there are lung nodules in the first image, as well as the bounding box and contour of each lung nodule present in the first image.
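  • The patch-based inference flow of FIG. 4 (splitting the lung CT volume into fixed-size first images, detecting nodules in each block, and mapping the results back to volume coordinates) can be sketched as follows; the block size, stride and detector interface are illustrative assumptions.

```python
import numpy as np

def detect_nodules(ct_volume, detector, block=(128, 128, 128), stride=(96, 96, 96)):
    """Split a lung CT volume into first images, run the detector on each block,
    and map the detected bounding boxes back to volume coordinates.

    ct_volume: 3D numpy array (D, H, W).
    detector(block) is assumed to return a list of (z1, y1, x1, z2, y2, x2, score)
    boxes in block-local coordinates.
    """
    results = []
    D, H, W = ct_volume.shape
    for z in range(0, max(D - block[0], 0) + 1, stride[0]):
        for y in range(0, max(H - block[1], 0) + 1, stride[1]):
            for x in range(0, max(W - block[2], 0) + 1, stride[2]):
                patch = ct_volume[z:z + block[0], y:y + block[1], x:x + block[2]]
                for (z1, y1, x1, z2, y2, x2, score) in detector(patch):
                    results.append((z1 + z, y1 + y, x1 + x,
                                    z2 + z, y2 + y, x2 + x, score))
    return results  # duplicates from overlapping blocks would be merged by NMS
```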
  • FIG. 5 is a schematic diagram of a prediction frame of a lung nodule when the target detection network shown in FIG. 4 is the target detection network in the first state.
  • As shown in FIG. 5, when the target detection network shown in FIG. 4 is the target detection network in the first state trained through the first stage, there are a large number of false positive lung nodules 61 and some false negative lung nodules 62.
  • FIG. 6 is a schematic diagram of a prediction frame of a lung nodule when the target detection network shown in FIG. 4 is a trained target detection network. As shown in Figure 6, when the target detection network shown in Figure 4 is a trained target detection network trained through the first and second stages, the number of false positive lung nodules is reduced.
  • the target detection method provided by the embodiment of the present application can be used to detect whether there is a first object in the first image, and can obtain the position of the first object in the first image.
  • the target detection method provided in this embodiment of the present application can be used to detect whether there is a lung nodule in a lung CT image, and can obtain the location of lung nodules in the lung CT image.
  • the target detection method provided in this embodiment of the present application can be used in any suitable scenario that needs to detect whether there is a lung nodule in a lung CT image.
  • the target detection method can be used to screen lung nodules in the lung CT images to be detected through remote cloud platforms or clinical landing equipment in hospitals, which is beneficial for improving the medical level in regions where medical resources are limited.
  • the automatic screening of pulmonary nodules in lung CT images can be completed through the remote cloud platform or the hospital's clinical landing equipment, which provides an auxiliary means for doctors to make rapid and accurate diagnoses.
  • Another example is the automatic screening of pulmonary nodules on the obtained lung CT images in the physical examination center to improve the detection level of pulmonary nodules.
  • the embodiments of the present disclosure also provide target detection devices, electronic devices, computer-readable storage media, computer programs, and computer program products, all of which can be used to implement any target detection method provided by the present disclosure; for the corresponding technical solutions and descriptions, refer to the corresponding description in the method section.
  • FIG. 7 is a schematic structural diagram of a target detection apparatus provided by an embodiment of the present disclosure. As shown in FIG. 7 , the apparatus 700 includes:
  • the extraction part 701 is configured to perform feature extraction on the first image to be detected to obtain first feature maps of multiple scales of the first image;
  • the first processing part 702 is configured to process the first feature maps of multiple scales of the first image through the trained target detection network to obtain the position of the first object of the target category existing in the first image;
  • the target detection network is trained in a recursive manner;
  • the target detection network includes a classification sub-network, a regression sub-network and a segmentation sub-network, where the classification sub-network is used to determine whether the first object exists in the first image, the regression sub-network is used to determine the bounding box of the first object existing in the first image, and the segmentation sub-network is used to determine the contour of the first object existing in the first image.
  • the apparatus further includes:
  • the first training part is configured to train the target detection network according to a first training set to obtain a target detection network in a first state, where the first training set includes a plurality of sample images and first labeling information of the sample images, and the first labeling information includes the real position of the second object in the sample image;
  • the second processing part is configured to process the sample image through the target detection network in the first state to obtain the predicted position of the second object in the sample image;
  • a determining part configured to determine a false positive area, a false negative area and a true positive area in the sample image according to the predicted position and the real position of the second object;
  • the second training part is configured to train the target detection network in the first state according to a second training set to obtain a trained target detection network, where the second training set includes a plurality of sample images and second annotation information of the sample images, and the second annotation information includes the false positive area, the false negative area and the true positive area in the sample image.
  • the plurality of sample images include positive sample images and negative sample images
  • the apparatus further includes: a cropping part configured to crop the marked second image to obtain a positive sample image of a preset size and a negative sample image, the positive sample image includes at least one second object, and the negative sample image does not include the second object.
  • the real position of the second object includes a bounding box of the second object
  • the first training part is further configured to: perform feature extraction on the sample image to obtain second feature maps of multiple scales of the sample image; determine multiple first reference frames in the sample image according to the second feature maps of multiple scales and multiple preset anchor frames; determine a preset number of training samples from the plurality of first reference frames according to the bounding box of the second object in the sample image, the training samples including positive samples whose annotation information belongs to the target category and negative samples whose annotation information does not belong to the target category; and train the classification sub-network according to the training samples.
  • The determining of a preset number of training samples from the plurality of first reference frames according to the bounding box of the second object in the sample image includes: dividing the bounding boxes into multiple bounding box sets, where the size of each bounding box in a bounding box set falls within a preset size interval; for any bounding box set, removing from the multiple first reference frames those first reference frames that have already been determined as training samples, to obtain a reference frame set corresponding to the bounding box set; for any bounding box in the bounding box set, determining the positive samples and negative samples corresponding to the bounding box according to the intersection-over-union between the bounding box and each first reference frame in the corresponding reference frame set, where the number of positive samples is negatively correlated with the size interval of the bounding box set; and processing each bounding box set in order of size interval from small to large to obtain the preset number of training samples (a simplified sketch follows below).
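  • A simplified sketch of the size-interval-based sampling above is given below; the size measure (longest box side), the per-interval positive counts and the caller-supplied IoU function are assumptions for illustration.

```python
def select_training_samples(bounding_boxes, reference_frames, size_bins,
                            pos_per_bin, iou_fn):
    """Pick positive/negative reference frames per bounding-box size interval.

    size_bins: (lo, hi) size intervals, ordered from small to large.
    pos_per_bin: positives kept per bounding box for each interval
                 (larger counts for smaller intervals).
    iou_fn(a, b): intersection-over-union between two (x1, y1, x2, y2) boxes.
    """
    def box_size(b):  # longest side of an (x1, y1, x2, y2) box
        return max(b[2] - b[0], b[3] - b[1])

    used, positives, negatives = set(), [], []
    for (lo, hi), n_pos in zip(size_bins, pos_per_bin):
        for box in (b for b in bounding_boxes if lo <= box_size(b) < hi):
            # frames already chosen as training samples are excluded
            ranked = sorted((i for i in range(len(reference_frames)) if i not in used),
                            key=lambda i: iou_fn(reference_frames[i], box),
                            reverse=True)
            pos = ranked[:n_pos]                   # highest-IoU frames -> positives
            neg = ranked[len(pos):][-n_pos:]       # lowest-IoU frames  -> negatives
            positives += pos
            negatives += neg
            used.update(pos + neg)
    return positives, negatives
```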
  • The training of the classification sub-network according to the training samples includes: cropping the second feature maps to obtain third feature maps corresponding to the training samples; inputting the third feature maps into the classification sub-network to obtain the first probability that the training samples belong to the target category; determining the first loss of the classification sub-network according to the first probability that the training samples belong to the target category and the label information of the training samples; and adjusting the network parameters of the classification sub-network according to the first loss.
  • the real position of the second object includes a bounding box of the second object
  • the first training part is further configured to: perform feature extraction on the positive sample image to obtain fourth feature maps of multiple scales of the positive sample image; determine multiple second reference frames in the positive sample image according to the fourth feature maps of multiple scales and multiple preset anchor frames; for any bounding box of the second object in the sample image, determine the intersection-over-union between the bounding box and the plurality of second reference frames, and determine the second reference frame with the largest intersection-over-union as the matching frame corresponding to the bounding box; input the fifth feature map corresponding to the matching frame into the regression sub-network to obtain the prediction box of the matching frame; determine the second loss of the regression sub-network according to the difference between the bounding box and the prediction box; and adjust the network parameters of the regression sub-network according to the second loss.
  • the first training part is further configured to: determine the first regression loss of the matching box according to the coordinate offset and the intersection-over-union between the bounding box and the prediction box; determine the second regression loss of the matching box according to the intersection, union and minimum enclosing area between the bounding box and the prediction box; and determine the second loss of the regression sub-network according to the first regression loss and the second regression loss.
  • the real position of the second object includes the outline of the second object
  • the first training part is further configured to: perform feature extraction on the positive sample image to obtain fourth feature maps of multiple scales of the positive sample image; input the fourth feature maps of multiple scales into the segmentation sub-network to obtain the second probability that each pixel of the positive sample image belongs to the target category; determine the third loss of the segmentation sub-network according to the number of pixels in the positive sample image, the contour of the second object in the positive sample image, and the second probability that each pixel belongs to the target category; and adjust the network parameters of the segmentation sub-network according to the third loss.
  • the second training part is further configured to: according to the second label information, crop the second feature maps of multiple scales of the sample image to determine the fifth feature maps corresponding to the false positive areas, false negative areas and true positive regions; input the fifth feature maps into the classification sub-network to obtain the third probability that the false positive region, the false negative region and the true positive region belong to the target category; determine the fourth loss of the classification sub-network according to the third probability that the false positive area, the false negative area and the true positive area belong to the target category, and the true categories of the false positive area, the false negative area and the true positive area; and adjust the network parameters of the classification sub-network according to the fourth loss.
  • the second training part is further configured to: according to the second label information, crop the second feature maps of multiple scales of the sample image to obtain the sixth feature maps corresponding to the true positive regions and false negative regions; determine the bounding boxes matching the true positive region and the false negative region; input the sixth feature maps into the regression sub-network to obtain the prediction boxes of the true positive region and the false negative region; determine the fifth loss of the regression sub-network according to the difference between the prediction boxes of the true positive area and the false negative area and the corresponding bounding boxes; and adjust the network parameters of the regression sub-network according to the fifth loss.
  • the second training part is further configured to: input the sixth feature maps corresponding to the true positive area and the false negative area into the segmentation sub-network to obtain the fourth probability that each pixel in the true positive area and the false negative area belongs to the target category; determine the sixth loss of the segmentation sub-network according to the number of pixels in the true positive area and the false negative area, the contour of the second object in the true positive area and the false negative area, and the fourth probability that each pixel belongs to the target category; and adjust the network parameters of the segmentation sub-network according to the sixth loss.
  • the first image includes a 2D medical image and/or a 3D medical image
  • the target category includes a nodule and/or a cyst.
  • the functions or included parts of the apparatus provided in the embodiments of the present disclosure may be configured to execute the methods described in the above method embodiments, and the specific implementation may refer to the descriptions in the above method embodiments.
  • a "part" may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course, a unit, a module, or a non-modularity.
  • Embodiments of the present disclosure further provide a computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing method is implemented.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • An embodiment of the present disclosure further provides an electronic device, including: a processor; a memory configured to store instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
  • Embodiments of the present disclosure also provide a computer program, including computer-readable codes. When the computer-readable codes run on a device, the processor in the device executes instructions for implementing the target detection method provided in any of the above embodiments.
  • Embodiments of the present disclosure further provide a computer program product for storing computer-readable instructions, which, when executed, cause a computer to execute the steps of the target detection method provided by any of the foregoing embodiments.
  • the electronic device may be provided as a terminal, server or other form of device.
  • FIG. 8 is a schematic structural diagram of an electronic device 800 according to an embodiment of the disclosure.
  • electronic device 800 may be a terminal such as a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, or personal digital assistant.
  • an electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power supply component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812 , sensor component 814 , and communication component 816 .
  • the processing component 802 generally controls the overall operation of the electronic device 800, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 802 may include one or more processors 820 to execute instructions to perform all or some of the steps of the methods described above. Additionally, processing component 802 may include one or more modules that facilitate interaction between processing component 802 and other components. For example, processing component 802 may include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
  • Memory 804 is configured to store various types of data to support operation at electronic device 800 . Examples of such data include instructions for any application or method operating on electronic device 800, contact data, phonebook data, messages, pictures, videos, and the like.
  • the memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random-Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • Power supply assembly 806 provides power to various components of electronic device 800 .
  • Power supply components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to electronic device 800 .
  • Multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and the user.
  • the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
  • multimedia component 808 includes a front-facing camera and/or a rear-facing camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front and rear cameras can be a fixed optical lens system or have focal length and optical zoom capability.
  • Audio component 810 is configured to output and/or input audio signals.
  • audio component 810 includes a microphone (MIC) that is configured to receive external audio signals when electronic device 800 is in operating modes, such as calling mode, recording mode, and voice recognition mode.
  • the received audio signal may be stored in memory 804 or transmitted via communication component 816 .
  • audio component 810 also includes a speaker for outputting audio signals.
  • the I/O interface 812 provides an interface between the processing component 802 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
  • Sensor assembly 814 includes one or more sensors for providing status assessment of various aspects of electronic device 800 .
  • the sensor assembly 814 can detect the open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800; the sensor assembly 814 can also detect a change in position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and changes in the temperature of the electronic device 800.
  • Sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • Sensor assembly 814 may also include a light sensor, such as a Complementary Metal-Oxide-Semiconductor (CMOS) or Charge Coupled Device (CCD) image sensor, for use in imaging applications.
  • CMOS Complementary Metal-Oxide-Semiconductor
  • CCD Charge Coupled Device
  • the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 816 is configured to facilitate wired or wireless communication between electronic device 800 and other devices.
  • the electronic device 800 can access a wireless network based on a communication standard, such as a wireless network (WiFi), a second generation mobile communication technology (2G) or a third generation mobile communication technology (3G), or a combination thereof.
  • the communication component 816 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 816 also includes a Near Field Communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wide Band (UWB) technology, Bluetooth (BT) technology and other technologies.
  • the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above method.
  • a non-volatile computer-readable storage medium is also provided, such as a memory 804 comprising computer program instructions executable by the processor 820 of the electronic device 800 to perform the above method.
  • FIG. 9 is a schematic structural diagram of an electronic device 1900 according to an embodiment of the present disclosure.
  • the electronic device 1900 may be implemented as a server.
  • an electronic device 1900 includes a processing component 1922, which in some embodiments may include one or more processors, and memory resources, represented by memory 1932, for storing instructions executable by the processing component 1922, such as application programs.
  • An application program stored in memory 1932 may include one or more modules, each corresponding to a set of instructions.
  • the processing component 1922 is configured to execute instructions to perform the above-described methods.
  • the electronic device 1900 may also include a power supply assembly 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input output (I/O) interface 1958 .
  • the electronic device 1900 can operate based on an operating system stored in the memory 1932, such as a Microsoft server operating system (Windows Server TM ), a graphical user interface based operating system (Mac OS X TM ) introduced by Apple, a multi-user multi-process computer operating system (Unix TM ), Free and Open Source Unix-like Operating System (Linux TM ), Open Source Unix-like Operating System (FreeBSD TM ) or the like.
  • a non-volatile computer-readable storage medium such as memory 1932 comprising computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above-described method.
  • Embodiments of the present disclosure may be one or more of a system, a method, a computer-readable storage medium, a computer program, or a computer program product.
  • the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling the processor to implement the target detection method provided by any of the above embodiments of the present disclosure.
  • a computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A non-exhaustive list of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital video disc (DVD), memory sticks, floppy disks, mechanical encoding devices such as punched cards or raised structures in grooves on which instructions are stored, and any suitable combination of the foregoing.
  • Computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through electrical wires.
  • the computer readable program instructions described herein may be downloaded to various computing/processing devices from a computer readable storage medium, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from a network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device .
  • the computer program instructions for performing the steps of the embodiments of the present disclosure may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • the computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Custom electronic circuits, such as programmable logic circuits, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), can be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuits may execute the computer-readable program instructions to implement embodiments of the present disclosure.
  • Embodiments of the present disclosure are described herein with reference to flowchart illustrations and/or structural diagrams of methods, apparatus (systems) and computer program products according to embodiments of the present disclosure. It will be understood that each block of the flowcharts and/or structural diagrams, and combinations of blocks in the flowcharts and/or structural diagrams, can be implemented by computer readable program instructions.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions cause a computer, programmable data processing apparatus and/or other equipment to operate in a specific manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • Computer-readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other equipment, so that a series of operational steps are performed on the computer, other programmable data processing apparatus, or other equipment to produce a computer-implemented process, whereby the instructions executing on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • Each block in the flowchart or block diagram may represent a module, segment, or portion of instructions that contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the blocks may also occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a special-purpose hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
  • the computer program product can be specifically implemented by hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium, and in other embodiments, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) and the like.
  • Embodiments of the present disclosure provide a target detection method and device, electronic equipment, storage medium, computer program product, and computer program, wherein the method includes: performing feature extraction on a first image to be detected to obtain first feature maps of multiple scales of the first image; and processing the first feature maps of multiple scales of the first image through the trained target detection network to obtain the position of the first object of the target category in the first image. According to the embodiments of the present disclosure, the first object of the target category existing in the image to be detected can be detected, and the sensitivity and accuracy of target detection can be improved.

Abstract

A target detection method and apparatus, an electronic device, a storage medium, a computer program product and a computer program. The method comprises: performing feature extraction on a first image to be detected to obtain first feature maps of multiple scales of the first image (S11); and processing the first feature maps of multiple scales of the first image by means of a trained target detection network to obtain the position of a first object of a target category in the first image (S12).
PCT/CN2021/119982 2021-01-15 2021-09-23 Procédé et appareil de détection de cible, et dispositif électronique, support de stockage, produit de programme informatique et programme informatique WO2022151755A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110057241.X 2021-01-15
CN202110057241.XA CN112785565B (zh) 2021-01-15 2021-01-15 目标检测方法及装置、电子设备和存储介质

Publications (1)

Publication Number Publication Date
WO2022151755A1 true WO2022151755A1 (fr) 2022-07-21

Family

ID=75757108

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119982 WO2022151755A1 (fr) 2021-01-15 2021-09-23 Procédé et appareil de détection de cible, et dispositif électronique, support de stockage, produit de programme informatique et programme informatique

Country Status (2)

Country Link
CN (1) CN112785565B (fr)
WO (1) WO2022151755A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998749A (zh) * 2022-07-28 2022-09-02 北京卫星信息工程研究所 用于目标检测的sar数据扩增方法
CN116152487A (zh) * 2023-04-17 2023-05-23 广东广物互联网科技有限公司 一种基于深度IoU网络的目标检测方法、装置、设备及介质
CN116188502A (zh) * 2023-04-27 2023-05-30 尚特杰电力科技有限公司 光伏板红外图像分割方法、存储介质以及电子设备
CN116310656A (zh) * 2023-05-11 2023-06-23 福瑞泰克智能系统有限公司 训练样本确定方法、装置和计算机设备
CN116342607A (zh) * 2023-05-30 2023-06-27 尚特杰电力科技有限公司 输电线缺陷的识别方法、装置、电子设备及存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112785565B (zh) * 2021-01-15 2024-01-05 上海商汤智能科技有限公司 目标检测方法及装置、电子设备和存储介质
CN113749690B (zh) * 2021-09-24 2024-01-30 无锡祥生医疗科技股份有限公司 血管的血流量测量方法、装置及存储介质
CN117437397A (zh) * 2022-07-15 2024-01-23 马上消费金融股份有限公司 模型训练方法、目标检测方法及装置
CN115375955B (zh) * 2022-10-25 2023-04-18 北京鹰瞳科技发展股份有限公司 目标检测模型的训练方法、目标检测的方法及相关产品
CN115439699B (zh) * 2022-10-25 2023-06-30 北京鹰瞳科技发展股份有限公司 目标检测模型的训练方法、目标检测的方法及相关产品

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109003260B (zh) * 2018-06-28 2021-02-09 深圳视见医疗科技有限公司 Ct图像肺结节检测方法、装置、设备及可读存储介质
CN110827310A (zh) * 2019-11-01 2020-02-21 北京航空航天大学 Ct图像自动检测方法与系统

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109003267A (zh) * 2017-08-09 2018-12-14 深圳科亚医疗科技有限公司 从3d图像自动检测目标对象的计算机实现方法和系统
US20190073770A1 (en) * 2017-09-06 2019-03-07 International Business Machines Corporation Disease detection algorithms trainable with small number of positive samples
CN110210535A (zh) * 2019-05-21 2019-09-06 北京市商汤科技开发有限公司 神经网络训练方法及装置以及图像处理方法及装置
CN111368923A (zh) * 2020-03-05 2020-07-03 上海商汤智能科技有限公司 神经网络训练方法及装置、电子设备和存储介质
CN112785565A (zh) * 2021-01-15 2021-05-11 上海商汤智能科技有限公司 目标检测方法及装置、电子设备和存储介质

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998749A (zh) * 2022-07-28 2022-09-02 北京卫星信息工程研究所 用于目标检测的sar数据扩增方法
CN116152487A (zh) * 2023-04-17 2023-05-23 广东广物互联网科技有限公司 一种基于深度IoU网络的目标检测方法、装置、设备及介质
CN116188502A (zh) * 2023-04-27 2023-05-30 尚特杰电力科技有限公司 光伏板红外图像分割方法、存储介质以及电子设备
CN116188502B (zh) * 2023-04-27 2023-07-21 尚特杰电力科技有限公司 光伏板红外图像分割方法、存储介质以及电子设备
CN116310656A (zh) * 2023-05-11 2023-06-23 福瑞泰克智能系统有限公司 训练样本确定方法、装置和计算机设备
CN116310656B (zh) * 2023-05-11 2023-08-15 福瑞泰克智能系统有限公司 训练样本确定方法、装置和计算机设备
CN116342607A (zh) * 2023-05-30 2023-06-27 尚特杰电力科技有限公司 输电线缺陷的识别方法、装置、电子设备及存储介质
CN116342607B (zh) * 2023-05-30 2023-08-08 尚特杰电力科技有限公司 输电线缺陷的识别方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN112785565A (zh) 2021-05-11
CN112785565B (zh) 2024-01-05

Similar Documents

Publication Publication Date Title
WO2022151755A1 (fr) Procédé et appareil de détection de cible, et dispositif électronique, support de stockage, produit de programme informatique et programme informatique
TWI770754B (zh) 神經網路訓練方法及電子設備和儲存介質
TWI755853B (zh) 圖像處理方法、電子設備和電腦可讀儲存介質
CN110458127B (zh) 图像处理方法、装置、设备以及系统
CN110210571B (zh) 图像识别方法、装置、计算机设备及计算机可读存储介质
CN111091576B (zh) 图像分割方法、装置、设备及存储介质
TWI713054B (zh) 圖像分割方法及裝置、電子設備和儲存媒體
WO2021147257A1 (fr) Procédé et appareil d'apprentissage de réseau, procédé et appareil de traitement d'images et dispositif électronique et support de stockage
CN112767329B (zh) 图像处理方法及装置、电子设备
TWI754375B (zh) 圖像處理方法、電子設備、電腦可讀儲存介質
WO2022036972A1 (fr) Procédé et appareil de segmentation d'image, dispositif électronique et support de stockage
CN110473186B (zh) 一种基于医学图像的检测方法、模型训练的方法及装置
JP2022535219A (ja) 画像分割方法及び装置、電子機器並びに記憶媒体
CN112541928A (zh) 网络训练方法及装置、图像分割方法及装置和电子设备
CN114820584B (zh) 肺部病灶定位装置
CN113222038B (zh) 基于核磁图像的乳腺病灶分类和定位方法及装置
WO2022121170A1 (fr) Procédé et appareil de traitement d'image, dispositif électronique, support de stockage et programme
WO2023050691A1 (fr) Procédé et appareil de traitement d'image, dispositif électronique, support de stockage et programme
WO2021259390A2 (fr) Procédé et appareil de détection de plaques calcifiées sur des artères coronaires
KR20220034844A (ko) 이미지 처리 방법 및 장치, 전자 기기, 저장 매체 및 프로그램 제품
CN115170464A (zh) 肺图像的处理方法、装置、电子设备和存储介质
KR20220012407A (ko) 이미지 분할 방법 및 장치, 전자 기기 및 저장 매체
WO2023050690A1 (fr) Procédé et appareil de traitement d'images, dispositif électronique, support de stockage, et programme
CN113902730A (zh) 图像处理和神经网络训练方法及装置
CN112767347A (zh) 一种图像配准方法及装置、电子设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21918946

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07.12.2023)