WO2019015785A1 - METHOD AND SYSTEM FOR LEARNING A NEURAL NETWORK TO BE USED FOR SEMANTIC INSTANCE SEGMENTATION - Google Patents


Info

Publication number
WO2019015785A1
WO2019015785A1 (application PCT/EP2017/068550)
Authority
WO
WIPO (PCT)
Prior art keywords
vectors
neural network
vector
loss function
template image
Prior art date
Legal status
Ceased
Application number
PCT/EP2017/068550
Other languages
English (en)
French (fr)
Inventor
Hiroaki Shimizu
Davy NEVEN
Bert DE BRABANDERE
Luc Van Gool
Marc PROESMANS
Nico CORNELIS
Current Assignee
Katholieke Universiteit Leuven
Toyota Motor Europe NV SA
Original Assignee
Katholieke Universiteit Leuven
Toyota Motor Europe NV SA
Priority date
Filing date
Publication date
Application filed by Katholieke Universiteit Leuven, Toyota Motor Europe NV SA filed Critical Katholieke Universiteit Leuven
Priority to PCT/EP2017/068550 (WO2019015785A1)
Priority to JP2020502990A (JP6989688B2)
Publication of WO2019015785A1
Anticipated expiration
Ceased legal status (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks; auto-encoder networks; encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G06N3/0985 Hyperparameter optimisation; meta-learning; learning-to-learn

Definitions

  • The present invention relates to the field of semantic instance segmentation, and more precisely to the training of neural networks used for semantic instance segmentation.
  • Description of the Related Art
  • Semantic instance segmentation is a method for determining the types of objects in an image, for example acquired by a camera, while being able to differentiate objects of a same type.
  • Semantic segmentation methods have been used to differentiate objects of different types in an image.
  • However, a semantic segmentation method cannot differentiate two objects of the same type. For example, if an image to be analyzed comprises two overlapping cars and two overlapping pedestrians, a semantic segmentation method will only detect an area in the image corresponding to cars and an area in the image corresponding to pedestrians.
  • Various methods for semantic segmentation have been proposed which generally use (deep) convolutional networks.
  • Instance segmentation methods, conversely, only aim at identifying separate objects regardless of their type. If the above-mentioned image is analyzed using an instance segmentation method, four separate objects will be detected.
  • Various methods have been proposed to achieve this, and most notably methods using (deep) convolutional networks.
  • Some known methods require specific network architectures or rely on object proposals. For example, some methods use a multistage (or cascaded) pipeline in which the object proposal (or bounding box generation) is followed by a separate segmentation and/or classification step. These methods are not satisfying in terms of speed (because of the multistage computations) and segmentation quality (in particular when faced with occlusions).
  • A desired output of a semantic instance segmentation method for the above image could be a mask highlighting each car and each pedestrian with different colors, and labels indicating, for example, car1, car2, pedestrian1, pedestrian2.
  • Training a neural network is an iterative task which can be performed using template images, in which each element has already been identified, and a loss function.
  • A loss function typically consists of a calculation performed on the output of a neural network to determine whether this output is valid, i.e. whether it leads to a good detection of each element and its type.
  • The loss function generally yields a score which represents how far the output of a neural network is from an expected output.
  • The method of this document uses a loss function which ensures that pixels corresponding to the same object instance are close in the space of the output of the neural network (typically, the neural network outputs a vector for each pixel of an image).
  • This loss function also ensures that pixels that correspond to different objects remain far from each other in the network's output representation.
  • The loss function of this document therefore has a lower value when, in the output of the neural network, the vectors of pixels of a same object are close and the vectors of pixels of different objects are far apart, and a higher value otherwise.
  • the neural network is then modified by taking into account the result of the loss function so as to obtain, in the next iteration, a lower score for the loss function.
  • the loss function of this document is unsatisfactory. More precisely, the loss function of this document relies on the random selection of a limited number of vectors for each object in the image, and uses extensive calculations.
  • the present invention overcomes one or more deficiencies of the prior art by proposing a method for training iteratively a neural network to be used for semantic instance segmentation, wherein, for each iteration, the neural network outputs a vector for each pixel of a template image, wherein the template image comprises predefined elements each associated with pixels of the template image and the corresponding vectors.
  • Training the neural network is performed using a loss function which decreases until reaching a target value (for example zero) at least when: - for each vector belonging to an element, the distance between the vector and a center of the vectors of this element decreases, and - the distances between the centers of the vectors of different elements increase.
  • the target value is a value which is desired to obtain for the loss function. When the loss reaches or goes below the target value it can be considered that the training is complete.
  • the target value can be predetermined. In some iterations, or when used on a real image, the target value may not be reached.
  • An element may be an object appearing on the template image. In the template image, there is a plurality of elements which may be of the same type or of different types. Each pixel in the template image has a known association with an element.
  • the neural network is a neural network already able to perform semantic segmentation, and the above-defined loss function is defined so as to train the network to also perform instance segmentation.
  • the invention advantageously applies on any already available neural network which may have been trained for semantic segmentation, without requiring modifications of the architecture of the neural network.
  • the inventors of the present invention have observed that using a neural network which has already been trained for semantic segmentation allows obtaining better results for semantic instance segmentation.
  • the loss function of the invention is defined so as to take all the vectors into account (the distance between each vector and the corresponding center is calculated, and the distances between all the centers are calculated).
  • the center of the vectors of an element may be determined as the mean vector of all the vectors belonging to a same element of the template image.
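For illustration only (the variable names below are ours, not the patent's), the center of an element can be computed as the per-element mean of the embedding vectors, as described above:

```python
import numpy as np

def element_centers(embeddings, labels):
    """Mean embedding vector (center) for each element label.

    embeddings: (N, D) array, one D-dimensional vector per pixel.
    labels:     (N,) array of element ids (e.g. 1 = car 1, 2 = car 2).
    Returns a dict mapping element id -> (D,) center vector.
    """
    return {c: embeddings[labels == c].mean(axis=0)
            for c in np.unique(labels)}

# Two elements in a 2-D embedding space.
emb = np.array([[0.0, 0.0], [0.0, 2.0],   # element 1
                [4.0, 4.0], [4.0, 6.0]])  # element 2
lab = np.array([1, 1, 2, 2])
centers = element_centers(emb, lab)
print(centers[1])  # -> [0. 1.]
print(centers[2])  # -> [4. 5.]
```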
  • Because the loss function treats all the vectors in a computationally efficient manner, it is possible to obtain a loss which reaches the target value quickly and which actually means that all the vectors meet the expected requirements. A fast convergence of the training is thus obtained.
  • According to an embodiment, the loss function decreases until reaching the target value at least when, for each vector belonging to an element, the distance between the vector and a center of the vectors of this element decreases until this distance is inferior or equal to a first predefined distance threshold.
  • According to an embodiment, the loss function decreases until reaching the target value at least when the distances between all the centers of the vectors of each element increase until each of these distances is superior or equal to a second predefined distance threshold.
  • According to an embodiment, the loss function is: L = α · Lvar + β · Ldist, with Lvar = (1/C) Σc=1..C (1/Nc) Σi=1..Nc max(0, ‖μc − xi‖ − δv)² and Ldist = (1/(C(C−1))) ΣcA ΣcB≠cA max(0, 2δd − ‖μcA − μcB‖)², where C is the number of elements in the template image, Nc the number of vectors in element c, xi a vector of element c, μc the center of the vectors of element c, and δv and δd the first and second predefined distance thresholds.
  • This loss function can be computed efficiently.
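A NumPy sketch of a loss with these two hinged terms follows; it is an illustration based on the definitions above (the constants delta_v, delta_d, alpha, beta are our placeholders for the thresholds and weighting constants), not the exact claimed formulation:

```python
import numpy as np

def discriminative_loss(embeddings, labels, delta_v=0.5, delta_d=1.5,
                        alpha=1.0, beta=1.0):
    """Lvar pulls every vector within delta_v of its element's center;
    Ldist pushes centers of different elements at least 2*delta_d apart.
    The loss is zero once both hinge conditions hold everywhere."""
    ids = list(np.unique(labels))
    centers = [embeddings[labels == c].mean(axis=0) for c in ids]
    # Variance term: hinged distance of each vector to its own center.
    l_var = 0.0
    for c, mu in zip(ids, centers):
        d = np.linalg.norm(embeddings[labels == c] - mu, axis=1)
        l_var += np.mean(np.maximum(d - delta_v, 0.0) ** 2)
    l_var /= len(ids)
    # Distance term: hinged gap between every pair of distinct centers.
    l_dist, pairs = 0.0, 0
    for i in range(len(ids)):
        for j in range(len(ids)):
            if i != j:
                gap = np.linalg.norm(centers[i] - centers[j])
                l_dist += max(2.0 * delta_d - gap, 0.0) ** 2
                pairs += 1
    if pairs:
        l_dist /= pairs
    return alpha * l_var + beta * l_dist

# Two tight clusters with far-apart centers: both conditions met, loss 0.
emb = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
lab = np.array([1, 1, 2, 2])
print(discriminative_loss(emb, lab))  # -> 0.0
```

Every vector contributes to Lvar and every pair of centers to Ldist, which matches the statement that all the vectors are taken into account.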
  • According to an embodiment, the loss function is further defined so as to decrease until reaching the target value at least when the distance between each center of the vectors of each element and the origin of the space of the vectors decreases.
  • In this case the loss function comprises an additional term and is: L = α · Lvar + β · Ldist + γ · Lreg.
  • Lreg is a term which pulls the vectors towards the origin of the space of the vectors, for example Lreg = (1/C) Σc ‖μc‖.
  • γ is preferably much less than α or β, as it plays a less preponderant role in the loss function.
  • For example, α and β can each have a value equal to 1 and γ can be 0.001.
  • According to an embodiment, the coordinates of each pixel of the image are also inputted to the neural network.
  • Without this information, elements which have a similar appearance and are arranged in a specific manner may not be recognized as two separate instances or elements.
  • With this information, the neural network receives enough information to differentiate the two elements.
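One way to provide such coordinate information (our illustration; the patent does not mandate a specific mechanism) is to concatenate normalized x and y coordinate maps to the input image as extra channels:

```python
import numpy as np

def add_coordinate_channels(image):
    """Append normalized x and y coordinate maps (in [0, 1]) as channels.

    image: (H, W, C) array. Returns an (H, W, C + 2) array, so that the
    network can tell apart identical-looking elements at different positions."""
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.linspace(0.0, 1.0, h),
                         np.linspace(0.0, 1.0, w), indexing="ij")
    return np.concatenate([image, xs[..., None], ys[..., None]], axis=-1)

img = np.zeros((4, 6, 3))
out = add_coordinate_channels(img)
print(out.shape)        # -> (4, 6, 5)
print(out[0, 0, 3:])    # top-left pixel coordinates -> [0. 0.]
print(out[-1, -1, 3:])  # bottom-right pixel coordinates -> [1. 1.]
```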
  • the invention also provides a method for semantic instance segmentation comprising using the neural network trained using the above defined method.
  • the method further comprises a post-processing step in which the mean-shift algorithm or the k-means algorithm is applied to the vectors outputted by the neural network.
  • Because the vectors are likely to be placed in distinct and separate hyperspheres, the implementation of the mean-shift algorithm or of the k-means algorithm is facilitated. These algorithms facilitate the identification of the pixels belonging to each object.
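Because the trained embeddings fall into compact, well-separated clusters, a simple clustering pass recovers the instances. Below is a minimal k-means sketch written from scratch for illustration (it assumes the number of clusters is known; mean-shift avoids that assumption):

```python
import numpy as np

def kmeans(vectors, k, iters=20, seed=0):
    """Group pixel embeddings into k instance clusters (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    centers = vectors[rng.choice(len(vectors), k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest center.
        d = np.linalg.norm(vectors[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned vectors.
        for j in range(k):
            if np.any(assign == j):
                centers[j] = vectors[assign == j].mean(axis=0)
    return assign

# Two tight, well-separated clusters, as produced by a trained network.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
groups = kmeans(emb, k=2)
print(groups[0] == groups[1], groups[2] == groups[3], groups[0] != groups[2])
# -> True True True: each cluster becomes one instance group
```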
  • the invention also provides a system for training iteratively a neural network to be used for semantic instance segmentation, wherein, for each iteration, the neural network is configured to output a vector for each pixel of a template image,
  • the template image comprises predefined elements each associated with pixels of the template image and the corresponding vectors.
  • The system comprises a module for calculating a loss using a loss function for each iteration, the loss function being defined so as to decrease until reaching a target value at least when: - for each vector belonging to an element, the distance between the vector and a center of the vectors of this element decreases, and - the distances between the centers of the vectors of different elements increase.
  • This system may be configured to perform all the embodiments of the method for training a neural network as defined above.
  • the invention also provides a system for image semantic instance segmentation comprising the neural network trained using the method for training a network as defined above.
  • the steps of the method for training a neural network and/or the steps of the method for semantic instance segmentation are determined by computer program instructions.
  • the invention is also directed to a computer program for executing the steps of a method as described above when this program is executed by a computer.
  • This program can use any programming language and take the form of source code, object code or a code intermediate between source code and object code, such as a partially compiled form, or any other desirable form.
  • the invention is also directed to a computer-readable information medium containing instructions of a computer program as described above.
  • the information medium can be any entity or device capable of storing the program.
  • the medium can include storage means such as a ROM, for example a CD ROM or a microelectronic circuit ROM, or magnetic storage means, for example a diskette (floppy disk) or a hard disk.
  • the information medium can be an integrated circuit in which the program is incorporated, the circuit being adapted to execute the method in question or to be used in its execution.
  • FIG. 1 is a block diagram of an exemplary method for training a neural network
  • FIG. 2 is a block diagram of an exemplary semantic instance segmentation method
  • FIG. 3 is a schematic diagram of a system for training a neural network and a system for semantic instance segmentation
  • FIG. 4 is a representation of the vectors outputted by a neural network
  • FIG. 5 illustrates the training of a neural network
  • FIG. 6 illustrates the effect of inputting the coordinates of pixels to the neural network.
  • This training is performed using a template image 1 comprising various elements, for example elements of the same type which may or may not overlap (for example two overlapping cars).
  • each element in the template image is previously known, and each pixel of this template image has a previously known association with an element (for example car number 1, car number 2, background, etc.).
  • the neural network to be trained transforms the template image into a plurality of vectors, each vector corresponding to a pixel of the template image.
  • This plurality of vectors is sometimes called a tensor by the person skilled in the art; this tensor has the same height and width as the template image, and a depth equal to the length of the vectors.
  • the length of the vectors can be chosen depending on the neural network to be trained, or depending on the application. All the vectors have the same length and they all belong to the same vector space.
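In practice, the network's (height, width, depth) output tensor is flattened into one D-dimensional vector per pixel before the loss is computed. A small sketch with hypothetical dimensions:

```python
import numpy as np

# Hypothetical network output: height 4, width 6, vector length D = 8.
h, w, d = 4, 6, 8
tensor = np.arange(h * w * d, dtype=float).reshape(h, w, d)

# One vector per pixel: as many vectors as pixels, each of length D.
vectors = tensor.reshape(-1, d)
print(vectors.shape)  # -> (24, 8)
# The vector of pixel (row 1, col 2) is row w + 2 of the flat array.
print(np.array_equal(vectors[w + 2], tensor[1, 2]))  # -> True
```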
  • vectors outputted by a neural network are sometimes called pixel embedding by the person skilled in the art.
  • the neural network is initially, before the training, a neural network already able to perform semantic segmentation.
  • the skilled person will know which neural network can already perform semantic segmentation.
  • For example, the neural network may be a neural network known to the skilled person under the name "SegNet" and described in the document "SegNet: A deep convolutional encoder-decoder architecture for image segmentation" (V. Badrinarayanan et al., arXiv preprint arXiv:1511.00561, 2015), or a neural network described in the document "Fully convolutional networks for semantic segmentation" (J. Long et al., CVPR, 2015).
  • the neural network outputs vectors referenced 2 on figure 1.
  • In a following step E02, the loss is calculated using a loss function which delivers a scalar value, positive or zero, often called a loss.
  • For example, the loss function L can be a linear combination of two terms: L = α · Lvar + β · Ldist, wherein:
  • α is a predefined constant, preferably positive, for example equal to 1,
  • β is a predefined constant, preferably positive, for example equal to 1,
  • Lvar is a term which decreases until reaching zero at least when, for each vector belonging to an element, the distance between the vector and a center of the vectors of this element decreases,
  • Ldist is a term which decreases until reaching zero at least when the distances between all the centers of the vectors of each element increase.
  • α and β may be chosen through a grid search or a hyperparameter search with an evaluation performed on a validation set. This can be performed by trying different settings in a structured way so as to choose the best values for α and β.
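A toy grid-search sketch over α and β; the validation score here is a stand-in function of our own (in practice it would be a segmentation metric computed after training on a held-out validation set):

```python
import itertools

def validation_score(alpha, beta):
    """Stand-in for: train with (alpha, beta), evaluate on a validation
    set, return a score where lower is better. Purely illustrative."""
    return (alpha - 1.0) ** 2 + (beta - 1.0) ** 2

# Structured search: try every combination of candidate values.
grid = [0.1, 0.5, 1.0, 2.0]
best = min(itertools.product(grid, grid),
           key=lambda ab: validation_score(*ab))
print(best)  # -> (1.0, 1.0): the pair minimizing the stand-in score
```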
  • the loss function can decrease until reaching zero at least when for each vector belonging to an element, the distance between the vector and a center of the vectors of this element decreases until this distance is inferior or equal to a first predefined distance threshold.
  • For example, Lvar = (1/C) Σc=1..C (1/Nc) Σi=1..Nc max(0, ‖μc − xi‖ − δv)², with C the number of elements, Nc the number of vectors in element c, xi a vector of element c, μc the center of the vectors of element c, and δv the first predefined distance threshold.
  • The distance can be the L1 or L2 distance well known to the person skilled in the art.
  • Similarly, the loss function can decrease until reaching zero at least when the distances between all the centers of the vectors of each element increase until each of these distances is superior or equal to a second predefined distance threshold, for example with Ldist = (1/(C(C−1))) ΣcA ΣcB≠cA max(0, 2δd − ‖μcA − μcB‖)², δd being the second predefined distance threshold.
  • Ldist and Lvar are defined so as to ensure that, when the loss is equal to zero, all the vectors associated with an object are located inside a hypersphere having a radius equal to δv, and the centers of all the hyperspheres are separated by at least 2δd.
  • Preferably, δd is superior to 2δv.
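The condition δd > 2δv makes the zero-loss configuration unambiguous: by the triangle inequality, a vector within δv of its own center is then strictly closer to that center than to any other center. A quick numeric check of this margin (threshold values are examples of ours):

```python
# delta_d > 2 * delta_v guarantees non-overlapping assignment regions.
delta_v, delta_d = 0.5, 1.5
assert delta_d > 2 * delta_v

# Worst case at zero loss: a vector sits exactly delta_v from its own
# center, and two centers are exactly 2 * delta_d apart.
dist_to_own_center = delta_v
dist_to_other_center = 2 * delta_d - delta_v  # triangle inequality bound
print(dist_to_own_center, dist_to_other_center)  # -> 0.5 2.5
```

The vector is thus at least five times closer to its own center than to any other, which is what makes the later clustering step robust.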
  • The loss function can be further defined so as to decrease until reaching zero at least when the distance between each center of the vectors of each element and the origin of the space of the vectors decreases.
  • In this case the loss function comprises an additional term Lreg and is: L = α · Lvar + β · Ldist + γ · Lreg, with Lreg = (1/C) Σc ‖μc‖.
  • γ is preferably much less than α or β, as it plays a less preponderant role in the loss function.
  • For example, α and β can each have a value equal to 1 and γ can be 0.001.
  • It is then possible to evaluate the loss calculated in step E02. If this loss is equal to zero, it is considered that the neural network is trained. Alternatively, it can be considered that the neural network is trained when the loss is below a predefined threshold.
  • Step E03 is then performed; in this step, the parameters or weights of the neural network are adjusted using the loss calculated in step E02.
  • Step E03 can be performed, for example, using the method known to the skilled person as stochastic gradient descent.
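To illustrate the adjustment step, here is a toy gradient-descent loop that treats a few 2-D embeddings as directly trainable parameters and descends the hinged variance term described earlier (the real step E03 would instead backpropagate through the network's weights; threshold and learning-rate values are ours):

```python
import numpy as np

def lvar(emb, labels, delta_v):
    """Hinged variance term: penalizes vectors farther than delta_v
    from their element's center."""
    total = 0.0
    for c in np.unique(labels):
        mu = emb[labels == c].mean(axis=0)
        d = np.linalg.norm(emb[labels == c] - mu, axis=1)
        total += np.mean(np.maximum(d - delta_v, 0.0) ** 2)
    return total / len(np.unique(labels))

emb = np.array([[0.0, 0.0], [3.0, 0.0], [10.0, 10.0], [13.0, 10.0]])
labels = np.array([1, 1, 2, 2])
delta_v, lr = 0.5, 0.2

loss0 = lvar(emb, labels, delta_v)
for _ in range(50):
    for c in np.unique(labels):
        mask = labels == c
        mu = emb[mask].mean(axis=0)
        diff = emb[mask] - mu
        d = np.linalg.norm(diff, axis=1, keepdims=True)
        hinge = np.maximum(d - delta_v, 0.0)
        # Gradient of the hinged squared distance w.r.t. each vector
        # (the center is treated as a constant for simplicity).
        grad = np.where(d > 0, 2.0 * hinge * diff / np.maximum(d, 1e-9), 0.0)
        emb[mask] -= lr * grad  # gradient-descent update
loss1 = lvar(emb, labels, delta_v)
print(loss1 < loss0)  # -> True: vectors were pulled toward their centers
```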
  • The method then comprises step E04, which consists in performing at least steps E01 and E02 again with the adjusted neural network.
  • training the neural network can be performed on a plurality of different template images. Once the neural network has been trained using the method disclosed on figure 1, it can be used for semantic instance segmentation, as represented on figure 2.
  • the method of figure 2 is performed on an image referenced 3, for example an image which has been acquired by a camera.
  • This image can comprise, for example, two partially overlapping cars and two partially overlapping pedestrians.
  • In step E11, the image 3 is inputted to the trained neural network so as to perform semantic instance segmentation.
  • Vectors 4 are obtained as the output of the trained neural network.
  • a post-processing step E12 is carried out.
  • Because the neural network has been trained using the above-defined loss function, the vectors are close to being in separate hyperspheres. It should be noted that in most cases, when used on a real image (and not a template image), the loss is typically slightly above zero.
  • To that end, a sub-step E120 is performed in which the k-means or mean-shift algorithm is used on the vectors so as to group into clusters the pixels which should belong to the same object. This increases the robustness of the method.
  • a final image 5 with semantic instance segmentation is outputted.
  • The system S1, which can be a computer, comprises a processor PR1 and a non-volatile memory MEM1.
  • a set of instructions INST1 is stored in the non-volatile memory MEM1.
  • the set of instructions INST1 comprises instructions to perform a method for training a neural network to perform semantic instance segmentation, for example the method described in reference to figure 1.
  • the non-volatile memory MEM1 further comprises a neural network NN and at least one template image TIMG.
  • the neural network NN can be used in a separate system S2 configured to perform semantic instance segmentation.
  • the neural network NN can be communicated to the system S2 using a communication network INT, for example the Internet.
  • The system S2 comprises a processor PR2 and a non-volatile memory MEM2 in which a set of instructions INST2 is stored to perform semantic instance segmentation using an image IMG stored in the non-volatile memory MEM2 and the trained neural network TNN also stored in the non-volatile memory MEM2.
  • Figure 4 is a schematic representation of the vectors outputted by a neural network.
  • In this example, the neural network which has been used is not fully trained and the loss is not equal to zero or below a predefined threshold. Also, in this example and for the sake of simplicity, the neural network outputs vectors having a length of 2, which allows using a two-dimensional representation.
  • the various vectors outputted by the neural network are represented as dots 10, 20, and 30 each associated with a pixel of the template image inputted to the neural network.
  • Each pixel of the template image has a known association with an element visible on the template image.
  • the same is also true for the vectors outputted by the neural network.
  • the vectors referenced 10 are all associated with a first object
  • the vectors referenced 20 are all associated with a second object
  • The vectors referenced 30 are all associated with a third object. Even if the training of the neural network is still being performed, the vectors 10, 20 and 30 already substantially form clusters of vectors respectively referenced C1, C2, and C3. It is then possible to determine the centers 11, 12, and 13, respectively of cluster C1, cluster C2, and cluster C3.
  • The loss function is defined so that the vectors 10 get closer (after each iteration of the training) to the center 11, and more precisely so that the vectors 10 are within a distance from the center which is less than a first predefined distance threshold δv.
  • Thus the vectors 10 are expected to all be inside the circle having a radius δv and a center 11 represented on the figure;
  • the vectors 20 are expected to all be inside the circle having a radius δv and a center 12 represented on the figure;
  • the vectors 30 are expected to all be inside the circle having a radius δv and a center 13 represented on the figure.
  • The loss function is further defined so that the centers 11, 12 and 13 get further away from each other (after each iteration of the training), and more precisely so that the centers are each separated by a second predefined distance threshold equal to 2δd.
  • The circles of radius δd and centers 11, 12 and 13 are also represented on the figure.
  • the movements of the vectors carried out to move the clusters away from each other are represented using thick arrows on the figure.
  • Figure 5 illustrates the training of a neural network through various representations.
  • A template image 100 is represented on the figure; this image 100 is a photograph of a plant having a variety of leaves and a background to be segmented.
  • Each pixel in the template image 100 has a known association with a specific leaf and it is possible to represent the image 100 as the final segmented image 200 shown below the template image 100.
  • The skilled person may refer to the segmented image 200 as the "ground truth".
  • the row referenced 300 on figure 5 represents the positions of the vectors outputted by the neural network (in this example, the output of the network is in two dimensions) at seven different stages of the training of the neural network in consecutive order from left to right. This training is done by using the stochastic gradient descent method to adjust the neural network after each training iteration.
  • the seven different stages represented in the row 300 correspond respectively to 0, 2, 4, 8, 16, 32, and 64 adjustments to the neural network using the stochastic gradient descent method.
  • the row referenced 400 represents the output of the neural network without a post-processing step.
  • The images of this row are obtained by taking the output of the neural network, which delivers vectors having two dimensions, and using each component of each vector respectively as a red value and as a green value, the blue value being set to zero (the figure is in greyscale).
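That visualization can be reproduced directly: map the two embedding components to the red and green channels and leave blue at zero. A small sketch of ours, assuming the embeddings are already scaled to [0, 1]:

```python
import numpy as np

def embeddings_to_rgb(emb_map):
    """emb_map: (H, W, 2) embeddings in [0, 1] -> (H, W, 3) RGB image."""
    h, w, _ = emb_map.shape
    rgb = np.zeros((h, w, 3))
    rgb[..., 0] = emb_map[..., 0]  # first component -> red channel
    rgb[..., 1] = emb_map[..., 1]  # second component -> green channel
    return rgb                     # blue channel stays zero

emb_map = np.random.default_rng(0).random((4, 4, 2))
img = embeddings_to_rgb(emb_map)
print(img.shape)                 # -> (4, 4, 3)
print(float(img[..., 2].max()))  # -> 0.0: blue is zero everywhere
```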
  • The row referenced 500 represents the result of a post-processing step in which a thresholding is performed with a radius equal to δd.
  • Figure 6 illustrates the effect of inputting the coordinates of pixels to the neural network.
  • The wording "location awareness" refers to the inputting of the coordinates of each pixel to the neural network.
  • the output of the neural network (vectors and corresponding image) is then shown for the two cases which are with and without location awareness.
  • Without location awareness, the neural network has difficulty differentiating the two squares when they are respectively close to the upper left corner and the lower right corner.
  • With location awareness, the neural network is always able to differentiate the two squares.
  • the above described embodiments allow obtaining a neural network which can be used for semantic instance segmentation with good results.

PCT/EP2017/068550 2017-07-21 2017-07-21 METHOD AND SYSTEM FOR LEARNING A NEURAL NETWORK TO BE USED FOR SEMANTIC INSTANCE SEGMENTATION Ceased WO2019015785A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2017/068550 WO2019015785A1 (en) 2017-07-21 2017-07-21 METHOD AND SYSTEM FOR LEARNING A NEURAL NETWORK TO BE USED FOR SEMANTIC INSTANCE SEGMENTATION
JP2020502990A JP6989688B2 (ja) 2017-07-21 2017-07-21 セマンティック・インスタンス・セグメンテーションに使用されるニューラルネットワークを訓練するための方法およびシステム

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2017/068550 WO2019015785A1 (en) 2017-07-21 2017-07-21 METHOD AND SYSTEM FOR LEARNING A NEURAL NETWORK TO BE USED FOR SEMANTIC INSTANCE SEGMENTATION

Publications (1)

Publication Number Publication Date
WO2019015785A1 true WO2019015785A1 (en) 2019-01-24

Family

ID=59581854

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/068550 Ceased WO2019015785A1 (en) 2017-07-21 2017-07-21 METHOD AND SYSTEM FOR LEARNING A NEURAL NETWORK TO BE USED FOR SEMANTIC INSTANCE SEGMENTATION

Country Status (2)

Country Link
JP (1) JP6989688B2 (en)
WO (1) WO2019015785A1 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110751659A (zh) * 2019-09-27 2020-02-04 北京小米移动软件有限公司 图像分割方法及装置、终端、存储介质
CN110765916A (zh) * 2019-10-17 2020-02-07 北京中科原动力科技有限公司 一种基于语义和实例分割的农田苗垄识别方法及系统
CN110766281A (zh) * 2019-09-20 2020-02-07 国网宁夏电力有限公司电力科学研究院 一种基于深度学习的输电导线风害预警方法及终端
CN111028195A (zh) * 2019-10-24 2020-04-17 西安电子科技大学 一种基于实例分割的重定向图像质量信息处理方法及系统
CN111210452A (zh) * 2019-12-30 2020-05-29 西南交通大学 一种基于图割和均值偏移的证件照人像分割方法
CN111507343A (zh) * 2019-01-30 2020-08-07 广州市百果园信息技术有限公司 语义分割网络的训练及其图像处理方法、装置
CN111709293A (zh) * 2020-05-18 2020-09-25 杭州电子科技大学 一种基于ResUNet神经网络的化学结构式分割方法
CN111967373A (zh) * 2020-08-14 2020-11-20 东南大学 一种基于摄像头和激光雷达的自适应强化融合实时实例分割方法
US20210080590A1 (en) * 2018-08-03 2021-03-18 GM Global Technology Operations LLC Conflict resolver for a lidar data segmentation system of an autonomous vehicle
KR20210074353A (ko) * 2019-02-25 2021-06-21 텐센트 테크놀로지(센젠) 컴퍼니 리미티드 포인트 클라우드 세그먼트화 방법, 컴퓨터로 판독 가능한 저장 매체 및 컴퓨터 기기
CN113673505A (zh) * 2021-06-29 2021-11-19 北京旷视科技有限公司 实例分割模型的训练方法、装置、系统及存储介质
CN114529191A (zh) * 2022-02-16 2022-05-24 支付宝(杭州)信息技术有限公司 用于风险识别的方法和装置
JP2022540582A (ja) * 2019-07-18 2022-09-16 シスピア 3d空間内の物体を自動的に検出、位置特定及び識別するための方法及びシステム
US11562171B2 (en) 2018-12-21 2023-01-24 Osaro Instance segmentation by instance label factorization
US12423812B2 (en) 2020-09-29 2025-09-23 Fujifilm Corporation Information processing apparatus, information processing method, and non-transitory computer-readable storage medium for training an estimation model such that a loss is reduced

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560496B (zh) * 2020-12-09 2024-02-02 北京百度网讯科技有限公司 语义分析模型的训练方法、装置、电子设备及存储介质
JP7718684B2 (ja) * 2021-08-26 2025-08-05 国立大学法人 筑波大学 区分け画像推定装置、区分け画像推定方法、及びプログラム

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080292194A1 (en) * 2005-04-27 2008-11-27 Mark Schmidt Method and System for Automatic Detection and Segmentation of Tumors and Associated Edema (Swelling) in Magnetic Resonance (Mri) Images
CN106897390A (zh) * 2017-01-24 2017-06-27 北京大学 基于深度度量学习的目标精确检索方法
US9704257B1 (en) * 2016-03-25 2017-07-11 Mitsubishi Electric Research Laboratories, Inc. System and method for semantic segmentation using Gaussian random field network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3492991B2 (ja) * 2000-09-13 2004-02-03 株式会社東芝 画像処理装置及び画像処理方法並びに記録媒体
JP4799105B2 (ja) * 2005-09-26 2011-10-26 キヤノン株式会社 情報処理装置及びその制御方法、コンピュータプログラム、記憶媒体
US10043112B2 (en) * 2014-03-07 2018-08-07 Qualcomm Incorporated Photo management


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BERT DE BRABANDERE ET AL: "Semantic Instance Segmentation for Autonomous Driving", HTTP://JUXI.NET/WORKSHOP/DEEP-LEARNING-ROBOTIC-VISION-CVPR-2017/PAPERS/, 19 May 2017 (2017-05-19), XP055462033, Retrieved from the Internet <URL:http://juxi.net/workshop/deep-learning-robotic-vision-cvpr-2017/papers/16.pdf> [retrieved on 20180322] *
FATHI ET AL., SEMANTIC INSTANCE SEGMENTATION VIA DEEP METRIC LEARNING, Retrieved from the Internet <URL:https://arxiv.org/pdf/1703.10277.pdf>
H. SCHARR ET AL.: "Leaf segmentation in plant phenotyping: a collation study", MACHINE VISION AND APPLICATIONS, vol. 27, no. 4, pages 585 - 606, XP035857192, DOI: 10.1007/s00138-015-0737-3
J. LONG ET AL.: "Fully convolutional networks for semantic segmentation", CVPR, 2015
KILIAN Q WEINBERGER ET AL: "Distance Metric Learning for Large Margin Nearest Neighbor Classification", JOURNAL OF MACHINE LEARNING RESEARCH, MIT PRESS, CAMBRIDGE, MA, US, vol. 10, 2 June 2009 (2009-06-02), pages 207 - 244, XP058264216, ISSN: 1532-4435 *
V. BADRINARAYANAN ET AL., A DEEP CONVOLUTIONAL ENCODER-DECODER ARCHITECTURE FOR IMAGE SEGMENTATION, vol. 2, 2015

Cited By (24)

Publication number Priority date Publication date Assignee Title
US20210080590A1 (en) * 2018-08-03 2021-03-18 GM Global Technology Operations LLC Conflict resolver for a lidar data segmentation system of an autonomous vehicle
US11915427B2 (en) * 2018-08-03 2024-02-27 GM Global Technology Operations LLC Conflict resolver for a lidar data segmentation system of an autonomous vehicle
US11562171B2 (en) 2018-12-21 2023-01-24 Osaro Instance segmentation by instance label factorization
CN111507343A (zh) * 2019-01-30 2020-08-07 广州市百果园信息技术有限公司 Training of a semantic segmentation network, and image processing method and apparatus therefor
CN111507343B (zh) * 2019-01-30 2021-05-18 广州市百果园信息技术有限公司 Training of a semantic segmentation network, and image processing method and apparatus therefor
KR102510745B1 (ko) 2019-02-25 2023-03-15 텐센트 테크놀로지(센젠) 컴퍼니 리미티드 Point cloud segmentation method, computer-readable storage medium, and computer device
KR20210074353A (ko) * 2019-02-25 2021-06-21 텐센트 테크놀로지(센젠) 컴퍼니 리미티드 Point cloud segmentation method, computer-readable storage medium, and computer device
JP2022540582A (ja) * 2019-07-18 2022-09-16 シスピア Method and system for automatically detecting, locating and identifying objects in 3D space
CN110766281A (zh) * 2019-09-20 2020-02-07 国网宁夏电力有限公司电力科学研究院 Deep-learning-based wind damage early-warning method and terminal for power transmission lines
CN110766281B (zh) * 2019-09-20 2022-04-26 国网宁夏电力有限公司电力科学研究院 Deep-learning-based wind damage early-warning method and terminal for power transmission lines
CN110751659A (zh) * 2019-09-27 2020-02-04 北京小米移动软件有限公司 Image segmentation method and apparatus, terminal, and storage medium
CN110751659B (zh) * 2019-09-27 2022-06-10 北京小米移动软件有限公司 Image segmentation method and apparatus, terminal, and storage medium
CN110765916A (zh) * 2019-10-17 2020-02-07 北京中科原动力科技有限公司 Farmland seedling-ridge identification method and system based on semantic and instance segmentation
CN110765916B (zh) * 2019-10-17 2022-08-30 北京中科原动力科技有限公司 Farmland seedling-ridge identification method and system based on semantic and instance segmentation
CN111028195A (zh) * 2019-10-24 2020-04-17 西安电子科技大学 Retargeted image quality information processing method and system based on instance segmentation
CN111210452A (zh) * 2019-12-30 2020-05-29 西南交通大学 ID-photo portrait segmentation method based on graph cuts and mean shift
CN111210452B (zh) * 2019-12-30 2023-04-07 西南交通大学 ID-photo portrait segmentation method based on graph cuts and mean shift
CN111709293A (zh) * 2020-05-18 2020-09-25 杭州电子科技大学 Chemical structural formula segmentation method based on a ResUNet neural network
CN111709293B (zh) * 2020-05-18 2023-10-03 杭州电子科技大学 Chemical structural formula segmentation method based on a ResUNet neural network
CN111967373B (zh) * 2020-08-14 2021-03-30 东南大学 Adaptive enhanced-fusion real-time instance segmentation method based on a camera and lidar
CN111967373A (zh) * 2020-08-14 2020-11-20 东南大学 Adaptive enhanced-fusion real-time instance segmentation method based on a camera and lidar
US12423812B2 (en) 2020-09-29 2025-09-23 Fujifilm Corporation Information processing apparatus, information processing method, and non-transitory computer-readable storage medium for training an estimation model such that a loss is reduced
CN113673505A (zh) * 2021-06-29 2021-11-19 北京旷视科技有限公司 Training method, apparatus and system for an instance segmentation model, and storage medium
CN114529191A (zh) * 2022-02-16 2022-05-24 支付宝(杭州)信息技术有限公司 Method and apparatus for risk identification

Also Published As

Publication number Publication date
JP6989688B2 (ja) 2022-01-05
JP2020527812A (ja) 2020-09-10

Similar Documents

Publication Publication Date Title
WO2019015785A1 (en) METHOD AND SYSTEM FOR LEARNING A NEURAL NETWORK TO BE USED FOR SEMANTIC INSTANCE SEGMENTATION
EP3333768A1 (en) Method and apparatus for detecting target
CN117542067B (zh) Region-labeling form recognition method based on visual recognition
CN109740606B (zh) Image recognition method and apparatus
CN108197644A (zh) Image recognition method and apparatus
CN111738045B (zh) Image detection method and apparatus, electronic device, and storage medium
KR102166117B1 (ko) Semantic matching apparatus and method
US9224207B2 (en) Segmentation co-clustering
US11417129B2 (en) Object identification image device, method, and computer program product
CN108090451B (zh) Face recognition method and system
CN106408037A (zh) Image recognition method and apparatus
CN114548218B (zh) Image matching method and apparatus, storage medium, and electronic apparatus
Zhao et al. Deep Adaptive Log-Demons: Diffeomorphic Image Registration with Very Large Deformations
CN111160142A (zh) Certificate and bill localization detection method based on a numerical prediction regression model
JP6107531B2 (ja) Feature extraction program and information processing apparatus
CN112204957A (zh) White balance processing method and device, movable platform, and camera
CN104765440B (zh) Hand detection method and device
Perwej et al. The Kingdom of Saudi Arabia Vehicle License Plate Recognition using Learning Vector Quantization Artificial Neural Network
CN114170465A (zh) Attention-mechanism-based 3D point cloud classification method, terminal device, and storage medium
Suzuki et al. Superpixel convolution for segmentation
Omarov et al. Machine learning based pattern recognition and classification framework development
RU2672622C1 (ru) Способ распознавания графических образов объектов
Patil et al. Deep learning-based approach for indian license plate recognition using optical character recognition
CN105868789B (zh) Object discovery method based on an image-region cohesion measure
Mitra et al. Machine learning approach for signature recognition by HARRIS and SURF features detector

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17751279

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020502990

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17751279

Country of ref document: EP

Kind code of ref document: A1