JP2023004894A - Image processing device, image processing method, and machine-readable storage medium

Info

Publication number: JP2023004894A
Application number: JP2022084540A
Authority: JP
Inventors: 威劉; Wei Liu; ジャン・イン; Ying Zhang; 俊孫; Shun Son
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2021-06-25
Filing date: 2022-05-24
Publication date: 2023-01-17
Also published as: CN115527026A

Abstract

To provide an image processing device, an image processing method, and a machine-readable storage medium. SOLUTION: A device includes: a first acquisition unit that acquires, by a first segmentation model, a pseudo-label for a base class object in an image; a second acquisition unit that acquires a new label for a new class object in the image; a processing unit that obtains, based on the pseudo-label and the new label, a second segmentation model for the base class and new class objects; a first calculation unit that calculates, by the first segmentation model, a first data set of a first parameter associated with the base class object in the image; a second calculation unit that calculates, by the second segmentation model, a second data set of the first parameter associated with the base class object in the image; and a comparison unit that compares the first data set with the second data set and provides comparison information to the second calculation unit so as to modify the second segmentation model. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to the technical field of image processing, and more particularly to an image processing apparatus, an image processing method, and a machine-readable storage medium for incremental learning.

This section provides background information related to the present disclosure, which is not necessarily prior art.

Humans have the capacity to acquire, adjust, and transfer knowledge throughout their lives, and learning new knowledge does not catastrophically disrupt what they have already learned. This learning ability is referred to as incremental learning. In the field of machine learning, incremental learning aims to address catastrophic forgetting, a common weakness in model training. In other words, when a typical machine learning model (especially a backpropagation-based deep learning method) is trained on a new task, its performance on old tasks often degrades significantly. To overcome catastrophic forgetting, the model is expected to show both the ability to integrate new knowledge and to extract existing knowledge from new data (plasticity) and the ability to avoid significant interference of new input with existing knowledge (stability). These two conflicting requirements constitute the so-called "stability-plasticity dilemma".

The simplest way to resolve catastrophic forgetting is to retrain the network parameters on all known data so as to adapt to changes in the data distribution over time. Training the model from scratch fully resolves catastrophic forgetting, but it is very inefficient and greatly hinders the model from learning new data in real time. The main goal of incremental learning is to find the most effective balance point within the stability-plasticity dilemma under limited computing and storage resources.

At present, incremental learning methods can be broadly divided into three types: rule-based learning, rehearsal-based learning, and bias-correction-based learning. In the fields of semantic segmentation and instance segmentation, few incremental learning methods have been reported.

This section provides a general overview of the present disclosure and is not a comprehensive disclosure of its full scope or of all of its features.

An object of the present disclosure is to provide an image processing apparatus, an image processing method, and a machine-readable storage medium for incremental learning of object segmentation. The present disclosure addresses the incremental learning problem of object segmentation from a new perspective.

One aspect of the present disclosure provides an image processing apparatus including: a first acquisition unit that acquires, by a first segmentation model, pseudo-labels associated with object segmentation of base class objects in an image, the first segmentation model being used for segmentation of base class objects; a second acquisition unit that acquires new labels associated with object segmentation of objects of a new class, different from the base class, in the image; and a processing unit that obtains, based on the pseudo-labels and the new labels, a second segmentation model for segmentation of objects of the base class and the new class. The first segmentation model and the second segmentation model are implemented by neural networks. The image processing apparatus further includes: a first calculation unit that calculates, by the first segmentation model, a first data set of a first parameter associated with the base class objects in the image; a second calculation unit that calculates, by the second segmentation model, a second data set of the first parameter associated with the base class objects in the image; and a comparison unit that compares the first data set with the second data set and provides comparison information to the second calculation unit so as to modify the second segmentation model.

Another aspect of the present disclosure provides an image processing method including: acquiring, by a first segmentation model, pseudo-labels associated with object segmentation of base class objects in an image, the first segmentation model being used for segmentation of base class objects; acquiring new labels associated with object segmentation of objects of a new class, different from the base class, in the image; and obtaining, based on the pseudo-labels and the new labels, a second segmentation model for segmentation of objects of the base class and the new class. The first segmentation model and the second segmentation model are implemented by neural networks. The image processing method further includes: calculating, by the first segmentation model, a first data set of a first parameter associated with the base class objects in the image; calculating, by the second segmentation model, a second data set of the first parameter associated with the base class objects in the image; and comparing the first data set with the second data set and modifying the second segmentation model.

Another aspect of the present disclosure provides a machine-readable storage medium on which a program product storing machine-readable instruction code is recorded, wherein the instruction code, when read and executed by a computer, can cause the computer to execute the image processing method of the present disclosure.

The image processing apparatus, image processing method, and machine-readable storage medium according to the present disclosure can suitably implement incremental learning of instance segmentation and effectively prevent catastrophic forgetting in the trained model.

The scope of applicability of the present disclosure will become clearer from the description provided herein. The descriptions and specific examples in this section are for illustrative purposes only and are not intended to limit the scope of the present disclosure.

The drawings described herein illustrate preferred embodiments only, not all possible embodiments, and are not intended to limit the scope of the present disclosure.
FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present disclosure.
FIG. 2 is a block diagram showing the basic structure of a first segmentation model according to an embodiment of the present disclosure.
FIG. 3 is a block diagram showing the basic structure of a second segmentation model according to an embodiment of the present disclosure.
FIG. 4 is a block diagram showing the detailed operation of a partial structure of the image processing apparatus according to an embodiment of the present disclosure.
FIG. 5 is a schematic diagram showing calculated classification confidences according to an embodiment of the present disclosure.
FIG. 6 is a schematic diagram showing internal computing details of the image processing apparatus according to an embodiment of the present disclosure.
FIG. 7 is a flowchart showing an image processing method according to an embodiment of the present disclosure.
FIG. 8 is a block diagram showing an exemplary configuration of a general-purpose personal computer capable of implementing the image processing apparatus and method according to embodiments of the present disclosure.
While various modifications and alternatives may be made to the present disclosure, specific embodiments thereof are described in detail with reference to the drawings. The description of specific embodiments is not intended to limit the present disclosure to the specific forms disclosed; various changes, equivalents, and substitutions may be made within the spirit and scope of the present disclosure. In the drawings, the same components are denoted by the same reference numerals.

Exemplary embodiments of the present disclosure are described in detail below with reference to the drawings. The following description is merely exemplary and is not intended to limit the present disclosure, its applications, or its uses.

Exemplary embodiments are provided below so that the present disclosure is described in detail and those skilled in the art can fully appreciate its scope. Numerous specific details, such as examples of specific means, devices, and methods, are set forth to provide a thorough understanding of the embodiments of the present disclosure. However, as those skilled in the art will appreciate, the specific details need not be used, the exemplary embodiments may be implemented in different ways, and these embodiments do not limit the scope of the present disclosure. In some exemplary embodiments, well-known processes, well-known configurations, and well-known techniques are not described in detail.

In the present disclosure, the scenes (image backgrounds) of the base classes and the new classes are assumed to be similar. A well-performing instance segmentation model (hereinafter also referred to as the first segmentation model) has been trained on the original image data, which contains base class objects. When a new class appears in the image data, the new data contains instances of both the original base classes and the new class. The original instance segmentation model (the first segmentation model) can only detect and segment the base classes, whereas the present disclosure aims to train an instance segmentation model (hereinafter also referred to as the second segmentation model) that detects and segments both the base classes and the new classes at minimal cost.

The present disclosure proposes an innovative method that does not require the source data and only requires labeling part of the data in new samples containing the new class. Pseudo-labels of the base classes, obtained from the original instance segmentation model, and new labels of the new class are combined as teacher labels. In addition, to prevent catastrophic forgetting in the trained model, further loss functions are added; for example, two distillation losses are added, one to the detection branch and one to the classification branch. The method works well for incremental learning of instance segmentation. The configuration of an image processing apparatus according to an embodiment of the present disclosure that achieves the above object is described below with reference to FIG. 1.

FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present disclosure. As shown in FIG. 1, the image processing apparatus 100 according to the embodiment of the present disclosure may include a first acquisition unit 110, a second acquisition unit 120, a processing unit 130, a first calculation unit 140, a second calculation unit 150, and a comparison unit 160.

The first acquisition unit 110 may acquire, by the first segmentation model, pseudo-labels associated with object segmentation of base class objects in an image. The first segmentation model is implemented by a neural network and is used for segmentation of base class objects. For example, the first segmentation model may be a YOLACT++ instance segmentation model. Here, a pseudo-label may be understood as the annotation or label information of a base class object.

The second acquisition unit 120 may acquire new labels associated with object segmentation of objects of a new class, different from the base class, in the image. Here, a new label may be understood as the annotation or label information of an object of a new class, different from the base class, in the image. The new labels may be obtained by manual labeling, for example by manually annotating the classification identifier, bounding box, and mask of each new class object in the image. The classification identifier represents the class to which the new class object belongs, the bounding box may represent the position of the object, and the mask indicates the region occupied by the object. Correspondingly, the label information (pseudo-labels) of the base class objects acquired by the first acquisition unit 110 is, for example, the classification confidence, the anchor box offset, and the mask. Here, the anchor box offset represents the position of the bounding box, that is, the position of the object.

Note that, in the present disclosure, an image is an image of objects of a particular category (people, cats, dogs, bottles, etc.), and objects of that category may include objects of different subclasses. Accordingly, "base class" is a collective term for the subclasses of that category that the existing first segmentation model can segment, and a new class object is an object of that category that differs from the base classes. For example, if the existing first segmentation model relates to bottles, the base classes may include plastic bottles, brown glass bottles, and blue or green glass bottles, and the new class may include clear glass bottles.

The processing unit 130 may obtain, based on the pseudo-labels and the new labels, a second segmentation model for segmentation of objects of the base class and the new class. The second segmentation model is implemented by a neural network and is used for instance segmentation of images containing objects of both the base class and the new class. For example, the second segmentation model may be a YOLACT++ instance segmentation model. The operation of the processing unit 130 corresponds to the incremental learning training process of the second segmentation model.

In this way, by using the base class pseudo-labels obtained by the first segmentation model together with the newly annotated new labels of the new class, and combining these labels as teacher labels, a second segmentation model that can segment objects of both the base class and the new class can be obtained.
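The label combination described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `InstanceLabel` record, the `combine_labels` helper, the corner-format box, and the class-index convention (base classes first, new classes after) are all assumptions made for the example. The description itself lists a classification confidence and an anchor box offset for pseudo-labels; a plain box is used here only to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List, Tuple

Mask = List[List[int]]  # binary mask stored as rows of 0/1 (illustrative format)


@dataclass
class InstanceLabel:
    """Hypothetical label record; field names are not taken from the patent."""
    class_id: int                            # index into base classes + new classes
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max), assumed format
    mask: Mask
    source: str                              # "pseudo" (first model) or "manual"


def combine_labels(pseudo: List[InstanceLabel],
                   manual: List[InstanceLabel],
                   num_base_classes: int) -> List[InstanceLabel]:
    """Merge base class pseudo-labels with manually annotated new class labels."""
    for label in pseudo:
        if not 0 <= label.class_id < num_base_classes:
            raise ValueError("a pseudo-label must refer to a base class")
    for label in manual:
        if label.class_id < num_base_classes:
            raise ValueError("a manual label must refer to a new class")
    # The merged list serves as the teacher labels for training the second model.
    return list(pseudo) + list(manual)


base_classes = ["plastic bottle", "brown glass bottle", "blue or green glass bottle"]
new_classes = ["clear glass bottle"]

pseudo_labels = [InstanceLabel(1, (10, 20, 50, 120), [[1, 1], [1, 0]], "pseudo")]
manual_labels = [InstanceLabel(3, (60, 15, 95, 118), [[0, 1], [1, 1]], "manual")]

teacher_labels = combine_labels(pseudo_labels, manual_labels, len(base_classes))
```

Keeping the `source` field makes it possible to treat the two kinds of labels differently later, for instance when deciding which instances take part in a distillation loss.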

The first calculation unit 140 may calculate, by the first segmentation model, a first data set of a first parameter associated with the base class objects in the image. The second calculation unit 150 may calculate, by the second segmentation model obtained by the processing unit 130, a second data set of the first parameter associated with the base class objects in the image. The first parameter associated with the base class objects in the image may be, for example, the anchor box offset.

The comparison unit 160 may compare the first data set with the second data set and provide comparison information to the second calculation unit 150 so as to modify the second segmentation model. The second calculation unit 150 may continuously optimize the second segmentation model based on the comparison information. Preferably, the second calculation unit 150 modifies the second segmentation model so as to minimize the difference between the first data set and the second data set.

In this way, by additionally feeding back comparison information on the parameters associated with the base class objects and optimizing the second segmentation model based on that information, the image processing apparatus 100 according to the embodiment of the present disclosure can suitably implement incremental learning of instance segmentation and effectively prevent catastrophic forgetting in the trained model.

The first segmentation model and the second segmentation model are further described below with reference to FIGS. 2 and 3. Since the functions of the first acquisition unit 110 and the first calculation unit 140 are realized by the first segmentation model, the configuration of the first segmentation model in FIG. 2 also corresponds to the configuration of the first acquisition unit 110 and the first calculation unit 140.

FIG. 2 is a block diagram showing the basic structure of the first segmentation model according to an embodiment of the present disclosure. As shown in FIG. 2, the first segmentation model 200 may include a feature extraction unit 210, a prediction unit 220, a mask template acquisition unit 230, and a post-processing unit 240.

The feature extraction unit 210 may acquire feature information about the input image. For example, the feature information is a feature map. The feature extraction unit 210 may provide the feature information to the prediction unit 220 and the mask template acquisition unit 230.

Based on the feature information, the prediction unit 220 may acquire prediction information about the input image, such as mask coefficients, bounding box or anchor box offsets, and classification confidences. A classification confidence may represent the probability that an object in the input image belongs to a particular class. For example, for an input image containing multiple bottles, it may be the probability that an object belongs to a particular one of several bottle classes.

The mask template acquisition unit 230 may acquire mask templates based on the feature information. A mask template may be a mask basis vector. The original mask of the input image may be calculated from the mask basis vectors and the mask coefficients acquired by the prediction unit 220. For example, the original mask of the input image may be calculated by multiplying the mask basis vectors by the mask coefficients.
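The multiplication of mask basis vectors by mask coefficients can be illustrated with a small sketch. It assumes that each mask template is an H x W map and that an instance mask is the coefficient-weighted sum of the templates passed through a sigmoid, as YOLACT-style models do; the function name `assemble_mask`, the toy values, and the 0.5 binarization threshold are illustrative assumptions rather than details from this document.

```python
import math
from typing import List

Matrix = List[List[float]]


def assemble_mask(prototypes: List[Matrix], coefficients: List[float]) -> Matrix:
    """Combine k mask templates (each H x W) linearly with k mask coefficients."""
    if not prototypes or len(prototypes) != len(coefficients):
        raise ValueError("need one coefficient per mask template")
    height, width = len(prototypes[0]), len(prototypes[0][0])
    combined = [[0.0] * width for _ in range(height)]
    for proto, coeff in zip(prototypes, coefficients):
        for r in range(height):
            for c in range(width):
                combined[r][c] += coeff * proto[r][c]
    # Assumption: squash each value to (0, 1) with a sigmoid.
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in combined]


# Two toy 2 x 2 templates and the coefficients predicted for one instance.
prototypes = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
soft_mask = assemble_mask(prototypes, coefficients=[2.0, -2.0])

# Assumed threshold of 0.5 to turn the soft mask into a binary mask.
binary_mask = [[int(v > 0.5) for v in row] for row in soft_mask]
```

Because the templates are shared by all instances in the image and only the short coefficient vector is predicted per instance, the per-instance cost of producing a mask stays small.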

Based on the prediction information and the mask templates, the post-processing unit 240 may obtain the masks of the base class objects in the image and, at the same time, the pseudo-labels of the base class objects.

FIG. 3 is a block diagram showing the basic structure of the second segmentation model according to an embodiment of the present disclosure. Note that FIG. 3 shows the overall structure of the second segmentation model. As shown in FIG. 3, the second segmentation model 300 may include a feature extraction unit 310, a prediction unit 320, a mask template acquisition unit 330, a post-processing unit 340, and a restriction unit 350. The operations of the feature extraction unit 310, the prediction unit 320, the mask template acquisition unit 330, and the post-processing unit 340 are the same as those of the feature extraction unit 210, the prediction unit 220, the mask template acquisition unit 230, and the post-processing unit 240 in FIG. 2.

Note that the post-processing unit 340 is not needed in the training phase of the second segmentation model. Because the processing unit 130 obtains the second segmentation model for segmentation of objects of the base class and the new class based on the first segmentation model, the configuration of the processing unit 130 corresponds to the configuration of the second segmentation model in FIG. 3 without the post-processing unit 340.

After the second segmentation model has been trained, it need not include the restriction unit 350. In that case, the configuration of the second segmentation model is the same as that of the first segmentation model, and the configuration of the second calculation unit 150 corresponds to this configuration of the second segmentation model.

The basic operation of each part of the second segmentation model 300 is described below. Here, the input of the second segmentation model includes both the pseudo-labels of the base class and the new labels of the new class.

The feature extraction unit 310 may acquire feature information about the input image. The prediction unit 320 may acquire prediction information about the input image based on the feature information, and the mask template acquisition unit 330 may acquire mask templates based on the feature information.

The restriction unit 350 may perform incremental training according to the target constraints of the neural network, based on the feature information and the mask templates, to obtain the second segmentation model.

After the second segmentation model has been trained, the post-processing unit 340 may, based on the prediction information provided by the prediction unit 320 and the mask templates provided by the mask template acquisition unit 330, obtain the masks of the base class objects and the new class objects in the input image and, at the same time, the label information of the corresponding objects.

For a better understanding of the technique of the present disclosure, the operations of the first calculation unit 140, the second calculation unit 150, and the comparison unit 160 in FIG. 1 are described in detail below with reference to FIG. 4.

FIG. 4 is a block diagram showing the detailed operation of a partial structure of the image processing apparatus according to an embodiment of the present disclosure. In the image processing apparatus 400 of FIG. 4, the first acquisition unit, the second acquisition unit, and the processing unit, which were described in detail above with reference to FIGS. 1 to 3, are omitted. The image processing apparatus 400 further includes a first calculation unit 440, a second calculation unit 450, and a comparison unit 460.

(1) Example of adding a distillation loss to the detection branch of the segmentation model
The operations of the first calculation unit 440, the second calculation unit 450, and the comparison unit 460 are the same as those described for the first calculation unit 140, the second calculation unit 150, and the comparison unit 160 in FIG. 1. Here, the first parameter may be the anchor box offset, that is, the position of the bounding box. The data set of the first parameter consists of the calculated anchor box offsets. In this example, an MSE (mean squared error) loss is added to the bounding box regression layer of the detection branch.

Specifically, the first calculation unit 440 may calculate, by the first segmentation model, the anchor box offsets [x, y, x_offset, y_offset] associated with the base class objects in the image.

The second calculation unit 450 may calculate, by the second segmentation model provided by the processing unit, the anchor box offsets [x, y, x_offset, y_offset] associated with the base class objects in the image.

The comparison unit 460 may compare the anchor box offsets calculated by the first calculation unit 440 and the second calculation unit 450, for example by calculating the MSE loss between them.

The second calculation unit 450 then modifies the second segmentation model based on the MSE loss of the anchor box offsets. Preferably, the second calculation unit 450 modifies the second segmentation model by minimizing the difference between the first data set and the second data set.
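The comparison of the two sets of anchor box offsets can be sketched as a plain mean squared error. In this minimal illustration the first segmentation model plays the role of a fixed teacher and the second model that of the student being trained; the function name `mse_distillation_loss`, the per-anchor row layout, and the numeric values are assumptions made for the example.

```python
from typing import Sequence


def mse_distillation_loss(teacher_offsets: Sequence[Sequence[float]],
                          student_offsets: Sequence[Sequence[float]]) -> float:
    """Mean squared error over every offset component of every anchor box."""
    if len(teacher_offsets) != len(student_offsets):
        raise ValueError("both models must report the same number of anchors")
    total, count = 0.0, 0
    for t_row, s_row in zip(teacher_offsets, student_offsets):
        if len(t_row) != len(s_row):
            raise ValueError("offset vectors must have the same length")
        for t, s in zip(t_row, s_row):
            total += (t - s) ** 2
            count += 1
    if count == 0:
        raise ValueError("no offsets to compare")
    return total / count


# One row per anchor box: [x, y, x_offset, y_offset] (toy values).
teacher = [[0.10, 0.20, 0.05, -0.05], [0.40, 0.30, 0.00, 0.10]]
student = [[0.10, 0.20, 0.05, -0.05], [0.40, 0.30, 0.20, 0.10]]

loss = mse_distillation_loss(teacher, student)  # only one component differs, by 0.2
```

During training only the second model's parameters would be updated to reduce this value, so its box regression for base class objects stays close to that of the first model.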

（２）セグメンテーションモデルの分類ブランチに蒸留損失を追加する例
好ましくは、第１の計算部４４０及び第２の計算部４５０は、画像における基本クラスの対象に関連する他のパラメータをさらに計算してもよく、比較部４６０は、他のパラメータを比較することで、第２のセグメンテーションモデルをさらに修正してもよい。例えば、他のパラメータ（第２のパラメータ）は、分類の信頼度であってもよい。第２のパラメータのデータセットは、分類の信頼度の計算結果である。この例では、ＫＬＤｉｖ（Ｋｕｌｌｂａｃｋ－Ｌｅｉｂｌｅｒｄｉｖｅｒｇｅｎｃｅ：ＫＬダイバージェンス）損失は、分類ブランチのｌｏｇｉｔｓ層に追加される。 (2) Example of adding a distillation loss to the classification branch of the segmentation model: Preferably, the first calculation unit 440 and the second calculation unit 450 may further calculate another parameter related to the base class objects in the image, and the comparison unit 460 may further modify the second segmentation model by comparing that other parameter. For example, the other parameter (second parameter) may be the classification confidence. The data set of the second parameter is the result of calculating the classification confidences. In this example, a KLDiv (Kullback-Leibler divergence) loss is added to the logits layer of the classification branch.

具体的には、第１の計算部４４０は、第１のセグメンテーションモデルにより、画像における基本クラスの対象に関連する分類の信頼度を計算してもよい。 Specifically, the first computation unit 440 may compute the confidence of the classification associated with the base class object in the image according to the first segmentation model.

また、第２の計算部４５０は、処理部により提供された第２のセグメンテーションモデルにより、画像における基本クラスの対象に関連する分類の信頼度を計算してもよい。 The second computing unit 450 may also compute the confidence of the classification associated with the base class object in the image according to the second segmentation model provided by the processing unit.

また、比較部４６０は、第１の計算部４４０と第２の計算部４５０により計算された基本クラスの分類の信頼度を比較し、例えば、両者のＫＬＤｉｖ損失を計算し、第２の計算部４５０に第２のセグメンテーションモデルをさらに修正するための分類の信頼度の比較情報を提供してもよい。同様に、第１の計算部４４０と第２の計算部４５０により計算された基本クラスの分類の信頼度の間の差を最小化することによって、第２のセグメンテーションモデルを修正する。 Further, the comparison unit 460 may compare the base class classification confidences calculated by the first calculation unit 440 and the second calculation unit 450, for example by calculating the KLDiv loss between the two, and may provide the second calculation unit 450 with classification confidence comparison information for further modifying the second segmentation model. Similarly, the second segmentation model is modified by minimizing the difference between the base class classification confidences calculated by the first calculation unit 440 and the second calculation unit 450.

図５は、本開示の実施例に係る計算された分類の信頼度を示す概略図であり、それぞれ第１のセグメンテーションモデル及び第２のセグメンテーションモデルにより計算された信頼度の例を示している。図５に示すように、上部は第１のセグメンテーションモデルにより計算された分類の信頼度の例
（外１）

であり、ここで、ｎは、基本クラスの数を表す。図５の下部は、第２のセグメンテーションモデルにより計算された分類の信頼度の例［ｏ_１，ｏ_２，…，ｏ_ｎ，ｏ_ｎ＋１，…，ｏ_ｎ＋ｍ］を示し、ここで、前のｎ個は、ｎ個の基本クラスの分類の信頼度であり、後のｍ個は、ｍ個の新規クラスの分類の信頼度である。比較部４６０は、
（外２）

と［ｏ_１，ｏ_２，…，ｏ_ｎ］とを比較し、例えば両者の間のＫＬＤｉｖ損失を計算する。 FIG. 5 is a schematic diagram illustrating calculated classification confidences according to an embodiment of the present disclosure, showing examples of the confidences calculated by the first segmentation model and the second segmentation model, respectively. As shown in FIG. 5, the upper part is an example of the classification confidences calculated by the first segmentation model, (external character 1), where n represents the number of base classes. The lower part of FIG. 5 shows an example of the classification confidences calculated by the second segmentation model, [o_1, o_2, …, o_n, o_{n+1}, …, o_{n+m}], where the first n entries are the classification confidences of the n base classes and the last m entries are the classification confidences of the m new classes. The comparison unit 460 compares (external character 2) with [o_1, o_2, …, o_n] and, for example, calculates the KLDiv loss between the two.

また、第２の計算部４５０は、分類の信頼度のＫＬＤｉｖ損失に基づいて、第２のセグメンテーションモデルをさらに修正する。 The second computation unit 450 also further modifies the second segmentation model based on the KLDiv loss of classification confidence.
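The classification comparison described above can be sketched as follows. This is a hedged illustration only: the text does not state how the confidences are normalized, so applying a softmax to the first model's n base class values and, separately, to the first n values of the second model is an assumption of this example, as are the function names and the direction KL(first || second).

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float],
                  eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def classification_distillation_loss(first_logits: Sequence[float],
                                     second_logits: Sequence[float]) -> float:
    """Compare the n base class outputs of both models.

    `first_logits` has n entries (base classes only); `second_logits` has
    n + m entries, of which only the first n (base classes) are compared.
    """
    n = len(first_logits)
    if len(second_logits) < n:
        raise ValueError("second model must output at least n base classes")
    p = softmax(first_logits)
    q = softmax(second_logits[:n])  # the m new class entries are ignored
    return kl_divergence(p, q)


# Hypothetical logits: n = 2 base classes, m = 1 new class.
same = classification_distillation_loss([1.0, 2.0], [1.0, 2.0, 9.0])
different = classification_distillation_loss([2.0, 0.0], [0.0, 2.0, 5.0])
```

In this sketch the loss is zero when the second model reproduces the first model's base class distribution, whatever it predicts for the new classes, and grows as the two distributions diverge.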

図６は、本開示の実施例に係る画像処理装置の内部コンピューティングの詳細を示す概略図である。アンカーボックスのオフセット及び分類の信頼度の損失を計算する上記のプロセスでは、第１のセグメンテーションモデルと第２のセグメンテーションモデルにより計算された複数のアンカーボックスに対応するアンカーボックスのオフセット及び分類の信頼度に対してマッチング及び選択を行い、比較のためのアンカーボックスのオフセットのデータセット及び分類の信頼度のデータセットを取得してもよい。以下は詳細に説明する。 FIG. 6 is a schematic diagram illustrating details of the internal computation of an image processing apparatus according to an embodiment of the present disclosure. In the above process of calculating the losses for the anchor box offsets and the classification confidences, matching and selection may be performed on the anchor box offsets and classification confidences that the first segmentation model and the second segmentation model calculate for the plurality of anchor boxes, so as to obtain a data set of anchor box offsets and a data set of classification confidences for comparison. Details are provided below.

具体的には、第２の計算部４５０は、第２のセグメンテーションモデルにより、第２のセグメンテーションモデルにおける複数のアンカーボックスのそれぞれと画像における対象との重複度、例えばＪａｃｃａｒｄ係数を計算する。また、第２の計算部４５０は、重複度が所定の閾値よりも大きく、且つ重複する対象のクラスが基本クラスであるアンカーボックスを選択する。 Specifically, the second calculation unit 450 calculates the overlap between each of the plurality of anchor boxes in the second segmentation model and the object in the image, such as the Jaccard coefficient, using the second segmentation model. In addition, the second calculation unit 450 selects anchor boxes whose overlap degree is greater than a predetermined threshold and whose overlap target class is the base class.
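The degree of overlap mentioned above can be illustrated with the Jaccard coefficient (intersection over union) of two axis-aligned boxes. This is a minimal sketch: the corner format (x1, y1, x2, y2) and the function name `jaccard` are assumptions of this example, since the text does not fix a box representation.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def jaccard(box_a: Box, box_b: Box) -> float:
    """Jaccard coefficient (intersection over union) of two boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes that share a 1x2 strip.
overlap = jaccard((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))
```

An anchor box would then be kept only when this value exceeds the predetermined threshold and the overlapped object belongs to a base class.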

また、第２の計算部４５０は、選択されたアンカーのインデックスに基づいて、第２のセグメンテーションモデルにより取得された複数のアンカーボックスに対応する第１のパラメータ又は第２のパラメータのデータセットから、アンカーボックスのインデックスにマッチしたデータセット、例えば図６の下部のマッチしたアンカーボックスのオフセット及びマッチした分類の信頼度のデータセットを選択する。 Also, the second calculation unit 450, based on the index of the selected anchor, from the data set of the first parameter or the second parameter corresponding to the plurality of anchor boxes obtained by the second segmentation model, Select the data set that matches the index of the anchor box, eg, the data set of the matched anchor box offset and the matched classification confidence at the bottom of FIG.

また、第２の計算部４５０は、選択されたアンカーボックスのインデックスを第１の計算部４４０に提供する。 The second calculator 450 also provides the index of the selected anchor box to the first calculator 440 .

次に、第１の計算部４４０は、選択されたアンカーボックスのインデックスに基づいて、第１のセグメンテーションモデルにより取得された複数のアンカーボックスに対応する第１のパラメータ又は第２のパラメータのデータセットから、アンカーボックスのインデックスにマッチしたデータセット、例えば図６の上部のマッチしたアンカーボックスのオフセットのデータセット及びマッチした分類の信頼度のデータセットを選択する。 Next, the first computation unit 440 computes a data set of first parameters or second parameters corresponding to the plurality of anchor boxes obtained by the first segmentation model, based on the indices of the selected anchor boxes. , select the dataset that matches the index of the anchor box, eg, the dataset of matched anchor box offsets at the top of FIG. 6 and the dataset of matched classification confidence.

比較部４６０は、第１の計算部４４０及び第２の計算部４５０により選択されたマッチしたアンカーボックスのオフセット及び、第１の計算部４４０及び第２の計算部４５０により選択されたマッチした分類の信頼度を比較してもよい。 The comparison unit 460 may compare the matched anchor box offsets selected by the first calculation unit 440 and the second calculation unit 450, and the matched classification confidences selected by the first calculation unit 440 and the second calculation unit 450.

一例として、第２のセグメンテーションモデルには、５７７４４個のアンカーボックスがある。上記の重複度を０．５より大きくすることで、６００～７００個のアンカーボックスを選択することができる。そのうちの４００～５００個のアンカーボックスのみが基本クラスに属する。これらの４００～５００個のアンカーボックスのインデックスは、記録され、第１のセグメンテーションモデルと第２のセグメンテーションモデルにより計算されたアンカーボックスに対応するパラメータのデータセットをフィルタリングするために使用される。選択されたパラメータのデータセットは、損失関数の計算に使用され、第２のセグメンテーションモデルを修正するために使用される。 As an example, the second segmentation model has 57744 anchor boxes. Requiring the above degree of overlap to be greater than 0.5 selects 600 to 700 anchor boxes, of which only 400 to 500 belong to the base classes. The indices of these 400 to 500 anchor boxes are recorded and used to filter the per-anchor-box parameter data sets calculated by the first segmentation model and the second segmentation model. The selected parameter data sets are used to calculate the loss functions and thereby to modify the second segmentation model.
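The selection and index-based filtering described above can be sketched as follows. The function names, the per-anchor input lists and the sample values are hypothetical and chosen only for illustration; the threshold of 0.5 is taken from the example in the text.

```python
from typing import List, Sequence, Set, TypeVar

T = TypeVar("T")


def select_base_anchor_indices(overlaps: Sequence[float],
                               matched_classes: Sequence[int],
                               base_classes: Set[int],
                               threshold: float = 0.5) -> List[int]:
    """Indices of anchor boxes whose overlap exceeds the threshold and
    whose overlapped object belongs to a base class."""
    return [
        i
        for i, (overlap, cls) in enumerate(zip(overlaps, matched_classes))
        if overlap > threshold and cls in base_classes
    ]


def gather(per_anchor_values: Sequence[T], indices: Sequence[int]) -> List[T]:
    """Keep only the entries at the recorded anchor box indices."""
    return [per_anchor_values[i] for i in indices]


# Five hypothetical anchor boxes; classes 0..2 are base, class 5 is new.
overlaps = [0.9, 0.6, 0.3, 0.7, 0.55]
matched_classes = [0, 5, 1, 2, 1]
indices = select_base_anchor_indices(overlaps, matched_classes, {0, 1, 2})

# The same indices filter the outputs of both models, so that entry k of
# one filtered list corresponds to entry k of the other.
first_model_conf = ["a0", "a1", "a2", "a3", "a4"]
second_model_conf = ["b0", "b1", "b2", "b3", "b4"]
first_selected = gather(first_model_conf, indices)
second_selected = gather(second_model_conf, indices)
```

Sharing one index list between the two models is what keeps the compared data sets aligned box by box before the loss is calculated.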

これによって、本開示の実施例に係る画像処理装置１００は、２つの蒸留損失（ＭＳＥ損失、ＫＬＤｉｖ損失）を追加することで、訓練モデルの破局的忘却を防止することができる。 Accordingly, the image processing apparatus 100 according to the embodiment of the present disclosure can prevent catastrophic forgetting of the training model by adding two distillation losses (MSE loss, KLDiv loss).

以下は、図７を参照しながら、本開示の実施例に係る画像処理方法を説明する。 An image processing method according to an embodiment of the present disclosure will be described below with reference to FIG.

図７に示すように、本開示の実施例に係る画像処理方法は、ステップＳ１１０から開始する。ステップＳ１１０において、第１のセグメンテーションモデルにより画像における基本クラスの対象の対象セグメンテーションに関連する疑似ラベルを取得する。第１のセグメンテーションモデルは、基本クラスの対象のセグメンテーションに使用される。 As shown in FIG. 7, the image processing method according to embodiments of the present disclosure begins at step S110. In step S110, the pseudo-labels associated with the object segmentation of the base class objects in the image are obtained by the first segmentation model. A first segmentation model is used for segmentation of base class objects.

次に、ステップＳ１２０において、画像における基本クラスとは異なる新規クラスの対象の対象セグメンテーションに関連する新規ラベルを取得する。 Next, in step S120, the new label associated with the object segmentation of the object of the new class different from the base class in the image is obtained.

次に、ステップＳ１３０において、疑似ラベル及び新規ラベルに基づいて、基本クラス及び新規クラスの対象のセグメンテーションのための第２のセグメンテーションモデルを取得する。 Then, in step S130, obtain a second segmentation model for segmentation of the object of the base class and the new class based on the pseudo-label and the new label.
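Steps S110 to S130 can be pictured as building one training annotation set from two sources. The sketch below is an assumption-laden illustration, not the method of the disclosure: the `Label` type, its `mask_id` stand-in for a real mask, and the simple concatenation rule are all hypothetical. The text only states that the second segmentation model is obtained based on the pseudo-labels and the new labels.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple


@dataclass(frozen=True)
class Label:
    class_id: int                           # classification identifier
    box: Tuple[float, float, float, float]  # bounding box
    mask_id: str                            # placeholder for a real mask


def build_training_labels(pseudo_labels: Sequence[Label],
                          new_labels: Sequence[Label],
                          base_classes: Set[int],
                          new_classes: Set[int]) -> List[Label]:
    """Combine pseudo-labels (base classes, predicted by the first model)
    with new labels (new classes) into one annotation list for an image."""
    if base_classes & new_classes:
        raise ValueError("base and new classes must be disjoint")
    kept_pseudo = [l for l in pseudo_labels if l.class_id in base_classes]
    kept_new = [l for l in new_labels if l.class_id in new_classes]
    return kept_pseudo + kept_new


pseudo = [Label(0, (0.0, 0.0, 10.0, 10.0), "mask-a")]
new = [Label(7, (5.0, 5.0, 20.0, 20.0), "mask-b")]
training_labels = build_training_labels(pseudo, new, {0, 1}, {7})
```

Under this reading, only the new classes need to be labeled for the image, while the base classes are covered by the first model's predictions.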

次に、ステップＳ１４０において、第１のセグメンテーションモデルにより、画像における基本クラスの対象に関連する第１のパラメータの第１のデータセットを計算する。 Next, in step S140, the first segmentation model computes a first data set of first parameters associated with the base class objects in the image.

次に、ステップＳ１５０において、第２のセグメンテーションモデルにより、画像における基本クラスの対象に関連する第１のパラメータの第２のデータセットを計算する。 Next, in step S150, a second data set of first parameters associated with the base class objects in the image is calculated by a second segmentation model.

次に、ステップＳ１６０において、第１のデータセットと第２のデータセットとを比較し、第２のセグメンテーションモデルを修正する。この後、プロセスは終了する。 Next, in step S160, the first data set and the second data set are compared to modify the second segmentation model. After this the process ends.

本開示の実施例では、第１のデータセットと第２のデータセットとの差を最小化するように第２のセグメンテーションモデルを修正する。 Embodiments of the present disclosure modify the second segmentation model to minimize the difference between the first data set and the second data set.

本開示の実施例では、該方法は、第２のセグメンテーションモデルにより、第２のセグメンテーションモデルにおける複数のアンカーボックスのそれぞれと画像における対象との重複度を計算するステップ、をさらに含む。第１のデータセットを計算するステップは、重複度に基づいて、第１のセグメンテーションモデルにより取得された複数のアンカーボックスに対応する第１のパラメータのデータセットから第１のデータセットを選択するステップを含む。第２のデータセットを計算するステップは、重複度に基づいて、第２のセグメンテーションモデルにより取得された複数のアンカーボックスに対応する第１のパラメータのデータセットから第２のデータセットを選択するステップを含む。 In an embodiment of the present disclosure, the method further includes calculating, with the second segmentation model, the degree of overlap between each of the plurality of anchor boxes in the second segmentation model and the object in the image. Calculating the first data set includes selecting, based on the degree of overlap, the first data set from the first parameter data set corresponding to the plurality of anchor boxes obtained by the first segmentation model. Calculating the second data set includes selecting, based on the degree of overlap, the second data set from the first parameter data set corresponding to the plurality of anchor boxes obtained by the second segmentation model.

本開示の実施例では、第２のセグメンテーションモデルは、重複度が所定の閾値よりも大きく、且つ重複する対象のクラスが基本クラスであるアンカーボックスを選択する。第１のセグメンテーションモデル及び第２のセグメンテーションモデルは、選択されたアンカーボックスのインデックスに基づいて第１のデータセット及び第２のデータセットをそれぞれ選択する。 In an embodiment of the present disclosure, the second segmentation model selects anchor boxes whose degree of overlap is greater than a predetermined threshold and whose overlapping target class is the base class. The first segmentation model and the second segmentation model respectively select the first dataset and the second dataset based on the index of the selected anchor box.

本開示の実施例では、該方法は、第１のセグメンテーションモデルにより、入力画像における基本クラスの対象に関連する第２のパラメータの第３のデータセットを計算するステップと、第２のセグメンテーションモデルにより、入力画像における基本クラスの対象に関連する第２のパラメータの第４のデータセットを計算するステップと、第３のデータセットと第４のデータセットとを比較し、第２のセグメンテーションモデルをさらに修正するステップと、をさらに含む。 In an embodiment of the present disclosure, the method further includes: calculating, with the first segmentation model, a third data set of a second parameter associated with the base class objects in the input image; calculating, with the second segmentation model, a fourth data set of the second parameter associated with the base class objects in the input image; and comparing the third data set with the fourth data set to further modify the second segmentation model.

本開示の実施例では、第３のデータセットと第４のデータセットとの差を最小化するように第２のセグメンテーションモデルを修正する。 Embodiments of the present disclosure modify the second segmentation model to minimize the difference between the third data set and the fourth data set.

本開示の実施例では、第１のパラメータは、アンカーボックスのオフセットであり、第２のパラメータは、分類の信頼度である。この場合、ＭＳＥ損失関数を使用して第１のデータセットと第２のデータセットとを比較し、ＫＬＤｉｖ損失関数を使用して第３のデータセットと第４のデータセットとを比較する。 In an embodiment of the present disclosure, the first parameter is the anchor box offset and the second parameter is the classification confidence. In this case, the MSE loss function is used to compare the first data set to the second data set, and the KLDiv loss function is used to compare the third data set to the fourth data set.

本開示の実施例では、疑似ラベル及び新規ラベルは、画像における対応する対象の分類識別子、バウンディングボックス及びマスクを含む。また、第１のセグメンテーションモデル及び第２のセグメンテーションモデルは、ｙｏｌａｃｔ＋＋インスタンスセグメンテーションモデルである。 In embodiments of the present disclosure, the pseudo-labels and novel labels include the classification identifier, bounding box and mask of the corresponding object in the image. Also, the first segmentation model and the second segmentation model are yolact++ instance segmentation models.

これによって、本開示の実施例に係る画像処理方法は、新規クラスのラベルのみでインスタンスセグメンテーションのクラス増分学習を実現し、破局的忘却の問題を解決することができる。 Accordingly, the image processing method according to the embodiment of the present disclosure can achieve class-incremental learning for instance segmentation using only labels of the new classes, and can solve the problem of catastrophic forgetting.

本開示の実施例に係る画像処理方法の上記ステップの各態様は、既に詳細に説明されており、ここでその説明を省略する。 Each aspect of the above steps of the image processing method according to the embodiments of the present disclosure has already been described in detail, and the description thereof is omitted here.

なお、本開示に係る画像処理方法の各処理は、各種の機器が読み取り可能な記憶媒体に記憶されたコンピュータ実行可能なプログラムにより実現されてもよい。 Each process of the image processing method according to the present disclosure may be realized by a computer-executable program stored in a storage medium readable by various devices.

また、本開示の目的は、上記実行可能なプログラムコードを記憶した記憶媒体をシステム又は装置に直接的又は間接的に提供し、該システム又は装置におけるコンピュータ又は中央処理装置（ＣＰＵ）が該プログラムコードを読み出して実行することによって実現されてもよい。この場合は、該システム又は装置はプログラムを実行可能な機能を有すればよく、本開示の実施形態はプログラムに限定されない。また、該プログラムは任意の形式であってもよく、例えば対象プログラム、インタプリタによって実行されるプログラム、又はオペレーティングシステムに提供されるスクリプトプログラム等であってもよい。 The object of the present disclosure may also be achieved by directly or indirectly providing a storage medium storing the above executable program code to a system or device, and having a computer or central processing unit (CPU) in the system or device read and execute the program code. In this case, the system or device only needs to be capable of executing a program, and the embodiments of the present disclosure are not limited to programs. The program may be in any form, for example an object program, a program executed by an interpreter, or a script program provided to an operating system.

上記の機器が読み取り可能な記憶媒体は、各種のメモリ、記憶部、半導体装置、光ディスク、磁気ディスク及び光磁気ディスクのようなディスク、並びに情報を記憶可能な他の媒体等を含むが、これらに限定されない。 The above machine-readable storage media include, but are not limited to, various memories, storage units, semiconductor devices, disks such as optical disks, magnetic disks and magneto-optical disks, and other media capable of storing information.

また、コンピュータがインターネット上の対応するウェブサイトに接続され、本開示のコンピュータプログラムコードをコンピュータにダウンロード、インストール、そして実行することによって、本開示の実施形態を実現することができる。 Also, the embodiments of the present disclosure can be realized by connecting the computer to a corresponding website on the Internet and downloading, installing, and executing the computer program code of the present disclosure on the computer.

図８は、本開示の実施例に係る画像処理装置及び方法を実現可能な汎用パーソナルコンピュータの例示的な構成を示すブロック図である。 FIG. 8 is a block diagram showing an exemplary configuration of a general-purpose personal computer that can implement the image processing apparatus and method according to the embodiments of the present disclosure.

図８に示すように、ＣＰＵ８０１は、読み出し専用メモリ（ＲＯＭ）８０２に記憶されているプログラム、又は記憶部８０８からランダムアクセスメモリ（ＲＡＭ）８０３にロードされたプログラムにより各種の処理を実行する。ＲＡＭ８０３には、必要に応じて、ＣＰＵ８０１が各種の処理を実行するに必要なデータが記憶されている。ＣＰＵ８０１、ＲＯＭ８０２、及びＲＡＭ８０３は、バス８０４を介して互いに接続されている。入力／出力インターフェース８０５もバス８０４に接続されている。 As shown in FIG. 8, the CPU 801 executes various processes using programs stored in a read only memory (ROM) 802 or programs loaded from a storage unit 808 to a random access memory (RAM) 803 . The RAM 803 stores data necessary for the CPU 801 to execute various processes as needed. The CPU 801 , ROM 802 and RAM 803 are interconnected via a bus 804 . Input/output interface 805 is also connected to bus 804 .

入力部８０６（キーボード、マウスなどを含む）、出力部８０７（ディスプレイ、例えばブラウン管（ＣＲＴ）、液晶ディスプレイ（ＬＣＤ）など、及びスピーカなどを含む）、記憶部８０８（例えばハードディスクなどを含む）、通信部８０９（例えばネットワークのインタフェースカード、例えばＬＡＮカード、モデムなどを含む）は、入力／出力インターフェース８０５に接続されている。通信部８０９は、ネットワーク、例えばインターネットを介して通信処理を実行する。必要に応じて、ドライバ８１０は、入力／出力インターフェース８０５に接続されてもよい。取り外し可能な媒体８１１は、例えば磁気ディスク、光ディスク、光磁気ディスク、半導体メモリなどであり、必要に応じてドライバ８１０にセットアップされて、その中から読みだされたコンピュータプログラムは必要に応じて記憶部８０８にインストールされている。 An input unit 806 (including a keyboard, a mouse, etc.), an output unit 807 (including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a speaker, etc.), a storage unit 808 (including, for example, a hard disk), and a communication unit 809 (including a network interface card such as a LAN card, a modem, etc.) are connected to the input/output interface 805. The communication unit 809 executes communication processing via a network such as the Internet. A drive 810 may be connected to the input/output interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, and a computer program read from it is installed in the storage unit 808 as needed.

ソフトウェアにより上記処理を実施する場合、ネットワーク、例えばインターネット、又は記憶媒体、例えば取り外し可能な媒体８１１を介してソフトウェアを構成するプログラムをインストールする。 When the above processing is performed by software, a program that constitutes the software is installed via a network such as the Internet or a storage medium such as the removable medium 811 .

なお、これらの記憶媒体は、図８に示されている、プログラムを記憶し、機器と分離してユーザへプログラムを提供する取り外し可能な媒体８１１に限定されない。取り外し可能な媒体８１１は、例えば磁気ディスク（フロッピーディスク（登録商標）を含む）、光ディスク（光ディスク－読み出し専用メモリ（ＣＤ－ＲＯＭ）、及びデジタル多目的ディスク（ＤＶＤ）を含む）、光磁気ディスク（ミニディスク（ＭＤ）（登録商標））及び半導体メモリを含む。或いは、記憶媒体は、ＲＯＭ８０２、記憶部８０８に含まれるハードディスクなどであってもよく、プログラムを記憶し、それらを含む機器と共にユーザへ提供される。 Note that these storage media are not limited to the removable media 811 shown in FIG. 8 that stores the program and provides the program to the user separately from the device. Removable media 811 may be, for example, magnetic disks (including floppy disks), optical disks (including optical disks—read only memory (CD-ROM) and digital versatile disks (DVD)), magneto-optical disks (mini disk (MD) (registered trademark)) and semiconductor memory. Alternatively, the storage medium may be the ROM 802, the hard disk included in the storage unit 808, or the like, which stores the program and is provided to the user together with the device including them.

なお、本開示のシステム及び方法では、各ユニット又は各ステップを分解且つ、或いは再組み合わせてもよい。これらの分解及び／又は再組み合わせは、本開示と同等であると見なされる。また、本開示の方法は、明細書に説明された時間的順序で実行するものに限定されず、他の時間的順序で順次、並行、又は独立して実行されてもよい。このため、本明細書に説明された方法の実行順序は、本開示の技術的な範囲を限定するものではない。 It should be noted that in the system and method of the present disclosure, each unit or each step may be disassembled and/or recombined. These disassembly and/or recombination are considered equivalents of this disclosure. Also, the methods of the present disclosure are not limited to performing in the chronological order set forth herein, and may be performed sequentially, in parallel, or independently in other chronological orders. As such, the order in which the methods described herein are performed should not limit the scope of this disclosure.

以上は図面を参照しながら本開示の実施例を詳細に説明しているが、上述した実施形態及び実施例は単なる例示的なものであり、本開示を限定するものではない。当業者は、特許請求の範囲の主旨及び範囲内で本開示に対して各種の修正、変更を行ってもよい。これらの修正、変更は本開示の保護範囲に含まれるものである。 Although the embodiments of the present disclosure are described in detail above with reference to the drawings, the above-described embodiments and examples are merely illustrative and are not intended to limit the present disclosure. Those skilled in the art may make various modifications and changes to this disclosure within the spirit and scope of the claims. These modifications and changes fall within the protection scope of this disclosure.

また、上述の各実施例を含む実施形態に関し、更に以下の付記を開示する。
（付記１）
画像処理装置であって、
第１のセグメンテーションモデルにより画像における基本クラスの対象の対象セグメンテーションに関連する疑似ラベルを取得する第１の取得部であって、前記第１のセグメンテーションモデルは、基本クラスの対象のセグメンテーションに使用される、第１の取得部と、
前記画像における前記基本クラスとは異なる新規クラスの対象の対象セグメンテーションに関連する新規ラベルを取得する第２の取得部と、
前記疑似ラベル及び前記新規ラベルに基づいて、前記基本クラス及び前記新規クラスの対象のセグメンテーションのための第２のセグメンテーションモデルを取得する処理部と、を含み、
前記第１のセグメンテーションモデル及び前記第２のセグメンテーションモデルは、ニューラルネットワークにより実現され、
前記画像処理装置は、
前記第１のセグメンテーションモデルにより、前記画像における前記基本クラスの対象に関連する第１のパラメータの第１のデータセットを計算する第１の計算部と、
前記第２のセグメンテーションモデルにより、前記画像における前記基本クラスの対象に関連する第１のパラメータの第２のデータセットを計算する第２の計算部と、
前記第１のデータセットと前記第２のデータセットとを比較し、前記第２のセグメンテーションモデルを修正するように前記第２の計算部に比較情報を提供する比較部と、をさらに含む、画像処理装置。
（付記２）
前記第２の計算部は、前記第１のデータセットと前記第２のデータセットとの差を最小化するように前記第２のセグメンテーションモデルを修正する、付記１に記載の画像処理装置。
（付記３）
前記第２の計算部は、前記第２のセグメンテーションモデルにより、前記第２のセグメンテーションモデルにおける複数のアンカーボックスのそれぞれと前記画像における対象との重複度を計算し、
前記第１の計算部は、前記重複度に基づいて、前記第１のセグメンテーションモデルにより取得された前記複数のアンカーボックスに対応する第１のパラメータのデータセットから前記第１のデータセットを選択し、
前記第２の計算部は、前記重複度に基づいて、前記第２のセグメンテーションモデルにより取得された前記複数のアンカーボックスに対応する第１のパラメータのデータセットから前記第２のデータセットを選択する、付記１に記載の画像処理装置。
（付記４）
前記第２の計算部は、前記重複度が所定の閾値よりも大きく、且つ重複する対象のクラスが基本クラスであるアンカーボックスを選択し、
前記第１の計算部及び前記第２の計算部は、選択されたアンカーボックスのインデックスに基づいて前記第１のデータセット及び前記第２のデータセットをそれぞれ選択する、付記３に記載の画像処理装置。
（付記５）
前記第１の計算部は、前記第１のセグメンテーションモデルにより、前記画像における前記基本クラスの対象に関連する第２のパラメータの第３のデータセットを計算し、
前記第２の計算部は、前記第２のセグメンテーションモデルにより、前記画像における前記基本クラスの対象に関連する第２のパラメータの第４のデータセットを計算し、
前記比較部は、前記第３のデータセットと前記第４のデータセットとを比較し、前記第２のセグメンテーションモデルをさらに修正するように前記第２の計算部に他の比較情報を提供する、付記１乃至４の何れかに記載の画像処理装置。
（付記６）
前記第２の計算部は、前記第３のデータセットと前記第４のデータセットとの差を最小化するように前記第２のセグメンテーションモデルを修正する、付記５に記載の画像処理装置。
（付記７）
前記第１のパラメータは、アンカーボックスのオフセットであり、
前記第２のパラメータは、分類の信頼度である、付記５に記載の画像処理装置。
（付記８）
ＭＳＥ損失関数を使用して前記第１のデータセットと前記第２のデータセットとを比較し、
ＫＬＤｉｖ損失関数を使用して前記第３のデータセットと前記第４のデータセットとを比較する、付記７に記載の画像処理装置。
（付記９）
前記疑似ラベル及び前記新規ラベルは、前記画像における対応する対象の分類識別子、バウンディングボックス及びマスクを含む、付記１に記載の画像処理装置。
（付記１０）
画像処理方法であって、
第１のセグメンテーションモデルにより画像における基本クラスの対象の対象セグメンテーションに関連する疑似ラベルを取得するステップであって、前記第１のセグメンテーションモデルは、基本クラスの対象のセグメンテーションに使用される、ステップと、
前記画像における前記基本クラスとは異なる新規クラスの対象の対象セグメンテーションに関連する新規ラベルを取得するステップと、
前記疑似ラベル及び前記新規ラベルに基づいて、前記基本クラス及び前記新規クラスの対象のセグメンテーションのための第２のセグメンテーションモデルを取得するステップと、を含み、
前記第１のセグメンテーションモデル及び前記第２のセグメンテーションモデルは、ニューラルネットワークにより実現され、
前記画像処理方法は、
前記第１のセグメンテーションモデルにより、前記画像における前記基本クラスの対象に関連する第１のパラメータの第１のデータセットを計算するステップと、
前記第２のセグメンテーションモデルにより、前記画像における前記基本クラスの対象に関連する第１のパラメータの第２のデータセットを計算するステップと、
前記第１のデータセットと前記第２のデータセットとを比較し、前記第２のセグメンテーションモデルを修正するステップと、をさらに含む、画像処理方法。
（付記１１）
前記第１のデータセットと前記第２のデータセットとの差を最小化するように前記第２のセグメンテーションモデルを修正する、付記１０に記載の画像処理方法。
（付記１２）
前記第２のセグメンテーションモデルにより、前記第２のセグメンテーションモデルにおける複数のアンカーボックスのそれぞれと前記画像における対象との重複度を計算するステップ、をさらに含み、
前記第１のデータセットを計算するステップは、前記重複度に基づいて、前記第１のセグメンテーションモデルにより取得された前記複数のアンカーボックスに対応する第１のパラメータのデータセットから前記第１のデータセットを選択するステップ、を含み、
前記第２のデータセットを計算するステップは、前記重複度に基づいて、前記第２のセグメンテーションモデルにより取得された前記複数のアンカーボックスに対応する第１のパラメータのデータセットから前記第２のデータセットを選択するステップ、を含む、付記１０に記載の画像処理方法。
（付記１３）
前記第２のセグメンテーションモデルは、前記重複度が所定の閾値よりも大きく、且つ重複する対象のクラスが基本クラスであるアンカーボックスを選択し、
前記第１のセグメンテーションモデル及び前記第２のセグメンテーションモデルは、選択されたアンカーボックスのインデックスに基づいて前記第１のデータセット及び前記第２のデータセットをそれぞれ選択する、付記１２に記載の画像処理方法。
（付記１４）
前記第１のセグメンテーションモデルにより、前記入力画像における前記基本クラスの対象に関連する第２のパラメータの第３のデータセットを計算するステップと、
前記第２のセグメンテーションモデルにより、前記入力画像における前記基本クラスの対象に関連する第２のパラメータの第４のデータセットを計算するステップと、
前記第３のデータセットと前記第４のデータセットとを比較し、前記第２のセグメンテーションモデルをさらに修正するステップと、をさらに含む、付記１０乃至１３の何れかに記載の画像処理方法。
（付記１５）
前記第３のデータセットと前記第４のデータセットとの差を最小化するように前記第２のセグメンテーションモデルを修正する、付記１４に記載の画像処理方法。
（付記１６）
前記第１のパラメータは、アンカーボックスのオフセットであり、
前記第２のパラメータは、分類の信頼度である、付記１４に記載の画像処理方法。
（付記１７）
ＭＳＥ損失関数を使用して前記第１のデータセットと前記第２のデータセットとを比較し、
ＫＬＤｉｖ損失関数を使用して前記第３のデータセットと前記第４のデータセットとを比較する、付記１６に記載の画像処理方法。
（付記１８）
前記疑似ラベル及び前記新規ラベルは、前記画像における対応する対象の分類識別子、バウンディングボックス及びマスクを含む、付記１０に記載の画像処理方法。
（付記１９）
前記第１のセグメンテーションモデル及び前記第２のセグメンテーションモデルは、ｙｏｌａｃｔ＋＋インスタンスセグメンテーションモデルである、付記１０に記載の画像処理方法。
（付記２０）
機器読み取り可能な命令コードを記憶しているプログラムプロダクトが記録された機器読み取り可能な記憶媒体であって、前記命令コードがコンピュータにより読み取られて実行される際に、前記コンピュータに付記１０乃至１９の何れかに記載の画像処理方法を実行させることができる、記憶媒体。 In addition, the following additional remarks will be disclosed regarding the embodiments including the above-described examples.
(Appendix 1)
An image processing device,
A first acquisition unit for acquiring pseudo-labels associated with object segmentation of base class objects in an image by a first segmentation model, wherein the first segmentation model is used for segmentation of base class objects. , a first obtaining unit;
a second obtaining unit for obtaining a new label associated with an object segmentation of an object of a new class different from the base class in the image;
a processing unit for obtaining a second segmentation model for segmentation of objects of the base class and the new class based on the pseudo-label and the new label;
wherein the first segmentation model and the second segmentation model are implemented by a neural network, and
the image processing device further comprises:
a first calculation unit that calculates, using the first segmentation model, a first data set of a first parameter associated with the base class objects in the image;
a second calculation unit that calculates, using the second segmentation model, a second data set of the first parameter associated with the base class objects in the image; and
a comparison unit that compares the first data set and the second data set and provides comparison information to the second calculation unit so as to modify the second segmentation model.
(Appendix 2)
The image processing device according to Appendix 1, wherein the second calculation unit modifies the second segmentation model so as to minimize the difference between the first data set and the second data set.
(Appendix 3)
The image processing device according to Appendix 1, wherein
the second calculation unit calculates, using the second segmentation model, a degree of overlap between each of a plurality of anchor boxes in the second segmentation model and an object in the image,
the first calculation unit selects, based on the degree of overlap, the first data set from a first parameter data set corresponding to the plurality of anchor boxes obtained by the first segmentation model, and
the second calculation unit selects, based on the degree of overlap, the second data set from a first parameter data set corresponding to the plurality of anchor boxes obtained by the second segmentation model.
(Appendix 4)
The image processing device according to Appendix 3, wherein
the second calculation unit selects anchor boxes whose degree of overlap is greater than a predetermined threshold and whose overlapped object belongs to a base class, and
the first calculation unit and the second calculation unit select the first data set and the second data set, respectively, based on the indices of the selected anchor boxes.
(Appendix 5)
The image processing device according to any one of Appendices 1 to 4, wherein
the first calculation unit calculates, using the first segmentation model, a third data set of a second parameter associated with the base class objects in the image,
the second calculation unit calculates, using the second segmentation model, a fourth data set of the second parameter associated with the base class objects in the image, and
the comparison unit compares the third data set and the fourth data set and provides further comparison information to the second calculation unit so as to further modify the second segmentation model.
(Appendix 6)
The image processing device according to Appendix 5, wherein the second calculation unit modifies the second segmentation model so as to minimize the difference between the third data set and the fourth data set.
(Appendix 7)
The image processing device according to Appendix 5, wherein the first parameter is the anchor box offset and the second parameter is the classification confidence.
(Appendix 8)
The image processing device according to Appendix 7, wherein the first data set and the second data set are compared using an MSE loss function, and the third data set and the fourth data set are compared using a KLDiv loss function.
(Appendix 9)
The image processing device according to Appendix 1, wherein the pseudo-labels and the new labels include classification identifiers, bounding boxes and masks of the corresponding objects in the image.
(Appendix 10)
An image processing method comprising:
obtaining pseudo-labels associated with object segmentation of base class objects in an image by a first segmentation model, wherein the first segmentation model is used for segmentation of base class objects;
obtaining new labels associated with object segmentations of objects of a new class different from the base class in the image;
obtaining a second segmentation model for segmentation of objects of the base class and the new class based on the pseudo-label and the new label;
wherein the first segmentation model and the second segmentation model are implemented by a neural network, and
the image processing method further comprises:
calculating, using the first segmentation model, a first data set of a first parameter associated with the base class objects in the image;
calculating, using the second segmentation model, a second data set of the first parameter associated with the base class objects in the image; and
comparing the first data set and the second data set, and modifying the second segmentation model.
(Appendix 11)
The image processing method according to Appendix 10, wherein the second segmentation model is modified so as to minimize the difference between the first data set and the second data set.
(Appendix 12)
The image processing method according to Appendix 10, further comprising calculating, using the second segmentation model, a degree of overlap between each of a plurality of anchor boxes in the second segmentation model and an object in the image, wherein
calculating the first data set includes selecting, based on the degree of overlap, the first data set from a first parameter data set corresponding to the plurality of anchor boxes obtained by the first segmentation model, and
calculating the second data set includes selecting, based on the degree of overlap, the second data set from a first parameter data set corresponding to the plurality of anchor boxes obtained by the second segmentation model.
(Appendix 13)
The image processing method according to Appendix 12, wherein
the second segmentation model selects anchor boxes whose degree of overlap is greater than a predetermined threshold and whose overlapped object belongs to a base class, and
the first segmentation model and the second segmentation model select the first data set and the second data set, respectively, based on the indices of the selected anchor boxes.
(Appendix 14)
The image processing method according to any one of Appendices 10 to 13, further comprising:
calculating, using the first segmentation model, a third data set of a second parameter associated with the base class objects in the input image;
calculating, using the second segmentation model, a fourth data set of the second parameter associated with the base class objects in the input image; and
comparing the third data set and the fourth data set, and further modifying the second segmentation model.
(Appendix 15)
The image processing method according to Appendix 14, wherein the second segmentation model is modified so as to minimize the difference between the third data set and the fourth data set.
(Appendix 16)
The image processing method according to Appendix 14, wherein the first parameter is the anchor box offset and the second parameter is the classification confidence.
(Appendix 17)
The image processing method according to Appendix 16, wherein the first data set and the second data set are compared using an MSE loss function, and the third data set and the fourth data set are compared using a KLDiv loss function.
(Appendix 18)
The image processing method according to Appendix 10, wherein the pseudo-labels and the new labels include classification identifiers, bounding boxes, and masks of the corresponding objects in the image.
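As a non-limiting illustration of the label contents listed in Appendix 18, the following Python sketch represents one label as a record holding a classification identifier, a bounding box, and a mask. The record type `InstanceLabel`, the `is_pseudo` flag, the box and mask encodings, and the simple concatenation in `combine_labels` are assumptions made for illustration only and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class InstanceLabel:
    """One object annotation (hypothetical layout)."""
    class_id: int                             # classification identifier
    bbox: Tuple[float, float, float, float]   # assumed (x1, y1, x2, y2)
    mask: Tuple[Tuple[int, ...], ...]         # assumed binary mask, one tuple per row
    is_pseudo: bool                           # True if predicted by the first model


def combine_labels(
    pseudo_labels: Sequence[InstanceLabel],
    new_labels: Sequence[InstanceLabel],
) -> List[InstanceLabel]:
    """Merge base-class pseudo-labels and new-class labels for one image."""
    return list(pseudo_labels) + list(new_labels)


# Example: one base-class pseudo-label and one new-class label for the same image.
pseudo = InstanceLabel(class_id=0, bbox=(0.0, 0.0, 2.0, 2.0), mask=((1, 1), (1, 1)), is_pseudo=True)
new = InstanceLabel(class_id=5, bbox=(3.0, 3.0, 4.0, 4.0), mask=((1,),), is_pseudo=False)
training_labels = combine_labels([pseudo], [new])
```

Under this assumption, the combined list would serve as the annotation for training the second segmentation model on both the base class and the new class.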
(Appendix 19)
The image processing method according to Appendix 10, wherein the first segmentation model and the second segmentation model are YOLACT++ instance segmentation models.
(Appendix 20)
A machine-readable storage medium on which a program product storing machine-readable instruction code is recorded, wherein, when the instruction code is read and executed by a computer, the computer is capable of executing the image processing method according to any one of Appendices 10 to 19.

Claims

1. An image processing apparatus comprising:
a first acquisition unit that acquires, by a first segmentation model, pseudo-labels associated with object segmentation of objects of a base class in an image, the first segmentation model being used for segmentation of objects of the base class;
a second acquisition unit that acquires new labels associated with object segmentation of objects of a new class, different from the base class, in the image; and
a processing unit that obtains, based on the pseudo-labels and the new labels, a second segmentation model for segmentation of objects of the base class and the new class,
wherein the first segmentation model and the second segmentation model are implemented by neural networks, and
the image processing apparatus further comprises:
a first calculation unit that calculates, by the first segmentation model, a first data set of a first parameter associated with objects of the base class in the image;
a second calculation unit that calculates, by the second segmentation model, a second data set of the first parameter associated with objects of the base class in the image; and
a comparison unit that compares the first data set and the second data set and provides comparison information to the second calculation unit so as to modify the second segmentation model.

2. The image processing apparatus according to claim 1, wherein said second calculation unit modifies said second segmentation model so as to minimize a difference between said first data set and said second data set.

3. The image processing apparatus according to claim 1, wherein
the second calculation unit calculates, using the second segmentation model, a degree of overlap between each of a plurality of anchor boxes in the second segmentation model and a target in the image,
the first calculation unit selects, based on the degree of overlap, the first data set from the first-parameter data sets corresponding to the plurality of anchor boxes obtained by the first segmentation model, and
the second calculation unit selects, based on the degree of overlap, the second data set from the first-parameter data sets corresponding to the plurality of anchor boxes obtained by the second segmentation model.

4. The image processing apparatus according to claim 3, wherein
the second calculation unit selects anchor boxes whose degree of overlap is greater than a predetermined threshold and whose overlapping target belongs to the base class, and
the first calculation unit and the second calculation unit select the first data set and the second data set, respectively, based on the indices of the selected anchor boxes.

5. The image processing apparatus according to claim 1, wherein
the first calculation unit calculates, by the first segmentation model, a third data set of a second parameter associated with objects of the base class in the image,
the second calculation unit calculates, by the second segmentation model, a fourth data set of the second parameter associated with objects of the base class in the image, and
the comparison unit compares the third data set and the fourth data set and provides further comparison information to the second calculation unit so as to further modify the second segmentation model.

6. The image processing apparatus according to claim 5, wherein said second calculation unit modifies said second segmentation model so as to minimize a difference between said third data set and said fourth data set.

7. The image processing apparatus according to claim 5, wherein
the first parameter is an offset of an anchor box, and
the second parameter is a classification confidence.

8. The image processing apparatus according to claim 7, wherein
an MSE loss function is used to compare the first data set and the second data set, and
a KLDiv loss function is used to compare the third data set and the fourth data set.

9. An image processing method comprising:
acquiring, by a first segmentation model, pseudo-labels associated with object segmentation of objects of a base class in an image, the first segmentation model being used for segmentation of objects of the base class;
acquiring new labels associated with object segmentation of objects of a new class, different from the base class, in the image; and
obtaining, based on the pseudo-labels and the new labels, a second segmentation model for segmentation of objects of the base class and the new class,
wherein the first segmentation model and the second segmentation model are implemented by neural networks, and
the image processing method further comprises:
calculating, by the first segmentation model, a first data set of a first parameter associated with objects of the base class in the image;
calculating, by the second segmentation model, a second data set of the first parameter associated with objects of the base class in the image; and
comparing the first data set and the second data set to modify the second segmentation model.

10. A machine-readable storage medium on which a program product storing machine-readable instruction code is recorded, wherein, when the instruction code is read and executed by a computer, the computer is capable of executing the image processing method of