KR102412337B1

KR102412337B1 - Dynamic learning apparatus and method for the classification of human epithelial cell image

Info

Publication number: KR102412337B1
Application number: KR1020190157602A
Authority: KR
Inventors: 권기룡; 부누누 칼렙
Original assignee: 부경대학교 산학협력단
Priority date: 2019-11-29
Filing date: 2019-11-29
Publication date: 2022-06-23
Also published as: KR20210067709A

Abstract

The present invention relates to a dynamic learning apparatus and method for classifying human epithelial cell images. More specifically, the method comprises: performing a discrete wavelet transform when an image is input; normalizing the discrete wavelet transform coefficients so that they can be fed to the network; using four different CNNs in parallel, each taking a different normalized wavelet coefficient as input, to form a final residual building block; and passing the result through global average pooling and a classification layer.

Description

DYNAMIC LEARNING APPARATUS AND METHOD FOR THE CLASSIFICATION OF HUMAN EPITHELIAL CELL IMAGE

The present invention relates to a dynamic learning apparatus and method for classifying human epithelial cell images, and more particularly to an apparatus and method that minimize intra-class discrepancies by encouraging a degree of homogenization in the intensity level and shape of human epithelial cells.

Computer-aided diagnosis (CAD) systems are technologies that bring automation into the disease diagnosis process. Since the development of various machine learning methods over the past few decades, these systems have attracted great interest because they can perform automated tasks far more reliably. One of the hardest tasks for a CAD system is to fully analyze and understand images of biological organisms. For autoimmune diseases, indirect immunofluorescence (IIF) on human epithelial type 2 (HEp-2) cell patterns is the most recommended diagnostic method. However, manual analysis of IIF images is an arduous and time-consuming task. In addition, the complexity of the images leaves considerable room for the pathologist's subjectivity, which can lead to discrepancies in diagnostic results. This is why CAD systems have attracted significant interest as an aid to pathologists, mainly for the automatic classification of the different HEp-2 cell types.

Various methods have been discussed in the literature and presented at the successive editions of the HEp-2 cell classification contest held by the International Conference on Pattern Recognition (ICPR). As a classical pattern recognition task, HEp-2 cell classification consists of a feature extraction or selection step followed by a classification step. Feature extraction remains the most important part of the procedure, because it extracts the information that helps identify the different cell types accurately. Broadly, the literature divides into two groups of methods: conventional machine learning based methods and deep learning based methods.

One earlier work adopted a codebook generated from descriptors. Nosaka et al. used local binary patterns (LBP) as features and fed them to a linear support vector machine (SVM) in the classification step. Hwang et al. combined textural and statistical features in a hybrid manner and fed them to a self-organizing map for classification.

Other statistical features, such as the gray-level size zone matrix, have been used with a nearest-neighbor classifier for the discrimination step. Wiliem et al. fed the same statistical features to an SVM. Xu et al. used a linear local distance coding method to extract features that served as inputs to a linear SVM.

Researchers in this field have also used hybrid feature learning methods. Cataldo et al. proposed combining various features: morphological features, global texture descriptors such as rotation-invariant Gabor features, and several kinds of LBP descriptors, including rotation-invariant uniform LBP, co-occurrence of adjacent LBPs, completed LBP, and rotation-invariant co-occurrence of adjacent LBPs. In another hybrid feature extraction method, Theodorakopoulos et al. proposed combining LBP and SIFT descriptors. The performance of the methods of Foggia et al. can be seen to depend entirely on the discriminative power of the extracted features, which leaves a significant role to the user's subjectivity.

Since the advent of deep learning, automatic feature learning methods have been widely adopted. They have shown excellent results on object recognition problems, and many researchers have adopted them as the main tool for HEp-2 cell classification. Unlike conventional methods, whose accuracy depends on a subjective choice of features, deep learning methods such as the now widely used convolutional neural network (CNN) offer an automatic feature learning process. Indeed, many studies have demonstrated that deep learning based features outperform handcrafted ones for HEp-2 cell classification.

The first work to apply a CNN to HEp-2 cell classification was presented by Foggia et al. at the 2012 ICPR HEp-2 cell classification contest. Although the results were excellent, the data sets available at the time were not heterogeneous, and much room for improvement remained. Since then, the available data sets have become considerably more diverse, and newly proposed CNN models keep pushing the limits of classification accuracy. Gao et al. presented a simple CNN architecture that was tested on different data sets. They were the first to test data augmentation techniques, such as rotation by different angles, on HEp-2 cell images. Li et al. adopted the Deep Residual Inception (DRI) model, which combines ResNet, among the most widely used CNN models, with one of the "Inception" modules of GoogLeNet. Phan et al. performed transfer learning, which consists of reusing an already trained network on a new data set, with a model trained on the ImageNet data set.

Lei et al. proposed a more elaborate transfer learning method. They took several architectures of the pre-trained ResNet model and combined them into what they called a cross-modal transfer learning approach. The results obtained with this method are among the current state of the art for HEp-2 cell classification. Another state-of-the-art result was reported by Shen et al. The authors again used a ResNet approach, but with a deeper residual module called the Deep Cross Residual (DCR) module and with extensive data augmentation.

HEp-2 cell image data sets generally exhibit significant heterogeneity. This is explained by intra-class variation caused by the presence of different intensity levels.

FIGS. 1A to 1D show example cellular images from one of the data sets.

The two images in FIGS. 1A and 1B show the fine speckled cell type. The first image comes from the positive intensity level and the second from the negative intensity level. The intra-class variation is clearly visible in these two images, which are supposed to belong to the same class. One can imagine the difficulty faced by a classifier that must assign both images to the same class.

The heterogeneity discussed here becomes even clearer in the images shown in FIGS. 1C and 1D. As in the previous case, both images belong to the same cell type, here the homogeneous type, and the intra-class variation is again noticeable. The difficulty becomes even more apparent when the two negative intensity images in FIGS. 1B and 1D are compared. The two images are similar in that the cell shape is barely perceptible and the illumination is weak. They are not easy to tell apart, yet they must be assigned to two different classes.

This marked heterogeneity can cause particular problems for discrimination performance. Some current state-of-the-art methods still struggle with it, especially when the data set shows strong variation. The method proposed here specifically addresses this heterogeneity, and two data sets with strong intra-class variation were selected. Some methods, such as the one proposed by Nigam et al., try to solve the problem by performing a preliminary intensity-based separation before the cell type classification itself. Although this can give reasonably good final results, using two stages of feature extraction followed by classification clearly lengthens the overall processing time.

Therefore, a technique that can minimize this heterogeneity in a single step is needed.

Accordingly, the present invention has been proposed to solve the problems described above, and its object is to provide a dynamic learning apparatus and method for human epithelial cell image classification that minimize intra-class discrepancies by encouraging a degree of homogenization in the intensity level and shape of human epithelial cells.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention belongs.

To achieve the above object, a dynamic learning method for human epithelial cell image classification according to the present invention comprises: performing a discrete wavelet transform when an image is input; normalizing the discrete wavelet transform coefficients so that they can be fed to the network; using four different CNNs in parallel, each taking a different normalized wavelet coefficient as input, to form a final residual building block; and passing the result through global average pooling and a classification layer.

Likewise, to achieve the above object, a dynamic learning apparatus for human epithelial cell image classification according to the present invention comprises: a transform unit that performs a discrete wavelet transform when an image is input; a normalization unit that normalizes the discrete wavelet transform coefficients so that they can be fed to the network; a block forming unit in which four different CNNs are used in parallel, each taking a different normalized wavelet coefficient as input, to form a final residual building block; and a classification unit that passes the result through global average pooling and a classification layer.

According to the present invention, intra-class discrepancies are minimized by encouraging a degree of homogenization in the intensity level and shape of human epithelial cells.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present invention belongs.

FIG. 1 is a diagram showing example cellular images from one of the data sets;
FIG. 2 is a diagram schematically showing the dynamic learning steps for human epithelial cell image classification according to an embodiment of the present invention;
FIG. 3 is a diagram showing the wavelet coefficients of a positive intensity cellular image from a data set according to an embodiment of the present invention;
FIG. 4 is a diagram showing the wavelet coefficients of a negative intensity cellular image from a data set according to an embodiment of the present invention;
FIG. 5 is a diagram showing the structure of the four different networks according to an embodiment of the present invention;
FIG. 6 is a diagram detailing the structure of the two residual building blocks of FIG. 5 according to an embodiment of the present invention;
FIG. 7 is a diagram showing the structure of the summation layer according to an embodiment of the present invention;
FIG. 8 is a diagram schematically showing a dynamic learning method according to an embodiment of the present invention;
FIG. 9 is a block diagram showing a dynamic learning apparatus according to an embodiment of the present invention.

The objects and effects of the present invention, and the technical configurations for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. In describing the present invention, detailed descriptions of well-known functions or configurations are omitted where they could unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of a user or operator.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only to make the disclosure complete and to fully convey the scope of the invention to those of ordinary skill in the art to which it belongs; the present invention is defined only by the scope of the claims. The definitions of the terms should therefore be based on the content of this specification as a whole.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "indirectly connected" with another member in between. In addition, when a part is said to "include" a component, this means that it may further include other components, not that other components are excluded, unless specifically stated otherwise.

The present invention proposes a dynamic learning process in which different networks simultaneously take different transformations of the input image. The discrete wavelet transform (DWT) is used to generate these transformations: a two-dimensional DWT is applied to each image, and the DWT coefficients are normalized and represented as images so that they can be fed to the networks. Four different CNNs are used in parallel, each taking a different normalized wavelet coefficient as input. Error propagation is performed in parallel, and at a certain point the high-level features from the feature maps of the four CNNs are merged, to mix the different information learned by the four networks, and passed through a nonlinearity. The output of the merging layer is then used to form the final residual building block. Finally, the result passes through a global average pooling layer and a fully connected layer that performs the final classification.

The different wavelet coefficients are expected to expose and significantly reinforce local changes in intensity, homogenizing both the intensity and the shape of most images of a given cell type, especially for cell types that show strong differences in illumination level. The four different CNNs then dynamically learn these exposed properties. The method was tested on the publicly available SNPHEp-2 and ICPR data sets, and the results show that the strong disparities in the data sets were indeed minimized, which improved the final discrimination results.

FIG. 2 is a diagram schematically showing the dynamic learning steps for human epithelial cell image classification according to an embodiment of the present invention.

In FIG. 2, AC, HDC, VDC, and DDC denote the approximation coefficients, horizontal detail coefficients, vertical detail coefficients, and diagonal detail coefficients, respectively.

The operation of each step shown in FIG. 2 is described in detail below with reference to FIGS. 3 to 7.

When an image is input, a two-dimensional discrete wavelet transform (DWT) decomposition is applied to the input image in step 201. The first-level two-dimensional decomposition produces four different outputs. After the decomposition into high and low frequency bands, the approximation coefficients contain the low-frequency information, while the high-frequency information is found mainly in the three remaining coefficients: horizontal, vertical, and diagonal detail. These four coefficient matrices are represented as images and then used as the inputs of four different CNNs operating in parallel.
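The decomposition in step 201 can be illustrated with a minimal sketch. The source does not name the wavelet family or the normalization, so the Haar wavelet and the min-max rescaling to [0, 1] used here are assumptions, as are the helper names `haar_dwt2` and `min_max_normalize`. Plain Python lists are used only to keep the example self-contained.

```python
def haar_dwt2(image):
    """Single-level 2D Haar DWT of a list-of-lists image with even height and width.

    Returns four matrices (approximation, horizontal, vertical, and diagonal
    detail), each half the input size in both dimensions.
    Assumption: the Haar wavelet; the source does not specify the wavelet.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    ac, hdc, vdc, ddc = [], [], [], []
    for i in range(0, rows, 2):
        ac_row, hdc_row, vdc_row, ddc_row = [], [], [], []
        for j in range(0, cols, 2):
            # The four pixels of one non-overlapping 2x2 block.
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            ac_row.append((a + b + c + d) / 2.0)   # low-pass in both directions
            hdc_row.append((a + b - c - d) / 2.0)  # change between rows (horizontal edges)
            vdc_row.append((a - b + c - d) / 2.0)  # change between columns (vertical edges)
            ddc_row.append((a - b - c + d) / 2.0)  # diagonal change
        ac.append(ac_row)
        hdc.append(hdc_row)
        vdc.append(vdc_row)
        ddc.append(ddc_row)
    return ac, hdc, vdc, ddc


def min_max_normalize(matrix):
    """Rescale a coefficient matrix to [0, 1] so it can be treated as an image.

    Assumption: min-max scaling; the source only says the coefficients are normalized.
    """
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


# Usage: a synthetic 112x112 gradient image yields four 56x56 network inputs.
image = [[float(r + c) for c in range(112)] for r in range(112)]
inputs = [min_max_normalize(m) for m in haar_dwt2(image)]
```

With the division by 2 in each band, this Haar transform is orthonormal, so the total energy of the four outputs equals that of the input image.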

FIG. 3 is a diagram showing the wavelet coefficients of a positive intensity cellular image from a data set according to an embodiment of the present invention.

FIG. 3 shows a positive intensity fine speckled cell and the four outputs of its DWT decomposition. Panel (a) is the original, strongly illuminated image, and panel (b) shows the approximation coefficients, that is, the result of the low-pass filtering involved in the DWT decomposition. This low-pass filtering enhances regions that show no significant gray-level variation. The pixel intensity values in the resulting image (b) are also slightly enhanced: comparing the original image in (a) with the approximation coefficients in (b) makes the slight enhancement of overall illumination clear.

The images in panels (c), (d), and (e) of FIG. 3 show the detail coefficients. These coefficients represent the high-frequency components of the original image, and all three of them in fact reveal the shape of the cells.

FIG. 4 is a diagram showing the wavelet coefficients of a negative intensity cellular image from a data set according to an embodiment of the present invention.

FIG. 4 shows a negative intensity fine speckled cell and the four outputs of its DWT decomposition. Panel (a) is the original image, which is very weakly illuminated; the negative intensity image gives no significant cue about cell shape. Panel (b) shows the approximation coefficients, in which the intensity level is noticeably enhanced. Comparing FIG. 3(b) with FIG. 4(b) clearly reveals the homogenization in intensity between images taken under two different fluorescence illuminations. Although the original images in FIG. 3(a) and FIG. 4(a) differ greatly in global illumination, their approximation coefficients in FIG. 3(b) and FIG. 4(b) show a degree of homogenization in illumination.

Panels (c), (d), and (e) of FIG. 4 show the three detail coefficients: horizontal, vertical, and diagonal, respectively. Although the original image in FIG. 4(a) gives no noticeable information about cell shape, the detail coefficients expose the geometric cues of the cells. Comparing the detail coefficients of the positive illumination image in FIG. 3(c), (d), (e) with those of the negative illumination image in FIG. 4(c), (d), (e) shows a homogenization in how perceptible the geometric cues are, which gives a comprehensive view of cell shape. Since the geometric pattern of the cells is important for identification, this enhancement is valuable. The three detail coefficients also act like gradients in different directions.

In summary, the approximation and detail coefficients minimize the difference between the two kinds of illumination (intensity levels) and promote a degree of homogenization: in intensity through the approximation coefficients, and in geometric pattern through the three detail coefficients, which reveal cell shape. These four coefficients are used as inputs to four different networks, which simultaneously learn to dynamically capture all the information needed to distinguish the different cell types.

Next, in step 203, the four coefficients are fed to separate networks; CNNs are used for this purpose, in a parallel arrangement. As FIG. 2 shows, the first part of the proposed scheme has four networks that extract features from the different coefficients. After this stage, the features learned by the networks are merged to mix the information they contain and to produce the final high-level feature vector. This vector is obtained after a global average pooling step and is passed to a fully connected layer that performs the final classification. The four CNNs share the same structure, shown in FIG. 5.
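The merge and classification stage described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the merge is taken to be an element-wise sum (the description of FIG. 6 names a summation layer, SumL), the final residual building block is omitted, and the helper names, toy volumes, and classifier weights are all hypothetical.

```python
import math


def sum_volumes(volumes):
    """Element-wise sum of equally shaped volumes indexed [channel][row][col]."""
    first = volumes[0]
    return [[[sum(v[k][i][j] for v in volumes)
              for j in range(len(first[k][i]))]
             for i in range(len(first[k]))]
            for k in range(len(first))]


def relu_volume(volume):
    """Apply the ReLU nonlinearity to every element of a volume."""
    return [[[max(0.0, v) for v in row] for row in fmap] for fmap in volume]


def global_average_pool(volume):
    """Collapse each feature map to its mean, giving one value per channel."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in volume]


def dense_softmax(features, weights, biases):
    """Fully connected layer followed by softmax; weights is [class][feature]."""
    logits = [sum(w * f for w, f in zip(w_row, features)) + b
              for w_row, b in zip(weights, biases)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: four branch outputs, each with 2 channels of size 2x2.
branch_output = [[[1.0, -2.0], [3.0, 0.0]],
                 [[0.5, 0.5], [0.5, 0.5]]]
volumes = [branch_output for _ in range(4)]

merged = relu_volume(sum_volumes(volumes))
pooled = global_average_pool(merged)

# Hypothetical classifier weights for three classes.
weights = [[0.1, 0.0], [0.0, 0.1], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]
probs = dense_softmax(pooled, weights, biases)
```

Global average pooling reduces each feature map to a single number, so the fully connected layer needs only one weight per channel and class, regardless of the spatial size of the merged volume.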

FIG. 5 is a diagram showing the structure of the four different networks according to an embodiment of the present invention.

The first-level DWT decomposition halves the size of the original image. Since the original cellular images are 112x112, each network receives a 56x56 input, as shown in FIG. 5. A 3x3 receptive field is used for all convolution operations. Convolutions are performed so that the output volume keeps the spatial extent of the input volume, and down-sampling is done only by the pooling layers. This avoids losing spatial information during the feature extraction performed by the convolutions. In FIG. 5, "Conv 3x3, m" denotes a convolution with a 3x3 filter size and m different filters; m is set to 32.

In FIG. 5, the input coefficient is first given to a convolution layer with the characteristics described in the previous paragraph, followed by a pooling layer. This pooling layer uses a stride and padding of 2 to down-sample the volume from the convolution layer by half. The output volume at this point therefore has a spatial extent of 28x28 and contains 32 feature maps. A residual building block is then built on the output of the previous layer. The output of the first residual building block (residual building block 1 in FIG. 5) has the same characteristics as its input volume, so it still contains 32 feature maps of size 28x28. A second pooling operation is performed just before the second residual building block. The output volume of residual building block 2 has 64 feature maps (2m in FIG. 5) and a spatial extent of 14x14.
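The volume sizes quoted above can be checked with a short shape trace. This is only bookkeeping, not a network: the helper names are hypothetical, a single-channel input is assumed, and the way residual building block 2 reaches 2m feature maps is modelled purely as a change in channel count.

```python
def conv_same(shape, filters):
    """3x3 convolution that keeps the spatial size and outputs `filters` maps."""
    _, height, width = shape
    return (filters, height, width)


def pool_half(shape):
    """Pooling that halves height and width and keeps the channel count."""
    channels, height, width = shape
    return (channels, height // 2, width // 2)


def trace_branch(image_size=112, m=32):
    """Return the (channels, height, width) volume after each stage of one branch."""
    shapes = {}
    # One DWT coefficient map, half the image size (assumed single channel).
    shape = (1, image_size // 2, image_size // 2)
    shapes["input"] = shape
    shape = conv_same(shape, m)
    shapes["conv"] = shape
    shape = pool_half(shape)
    shapes["pool1"] = shape
    # Residual building block 1 keeps the characteristics of its input volume.
    shapes["res1"] = shape
    shape = pool_half(shape)
    shapes["pool2"] = shape
    # Residual building block 2 outputs 2m feature maps.
    shape = conv_same(shape, 2 * m)
    shapes["res2"] = shape
    return shapes


for name, shape in trace_branch().items():
    print(name, shape)
```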

Each residual building block consists of two convolutional layers, two Rectified Linear Unit (ReLU) activation layers, and two Batch Normalization (BN) layers. For a given variable x, the ReLU activation function is defined as in Equation 1 below.

<Equation 1>
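The ReLU function has a standard definition, and Equation 1 is taken here to follow it:

```latex
f(x) = \max(0, x)
```

Negative inputs are mapped to zero and non-negative inputs pass through unchanged.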

BN is a normalization method used between the layers of a CNN to address the internal covariate shift problem. In practice, the distribution of the feature maps in the different layers changes continuously during training. The BN layer aims to learn parameters that minimize these distribution changes during training.
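As a rough illustration of the normalization that BN applies during training, the sketch below uses the standard batch-normalization formula on a single list of activations. The parameter names `gamma` and `beta` (the learned scale and shift) and the value of `eps` are assumptions made for the example and are not taken from the embodiment.

```python
# Sketch only: the standard batch-normalization formula on one batch of
# activations (training-time statistics, a single feature map).
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize to zero mean and unit variance, then scale and shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(activations)
print(normalized)
```

With `gamma = 1` and `beta = 0` the output has approximately zero mean and unit variance. During training, the network adjusts `gamma` and `beta` along with the other parameters.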

FIG. 6 details the structure of the residual building blocks of FIG. 5 according to an embodiment of the present invention: (a) shows the structure of residual building blocks 1 and 2 in the four different networks, and (b) shows the structure of the final residual building block after the summation layer (SumL).

Here, the identity shortcut is defined as in Equation 2 below.

<Equation 2>
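Using the symbols defined in the explanation that follows (f(x, W) for the output of the final convolution, x for the input volume, and y for the block output), the identity shortcut takes the usual residual form:

```latex
y = f(x, W) + x
```

The addition is element-wise, so f(x, W) and x must have the same shape.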

Here, f(x, W) denotes the output of the final convolution operation, x the input volume, W the parameters to be learned during training, and y the final output volume of the residual building block. FIG. 6(b), in turn, shows the identity shortcut when it is applied after the feature maps from the four different networks have been summed. In this case it is defined as in Equation 3 below, where the term Sum( ) denotes the sum of the four volumes from the four different networks.

<Equation 3>
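A plausible form of Equation 3, obtained by replacing the input x of the identity shortcut with the summed volume, is given below. The argument of Sum( ) is not legible in this text, so the symbols x_1 to x_4 for the output volumes of the four networks are an assumption:

```latex
y = f\bigl(\mathrm{Sum}(x_1, x_2, x_3, x_4),\, W\bigr) + \mathrm{Sum}(x_1, x_2, x_3, x_4)
```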

FIG. 7 is a diagram showing the structure of the summation layer according to an embodiment of the present invention.

FIG. 7 shows what is called the summation layer. After the second residual building block of the four networks, every final volume from the networks has 64 feature maps of size 14x14, as discussed above and shown in FIG. 5. The first operation of the summation layer is to concatenate the feature maps from the different networks, so that the useful patterns learned by the different networks are gathered into a single volume. This single volume contains 256 feature maps. To encapsulate all of this information and to reduce the number of parameters in the following layers, a convolution with a 1x1 filter size is applied to the volume resulting from the concatenation. This operation is similar to a linear combination of the features learned by the different networks. The 1x1 convolution reduces the depth by combining features but preserves the spatial extent of the input volume, so no spatial information is lost. The final volume after this summation process has 64 feature maps of size 14x14, as shown in FIG. 7.
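The two operations of the summation layer can be sketched in pure Python as follows. The sketch assumes a channels-first layout (a volume is a list of feature maps) and omits any bias term. The function names are hypothetical, and the uniform weights are placeholders where a trained network would use learned values.

```python
# Sketch only: concatenation of four 64x14x14 volumes followed by a 1x1
# convolution that reduces the depth from 256 back to 64.

def concat_channels(volumes):
    """Stack the feature maps of several volumes along the channel axis."""
    return [fmap for volume in volumes for fmap in volume]


def conv1x1(volume, weights):
    """1x1 convolution: every output map is a per-pixel weighted sum of the
    input maps, with weights[o][i] linking input map i to output map o."""
    height, width = len(volume[0]), len(volume[0][0])
    return [
        [[sum(w[i] * volume[i][r][c] for i in range(len(volume)))
          for c in range(width)]
         for r in range(height)]
        for w in weights
    ]


# Four network outputs; network k is filled with the constant k + 1.
volumes = [[[[float(k + 1)] * 14 for _ in range(14)] for _ in range(64)]
           for k in range(4)]
merged = concat_channels(volumes)                 # 256 feature maps of 14x14
weights = [[1.0 / 256] * 256 for _ in range(64)]  # placeholder: plain averaging
reduced = conv1x1(merged, weights)                # 64 feature maps of 14x14
print(len(merged), len(reduced), len(reduced[0]), len(reduced[0][0]))
```

Because the filter is 1x1, each output pixel depends only on the 256 values at the same position, which is why the 14x14 spatial extent is unchanged.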

This final volume is used as the input of the final building block shown in FIG. 6(b) and Equation 3. As FIG. 6(b) shows, this residual block contains two convolution operations, each using 128 different filters. Because the convolutions preserve the spatial extent, the output volume of residual building block 3 has the shape 14x14x128. After the BN and ReLU activation layers, this volume is passed to global average pooling. After global average pooling, the final high-level feature vector has the shape 1x1x128, that is, a one-dimensional vector of 128 elements.
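Global average pooling reduces each feature map to its mean value, which is how a 14x14x128 volume becomes a vector of 128 elements. A minimal sketch, assuming a channels-first list layout and a hypothetical function name:

```python
# Sketch only: global average pooling over a 128x14x14 volume.

def global_average_pooling(volume):
    """Replace every feature map by the mean of its values."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in volume
    ]


# Feature map k is filled with the constant k, so its mean is simply k.
volume = [[[float(k)] * 14 for _ in range(14)] for k in range(128)]
features = global_average_pooling(volume)
print(len(features), features[:3])
```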

The high-level feature vector is fed to a fully connected layer that performs the classification. This layer uses the softmax function, defined as in Equation 4 below.

<Equation 4>
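The softmax function has a standard form. With z_j as the j-th element of the input vector and N as the number of classes, as in the explanation that follows, it can be written as below. The notation σ(z)_j for the output is an assumption, since the original output symbol is not legible in this text:

```latex
\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{N} e^{z_k}}, \qquad j = 1, \dots, N
```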

Here, N is the number of classes, which is also the length of the input feature vector of the softmax function. The value z_j is the j-th element of that input vector, that is, the feature associated with the j-th class. With j varying from 1 to the total number of classes N, the resulting values represent the output of the softmax function, which can be interpreted as the normalized probability of each class given the data. The network backpropagates the error and learns using the cross-entropy error function defined in Equation 5 below.

<Equation 5>
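The cross-entropy error for a single sample has a standard form. It is written here with the symbols of the explanation that follows, in which y_j is the true label and z_j is the softmax output for class j. The symbol E for the error is an assumption:

```latex
E = -\sum_{j=1}^{N} y_j \log z_j
```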

Here, the value y_j represents the true label of the given data over the N classes, while the value z_j represents an element of the output vector z of the softmax function computed with Equation 4.
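The classification step and its error can be illustrated with a short pure-Python sketch of softmax followed by cross-entropy. The six scores and the one-hot label are arbitrary example values, and subtracting the maximum score is a common numerical-stability measure that the text does not describe.

```python
# Sketch only: softmax over class scores and cross-entropy against a
# one-hot label (all values are made up for illustration).
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)                       # improves numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)


scores = [2.0, 1.0, 0.1, 0.0, -1.0, 0.5]      # output of the fully connected layer
probs = softmax(scores)
label = [1, 0, 0, 0, 0, 0]                    # the true class is the first one
loss = cross_entropy(label, probs)
print(probs, loss)
```

With a one-hot label the error reduces to the negative log-probability assigned to the true class, so it decreases as that probability approaches 1.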

FIG. 8 is a diagram schematically showing the dynamic learning method according to an embodiment of the present invention. The four networks are identical before the summation layer, so to avoid redundancy the sizes of the volumes (the layer outputs) are given only for the first network; the same sizes apply to the other three networks. As mentioned above, all inputs also have the same size.

The evaluation starts with the results obtained using a single network that takes the original image as input. This network has the same structure as the proposed method, except that it has no summation layer because there is only one network.

Data augmentation was applied to the dataset to improve the learning ability of the network. In every split of the training data, the cells were rotated through 360º in steps of 18º, similar to the experiments of Bayramoglu et al. The original training set was thereby expanded by a factor of 20, since a full 360º turn contains 20 steps of 18º. Augmenting the training set does improve accuracy on the test set.
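The factor of 20 follows directly from the rotation step. The small sketch below lists the rotation angles, assuming that the unrotated image (0º) counts as one of the 20 copies; the function name is hypothetical and the image rotation itself is not shown.

```python
# Sketch only: rotation angles for the augmentation described above.

def rotation_angles(step_degrees=18):
    """Angles covering one full turn in equal steps, starting at 0 degrees."""
    return list(range(0, 360, step_degrees))


angles = rotation_angles()
print(len(angles), angles)
```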

Accordingly, in the present invention, in order to efficiently homogenize the features extracted from images with approximate and different intensity levels, the wavelet coefficients are fed to four different deep networks in a parallel learning paradigm, and the feature maps from those networks are concatenated and passed to the classification layer to determine the final type of the cell image.

By carrying out the dynamic learning method, which minimizes intra-class variation by encouraging a specific homogenization with respect to the intensity level and shape of the cells, this problem is reduced and the discriminative results are improved.

Preferred embodiments of the present invention have been disclosed in this specification and the drawings. Although specific terms are used, they are used only in a general sense to explain the technical content of the invention clearly and to aid understanding, and are not intended to limit the scope of the invention. It will be apparent to those of ordinary skill in the art to which the present invention pertains that modifications based on the technical idea of the invention, other than the embodiments disclosed herein, can also be practiced.

1: dynamic learning device 10: transform unit
30: normalization unit 50: block forming unit
70: classification unit

Claims

A dynamic learning method for human epithelial cell image classification, the method comprising:
when an image is input, performing a discrete wavelet transform to generate four different discrete wavelet transform coefficients;
normalizing the four different discrete wavelet transform coefficients to feed them to a network;
forming a final residual building block using four different CNNs in parallel, each CNN receiving a different normalized wavelet transform coefficient as input; and
passing the final residual building block through global average pooling and a classification layer to generate the final type of the cell image,
wherein the different normalized wavelet coefficients include an approximation coefficient and detail coefficients,
wherein the detail coefficients include a horizontal detail coefficient, a vertical detail coefficient, and a diagonal detail coefficient,
wherein the approximation coefficient enhances the pixel intensity values relative to the original image, thereby promoting a specific homogenization in terms of intensity,
wherein the detail coefficients promote, relative to the original image, a specific homogenization of the geometric patterns of the cells in three directions, namely horizontal, vertical, and diagonal, and
wherein, in the step of forming the final residual building block, the volumes composed of the features extracted from the wavelet transform coefficients by each of the CNNs operating in parallel are concatenated, a convolution operation is then performed, and the final volume generated by the convolution operation is used as the input of the final building block to form the final residual building block.

A dynamic learning device for human epithelial cell image classification, comprising:
a transform unit that, when an image is input, performs a discrete wavelet transform to generate four different discrete wavelet transform coefficients;
a normalization unit that normalizes the four different discrete wavelet transform coefficients to feed them to a network;
a block forming unit that forms a final residual building block using four different CNNs in parallel, each CNN receiving a different normalized wavelet transform coefficient as input; and
a classification unit that passes the final residual building block through global average pooling and a classification layer to generate the final type of the cell image,
wherein the different normalized wavelet coefficients include an approximation coefficient and detail coefficients,
wherein the detail coefficients include a horizontal detail coefficient, a vertical detail coefficient, and a diagonal detail coefficient,
wherein the approximation coefficient enhances the pixel intensity values relative to the original image, thereby promoting a specific homogenization in terms of intensity,
wherein the detail coefficients promote, relative to the original image, a specific homogenization of the geometric patterns of the cells in three directions, namely horizontal, vertical, and diagonal, and
wherein the block forming unit concatenates the volumes composed of the features extracted from the wavelet transform coefficients by each of the CNNs, then performs a convolution operation, and uses the final volume generated by the convolution operation as the input of the final building block to form the final residual building block.