TWI832032B - System of generating training data by questions and answers and method thereof - Google Patents
System of generating training data by questions and answers and method thereof
- Publication number
- TWI832032B (application TW110103801A)
- Authority
- TW
- Taiwan
Classifications
- G06F16/3329: Natural language query formulation or dialogue systems
- G06F16/34: Browsing; visualisation of unstructured textual data
- G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G16H15/00: ICT specially adapted for medical reports, e.g. generation or transmission thereof
- G16H70/60: ICT specially adapted for the handling or processing of medical references relating to pathologies
Abstract
Description
The present invention relates to a system and method for generating training data, and in particular to a system and method for generating training data through question and answer.
Pathological interpretation is currently performed manually by experts (such as pathologists). During manual interpretation, the expert must view physical slides under a microscope while manually recording observations with tools such as a voice recorder or pen and paper, and finally write a pathology report based on those observations.
However, this approach has the following problems:
1. Inconsistent interpretation across experts: different experts usually view the same physical slide from different angles (that is, they identify different regions of interest). This leads different experts to form different opinions about the same slide, and ultimately produces inconsistent pathology reports.
2. Difficulty of joint reading: because a physical slide under a microscope can only be viewed by one person at a time, manual interpretation cannot support simultaneous viewing by multiple people, nor can it ensure that, when multiple people take turns viewing, they all focus on the same region of interest.
3. Diversity of pathological features: as medical technology advances, more and more pathological features are being discovered. For this rapidly growing set of features, experts must set aside substantial time on top of a heavy workload to learn the new features so that they can correctly focus on the regions of interest where those features appear, which imposes a serious burden on them.
The existing practice of manually interpreting physical slides therefore suffers from the above problems, and a more effective solution is urgently needed.
The present invention provides a system and method for generating training data through question and answer, which can prompt a region of interest while the user browses an image and collect training data for that region through question and answer.
The present invention proposes a method for generating training data through question and answer, comprising the following steps: providing browsing of a target image on a platform; accepting operations that adjust a display field of view of the target image; presenting a question for analyzing the display field of view and obtaining a response answer; and associating the display field of view with a corresponding question-answer set to generate training data for automatically analyzing the target image, wherein the question-answer set comprises the question and the response answer.
The present invention further proposes a system for generating training data through question and answer, comprising a database and a platform. The database stores a target image. The platform is connected to the database and, via a network, to a client. The platform is configured to provide the client with a browsing interface for viewing the target image, accept operations from the client that adjust a display field of view of the target image, present a question for analyzing the display field of view, obtain a response answer, and associate the display field of view with a corresponding question-answer set to generate training data for automatically analyzing the target image, wherein the question-answer set comprises the question and the response answer.
The present invention can automatically prompt regions of interest and greatly reduce the burden that collecting training data places on users.
10: Platform
100: Processing module
101: Storage module
102: Communication module
103, 110: Human-machine interface
104: Web module
105: Management module
11: Client
12: Database
120: Image library
121: Knowledge base
122: Browsing history
123: User database
13: AI module
130: Learning model
131: Data conversion module
132: Training module
20: Image browsing and recording module
200: Image viewing module
201: Image operation module
202: Image information processing module
21: Knowledge acquisition module
210: Question-answer module
211: Action module
212: Target module
213: Knowledge processing and provision module
22: Knowledge recording and processing module
220: Image capture module
221: Training data generation module
30-32: Target image
40-42: Display reference point
50-51: Range
60-62, 70-72, 80-81: Interface
800-802: Region
S10-S13: First training data generation steps
S20-S27: Second training data generation steps
S30-S34: Display steps
S40-S42: Browsing steps
S50-S52: Question-and-answer steps
S60, S70-S73: Training steps
S61, S80-S81: Automatic analysis steps
Figure 1 is an architecture diagram of a system according to an embodiment of the present invention; Figure 2 is a partial architecture diagram of the platform side according to an embodiment of the present invention; Figures 3, 4, and 5 are schematic diagrams of the input and output of the learning model according to embodiments of the present invention; Figure 6 is a flowchart of a method according to an embodiment of the present invention; Figures 7A and 7B are, respectively, the first and second parts of a flowchart of a method according to an embodiment of the present invention; Figure 8 is a flowchart of training and automatic analysis in a method according to an embodiment of the present invention; Figure 9 is a schematic diagram of a display field of view of a target image according to an embodiment of the present invention; Figures 10 and 11 are schematic diagrams of other display fields of view of Figure 9; Figure 12 is a schematic diagram of an action history according to an embodiment of the present invention; and Figure 13 is a schematic diagram of a heat map according to an embodiment of the present invention.
The technical solution of the present invention is described in detail below with reference to the drawings and specific embodiments, to further clarify its purpose, solutions, and effects; this description is not intended to limit the scope of the appended claims.
Automatic image interpretation by artificial intelligence (AI) requires a large amount of training data to train a learning model.
While browsing images, experts can manually interpret and label features in a large number of images one by one and then generate training data from those labels, but this forces the experts to interrupt their browsing and spend considerable time labeling and annotating.
Take the interpretation of medical images (such as digital pathology images) as an example. While interpreting a medical image (for instance, judging whether cancer cells are present), a physician who finds cancer cells must first interrupt the interpretation to mark and annotate them; if the physician is simultaneously explaining findings to a patient or teaching, this causes great inconvenience.
In view of this, the present invention provides a system and method for generating training data through question and answer. Its main principle is to preset (manually or with AI techniques) questions associated with one or more regions of interest (the display fields of view described later) of the image to be analyzed (the target image described later).
Then, while a user (such as a physician, expert, or inspector) interprets the target image, the invention automatically presents the preset question whenever the user browses to a region of interest for which a question has been set. The user thereby knows that the current display field of view is a region of interest and can observe it more attentively. Moreover, the user only needs to answer simple questions about the current region to complete the input of training parameters for that region of interest. Because the interpretation is not interrupted and little time is required, the user's willingness to give feedback increases, which in turn improves the accuracy of the training data.
Furthermore, because the invention eliminates manual marking and annotation, it greatly reduces the user's burden, allowing the user to concentrate on interpretation and improving interpretation accuracy.
In addition, by raising questions related to the region currently being browsed, the invention draws the user's attention to the observation points relevant to each question while answering, further improving interpretation accuracy.
Please refer to Figure 1, an architecture diagram of a system according to an embodiment of the present invention. The system for generating training data through question and answer may include a platform 10 (for example, a general-purpose computer such as a server, a cloud service platform, a desktop, or a laptop, or any combination thereof), a database 12 (for example, a network database, a local database, a relational database, or a combination thereof), and an AI module 13.
The platform 10 may include a storage module 101, a communication module 102, a human-machine interface 103, and a processing module 100.
The storage module 101 (for example, RAM, EEPROM, a solid-state drive, a hard disk, flash memory, or any combination thereof) stores data. The communication module 102 (for example, a network interface card, NIC) connects to a network (such as the Internet) and communicates through it with external devices (such as the database 12, the AI module 13, and/or the client 11). The human-machine interface 103 (comprising input and output interfaces such as a mouse, keyboard, buttons, touchpad, display, touchscreen, or projection module) interacts with the user. The processing module 100 (which may be a CPU, GPU, TPU, MCU, or any combination of processors) controls the platform and implements the functions proposed by the present invention.
The AI module 13 may be deployed on a server or a cloud service platform (for example, Amazon Web Services, Google Cloud Platform, or Microsoft Azure), or on the platform 10 itself. The AI module 13 includes a learning model 130. The training data generated by the present invention is used to build and train the learning model 130 to improve its accuracy.
The learning model 130 is built and trained using machine-learning techniques and can automatically analyze an input image (such as the target image described later) to produce predicted knowledge information for a specific display field of view of the input image. This predicted knowledge information serves as reference information when the user analyzes the input image.
In one embodiment, the learning model 130 includes a VQA (Visual Question Answering) architecture model, which can be trained on text features (question-answer sets, target information, and so on) conditioned on the image features of the input image (the display field of view).
In another embodiment, the learning model 130 includes a DQN (deep Q-learning) architecture model, which can follow the flow of a user browsing the image: after each adjustment of the display field of view it adjusts a quantitative value of the target (the target information), and ultimately produces the predicted target information of an AI-driven automatic browse of the image. Specifically, the DQN architecture model first produces a score (Q-value) for each browsing action on the input image (either a browsing action entered by a user or an automatically generated predicted browsing action); the browsing action with the highest score is used to simulate browsing of the input image and transform the display field of view; the image of the transformed display field of view is input to the DQN architecture model again to obtain the next browsing action and its display field of view; and so on, until the browsing actions stop, whereupon the predicted target information can be obtained by analyzing the image of a display field of view (any of the transformed display fields of view, or the final one).
For example, please refer to Figures 3 to 5, which are schematic diagrams of the input and output of the learning model according to embodiments of the present invention.
In one embodiment, as shown in Figure 3, after the image of a display field of view of the target image (that is, a specific sub-image of the target image) is input to the learning model 130, the model can automatically analyze it and produce predicted target information (for example, via the DQN architecture model). The target information is an analysis result generated from the image features of the display field of view and describes the characteristics of those features, such as their type (for example, the kind of cell lesion), extent (for example, the proportion occupied), or severity (for example, severe, mild, or negligible).
In one embodiment, as shown in Figure 4, after the image of a display field of view of the target image and the question corresponding to that display field of view are input to the learning model 130, the model can automatically analyze them and produce a predicted answer (for example, via the VQA architecture model). Furthermore, the learning model 130 can generate new questions about this display field of view, so that the next time a user browses the same display field of view, the user can answer the questions (including the new ones) and thereby reinforce the completeness of the model's training for that display field of view.
In one embodiment, as shown in Figure 5, after the image of a display field of view of the target image is input to the learning model 130, the model can automatically analyze it and produce a predicted action history (for example, via the DQN architecture model). Specifically, by adding the action histories of one or more users for this target image (that is, the combinations of multiple browsing actions over the target image) during training, the learning model 130 can generalize a browsing style suited to this target image and offer it to less experienced users as a reference.
Please refer to Figures 1, 2, and 6 together. Figure 2 is a partial architecture diagram of the platform side according to an embodiment of the present invention, and Figure 6 is a flowchart of a method according to an embodiment of the present invention.
In one embodiment, the processing module 100 may include modules 104-105, 20-22, 200-202, 210-213, and 220-221, and the AI module 13 may include modules 131-132. These modules are each configured to perform different functions.
The modules are interconnected (electrically and informationally) and may be hardware modules (for example, electronic circuit modules, integrated circuit modules, or SoCs), software modules (for example, firmware, an operating system, or an application), or a mixture of software and hardware modules, without limitation.
It is worth mentioning that when the modules are software modules (for example, firmware, an operating system, or an application), the storage module 101 and the AI module 13 may include a non-transitory computer-readable recording medium that stores a computer program containing computer-executable code; when the processing module 100 and the AI module 13 execute this code, the functions of the corresponding modules are realized.
The method of generating training data through question and answer of this embodiment includes the following steps.
Step S10: the platform 10 provides browsing of the target image through the image browsing and recording module 20. The target image may be stored in the database 12 or uploaded by the user.
Specifically, the platform 10 may provide a browsing interface (such as a GUI) and display the target image in it.
In one embodiment, the user may operate the human-machine interface 110 of the client 11 (a general-purpose computer such as a desktop, laptop, tablet, or smartphone; the interface is similar to the human-machine interface 103 and is not described again here) to connect to the platform 10 and use its browsing service to view the target image.
In one embodiment, the user may directly operate the human-machine interface 103 of the platform 10 to use the browsing service.
Step S11: the platform 10 accepts the user's browsing operations (local or remote) through the image browsing and recording module 20 to adjust the display field of view of the target image, for example zooming in, zooming out, panning, or rotating.
Step S12: after the display field of view changes, the platform 10 obtains the question corresponding to the current display field of view through the knowledge acquisition module 21 and presents it. The question is used to analyze the current display field of view.
Then, after reading the question, the user can refer to the current display field of view and past experience to quickly enter a response, so that the platform 10 obtains the answer to the question. The invention thereby obtains a question-answer set (comprising the question and the response answer) for the current display field of view, that is, the training parameters for that field of view.
Step S13: the platform associates the display field of view with its corresponding question-answer set through the knowledge recording and processing module 22 to generate training data. The training data is provided to the learning model 130 for training in automatically analyzing the target image.
In this way, the user only needs to answer simple questions while browsing the target image, and the invention automatically builds the training data for that image.
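The association performed in steps S12-S13 can be sketched as a simple record builder; the field names and the view-identifier format below are assumptions for illustration, not the patent's actual schema:

```python
def make_training_record(view_id, view_params, question, answer):
    """Bundle a display field of view and its question-answer set
    into one training record for the learning model."""
    return {
        "view_id": view_id,        # identifier of the display field of view
        "view": view_params,       # reference point, zoom level, rotation
        "qa_set": {"question": question, "answer": answer},
    }

# One record for the field of view of Figure 10, using a question from
# the closed-ended examples later in the description.
record = make_training_record(
    view_id="(2.5,3.5)-z2-r0",
    view_params={"center": (2.5, 3.5), "zoom": 2, "rotation": 0},
    question="Is this a cerebral embolism area?",
    answer="Yes",
)
```

Accumulating such records over many browsing sessions yields the training set fed to the learning model 130.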
Please refer to Figures 1-2, 7A-7B, and 9-11 together. Figure 7A is the first part and Figure 7B the second part of a flowchart of a method according to an embodiment of the present invention; Figure 9 is a schematic diagram of a display field of view of a target image according to an embodiment of the present invention; and Figures 10 and 11 are schematic diagrams of other display fields of view of Figure 9.
In this embodiment, the platform 10 provides a web browsing service through the web module 104 (such as a web server module) to interact with the client 11.
The method of generating training data through question and answer of this embodiment includes the following steps.
Step S20: the platform 10 provides, through the web module 104, a browsing interface at the client 11 as a web service for browsing the target image.
In one embodiment, step S20 may include the following steps S30-S34.
Step S30: the client 11 logs in to the platform 10 through a web page.
In one embodiment, the database 12 may include a user database 123 that stores the registration data of different users (or clients 11), such as account names and passwords, network addresses, hardware addresses, device identifiers, digital signatures, dongle keys, or other identifying information. The user may operate the client 11 to provide login data (of the same kinds) to the platform 10.
The platform 10 then compares the login data with each set of registration data through the management module 105, and when the login data matches any registration data, it determines that verification has passed and permits the login.
Step S31: after a successful login, the platform 10 provides a browsing interface (such as a web application) to the client 11 through the web module 104, and the client 11 presents it.
In one embodiment, the database 12 includes an image library 120 that stores multiple images, such as images of electronic components, medical images, surveillance images, or other images that require interpretation.
Step S32: through the web page, the client 11 may select one of the images in the image library 120 as the target image (the image to be interpreted next), or upload an image to the platform 10 as the target image.
The platform 10 then imports the target image through the image viewing module 200.
Step S33: the platform 10, through the image information processing module 202, determines the display field of view of the target image to be presented according to preset field-of-view parameters (such as a preset display reference point, zoom level, and/or rotation angle); that is, it determines the range of the target image to be presented.
The display reference point of the display field of view may be an image coordinate (such as a pixel position) that indicates where the reference point of the field of view (such as the image center or any boundary point) lies within the entire target image. The zoom level of the display field of view indicates its current level of magnification or reduction (this value maps proportionally to the actual magnification or reduction ratio depending on the image resolution). The rotation angle of the display field of view indicates the angular difference between the field of view and the target image.
For example, the platform 10 may be configured so that a zoom level of 1 corresponds to an actual magnification of 5x, and a zoom level of 5 corresponds to an actual magnification of 40x.
In another example, the platform 10 may be configured so that a zoom level of 1 corresponds to an actual magnification of 0.25x (shrinking the image so that the entire target image fits in the display field of view), and a zoom level of 5 corresponds to an actual magnification of 1x (1:1 browsing).
In the present invention, the relationship (such as the ratio) between zoom level and actual magnification may be set as needed and is not limited.
Accordingly, based on the display reference point and zoom level, or on the display reference point, zoom level, and rotation angle, the image information processing module 202 can determine the visible range, for example by computing it as a bounding box.
Step S34: the platform 10 transmits the image of the determined display field of view to the client 11 over the network, so that the display field of view is presented in the browsing interface.
Step S21: the client 11 adjusts the display field of view of the target image through the browsing interface.
In one embodiment, step S21 may include the following steps S40-S42.
Step S40: the client 11 enters browsing operations through the browsing interface to adjust the display field of view; the platform 10 collects these operations through the image operation module 201 and converts them into adjustments of each field-of-view parameter (that is, it determines the adjusted field-of-view parameters).
In one embodiment, the browsing interface may display various operation buttons (such as move left, move right, move up, move down, zoom in, zoom out, rotate clockwise, and rotate counterclockwise), and the image operation module 201 monitors these buttons to collect the user's browsing operations.
In one embodiment, the image operation module 201 may monitor keyboard or mouse actions at the client 11 to collect the browsing operations (for example, monitoring mouse or keyboard signals to trigger corresponding adjustments of the zoom level, panning of the display reference point, and rotation of the image).
Step S41: the platform 10, through the image information processing module 202, determines the new display field of view based on the adjusted field-of-view parameters, that is, the new range of the target image to be presented.
Step S42: the platform 10 transmits the image of the adjusted display field of view to the client 11 over the network, so that the adjusted display field of view is presented in the browsing interface.
In one embodiment, before transmitting the image of the display field of view to the client 11, the platform 10 may first compress it (in particular with lossy compression) to appropriately reduce image detail and lower the resolution, moderately shrinking the image size in favor of network transmission.
It is worth mentioning that when the target image is too large (for example, its resolution is too high), transmitting the entire image to the client 11 takes a large amount of transmission time, and the client 11 may be unable to browse the whole image smoothly because of insufficient hardware performance.
The present invention therefore transmits only the image of the display field of view (optionally compressed) to the client 11, ensuring that the client 11 can browse the entire target image smoothly. Moreover, when the user at the client 11 wants to see image detail, a zoom-in browsing operation makes the platform 10 return a magnified image of the region for smooth viewing, improving the user experience.
In one embodiment, the target image may be divided into multiple tiles, and the platform 10 uses the tile as the smallest transmission unit.
For example, as shown in Figure 9, the display reference point is the image center, the target image 30 is divided into 48 tiles, and each tile has a corresponding identifier (such as coordinates).
The display field of view for the preset parameters may be the full range: the display reference point 40 is (4,3), the zoom level is 1, and the rotation angle is 0 degrees. The platform 10 transmits all tiles in the display field of view (all 48) to the client 11, so that the browsing interface displays the complete target image 30.
Then, when the user wants to adjust the display field of view to range 50, a browsing operation is entered. As shown in Figure 10, the adjusted display reference point 41 is (2.5,3.5), the zoom level is 2, and the rotation angle is 0 degrees. The platform 10 transmits all tiles in the adjusted field of view (15 in total) to the client 11, so that the browsing interface displays the adjusted field of view.
Then, when the user wants to adjust the display field of view to range 51, a browsing operation is entered. As shown in Figure 11, the adjusted display reference point 42 is (1.5,4.5), the zoom level is 3, and the rotation angle is 0 degrees. The platform 10 transmits all tiles in the adjusted field of view (6 in total) to the client 11, so that the browsing interface displays the adjusted field of view.
In this way, the present invention can effectively provide a browsing service for the target image.
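Selecting which tiles to send for a given display field of view can be sketched as follows; the unit tile size and the grid conventions are assumptions for illustration, not the patent's actual scheme:

```python
import math

def tiles_in_view(x0, y0, x1, y1, tile_size=1.0):
    """Return the (col, row) identifiers of every tile overlapping the
    bounding box (x0, y0, x1, y1), with coordinates in tile-size units."""
    c0, r0 = math.floor(x0 / tile_size), math.floor(y0 / tile_size)
    c1 = math.ceil(x1 / tile_size) - 1
    r1 = math.ceil(y1 / tile_size) - 1
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
```

On an 8x6 grid, viewing the full range selects all 48 tiles, echoing the Figure 9 example, while a 5x3 sub-range selects 15 tiles, echoing the Figure 10 example.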
Step S22 is then performed: the platform 10 determines, through the knowledge acquisition module 21, whether the current display field of view has a corresponding question for the user to answer. If so, the question-answer function is executed; if there is no corresponding question, it is not.
In one embodiment, the database 12 includes a knowledge base 121. The knowledge base 121 may store knowledge information for different display fields of view of the target image. The knowledge information may include multiple questions to be answered, multiple question-answer sets (answered questions), multiple pieces of target information, and/or action information.
Each question is preset for the image features of the corresponding display field of view (such as the shape, type, color, or changes of objects in the field of view), or is automatically generated through machine learning (as in Figure 4), and is used to analyze the image features.
In one embodiment, each display field of view is associated with an identifier (which may be determined from the identifiers of the tiles it covers, from the corresponding field-of-view parameters, by sequential numbering, and so on, without limitation). All knowledge information related to a given display field of view is mapped to the same identifier.
In one embodiment, step S22 may include the following steps S40-S42.
Step S40: the platform 10, through the question-and-answer module 210, obtains the corresponding question from the knowledge base 121 based on the identification code of the current display field of view.
Step S41: the platform 10, through the question-and-answer module 210, displays the obtained question in the browsing interface of the client 11, and may provide an answering interface for the user to respond.
Step S42: the platform 10, through the question-and-answer module 210, receives the response answer input by the user via the answering interface.
For example, referring to Figure 10, when the user browses to the display field of view of Figure 10, the question-and-answer module 210 may display the question-and-answer interface 60 or 61.
Taking the question-and-answer interface 60 as an example, the question is an open-ended question (such as an essay question). The interface 60 includes a text input area, in which the user enters text as the response answer.
Other open-ended questions and answers may be, for example: "Q: Which organ system in the image is abnormal? A: Cardiovascular"; "Q: What are the bile duct cells and canals of Hering stained with cytokeratin 7 here? A: Immunohistochemical staining"; "Q: How are the electronic components in the figure connected? A: Soldering"; and so on, without limitation.
Taking the question-and-answer interface 61 as an example, the question is a closed-ended question (such as a multiple-choice question). The interface 61 includes multiple answer options, and the user selects one or more of them as the response answer.
Other closed-ended questions and answers may be, for example: "Q: Do positively charged histone subunits make negatively charged DNA more compact and stable? Options: yes; no"; "Q: Is this a cerebral embolism area? Options: yes; no"; "Q: How many electronic components are in the figure? Options: 1; 2; 3"; and so on, without limitation.
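The question lookup and answer collection of steps S40-S42, with question records like those above, can be sketched as follows; the knowledge-base contents and view codes are hypothetical stand-ins:

```python
# Hypothetical in-memory stand-in for knowledge base 121: questions
# keyed by the identification code of a display field of view.
KNOWLEDGE_BASE = {
    "view-001": {"question": "Is this a cerebral embolism area?",
                 "type": "closed", "options": ["yes", "no"]},
    "view-002": {"question": "Which organ system in the image is abnormal?",
                 "type": "open"},
}

def run_qa(view_code, answer_input):
    """S40: fetch the question for the current view; S41/S42: present it
    and collect the user's response (`answer_input` is a callable
    standing in for the answering interface)."""
    entry = KNOWLEDGE_BASE.get(view_code)
    if entry is None:  # no question for this view: skip Q&A, as in S22
        return None
    answer = answer_input(entry)
    return {"view": view_code, "question": entry["question"], "answer": answer}

qa = run_qa("view-001", lambda e: e["options"][1])  # user picks option "no"
print(qa["answer"])                      # no
print(run_qa("view-999", lambda e: ""))  # None
```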
In one embodiment, the present invention further provides a target-setting function. Specifically, in any display field of view, step S23 is executed: the platform 10, through the target module 212, provides an input interface to accept target information set by the client 11 after viewing the current display field of view. The target information serves as the user's manual analysis result of quantitative features of the whole image, and as a training parameter for this target image.
Take judging the proportion of cancer cells as an example. When an expert views a pathological image at low magnification (wide field of view), the expert may judge the proportion of cancer cells to be high (e.g., an estimated 80%) and set the target information of the pathological image to "cancer cell proportion 80%". After adjusting the display field of view, for example viewing the pathological image at medium magnification (narrower field of view), the expert may find that some cells are not cancer cells (e.g., an estimated 60%), and may then revise the target information of the pathological image to "cancer cell proportion 60%".
In one embodiment, the present invention further provides a knowledge information prompting function. Specifically, in any display field of view, step S24 is executed: the knowledge processing and providing module 213 determines whether corresponding knowledge information for the current display field of view exists in the knowledge base 121. If so, the knowledge information is displayed in the browsing interface to describe this display field of view.
In one embodiment, each time a display field of view is determined, the image information processing module 202 may generate image browsing information (such as identification parameters or an identification code) for the display field of view. The knowledge acquisition module 21 may search the knowledge base 121 based on the image browsing information to detect whether there is knowledge information related to this display field of view (such as a previously entered question-and-answer set, target information of the whole target image, or content automatically interpreted by the AI).
For example, referring to Figure 11, when the user browses to the display field of view of Figure 11 and the knowledge acquisition module 21 finds that the knowledge base contains corresponding knowledge information for the current display field of view (abnormal proportion 70%), the knowledge information may be displayed in the target interface 62. In addition, if the user judges the knowledge information to be incorrect, it can be modified through the input interface (e.g., to 90%), so that the target module 212 sets new target information for learning.
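The lookup-and-correct flow just described (display stored knowledge information, and let the user revise it into new target information) can be sketched as follows; the store and view code are hypothetical:

```python
# Hypothetical per-view store of target information (knowledge base 121).
knowledge = {"view-011": {"abnormal_ratio": 0.70}}

def show_or_correct(view_code, correction=None):
    """Return the existing knowledge information for a view; if the user
    judges it incorrect, store the corrected values as new target
    information for subsequent learning."""
    info = knowledge.get(view_code)
    if info is None:
        return None
    if correction is not None:
        info = dict(info, **correction)  # keep any other fields intact
        knowledge[view_code] = info      # becomes the new training target
    return info

print(show_or_correct("view-011"))                            # shows 0.70
print(show_or_correct("view-011", {"abnormal_ratio": 0.90}))  # corrected
```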
In one embodiment, the present invention further provides a browsing action judgment function. Specifically, each time the display field of view is switched, the action module 211 may obtain the previous display reference point, previous zoom level, or previous rotation angle of the previous display field of view and compare them with the display reference point, zoom level, or rotation angle of the current display field of view, to determine the browsing action that switched from the previous display field of view to the current one (for example, pan direction, pan amount, magnification level, rotation direction, or rotation angle), and add the browsing action to the action information. That is, the content of the browsing action is judged from the change between the two sets of view parameters.
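Judging the browsing action by differencing two sets of view parameters can be sketched as follows (the parameter names and action encoding are illustrative):

```python
def browse_action(prev, curr):
    """Derive the browsing action that led from the previous display
    field of view to the current one by differencing the two sets of
    view parameters: reference point, zoom level, rotation angle."""
    dx = curr["ref"][0] - prev["ref"][0]
    dy = curr["ref"][1] - prev["ref"][1]
    action = {}
    if (dx, dy) != (0, 0):
        action["pan"] = (dx, dy)         # pan direction and amount
    if curr["zoom"] != prev["zoom"]:
        action["zoom"] = curr["zoom"] - prev["zoom"]
    if curr["rot"] != prev["rot"]:
        action["rotate"] = curr["rot"] - prev["rot"]
    return action

# The Figure 10 -> Figure 11 transition: pan and one level of zoom-in.
prev = {"ref": (2.5, 3.5), "zoom": 2, "rot": 0}
curr = {"ref": (1.5, 4.5), "zoom": 3, "rot": 0}
print(browse_action(prev, curr))  # {'pan': (-1.0, 1.0), 'zoom': 1}
```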
Next, step S25 is executed: the platform 10, through the training data generation module 221, associates the display field of view with the corresponding knowledge information (for example, the question-and-answer sets completed by the user, target information, and/or action information) to generate training data.
In one embodiment, the present invention further provides a recording function (steps S26-S27).
Step S26: during browsing, the platform 10, through the knowledge recording and processing module 22, determines whether a preset recording condition is satisfied. The recording condition may be, without limitation, every change of the display field of view, after every operation, or a timer (e.g., every 10 seconds).
If the recording condition is not satisfied, monitoring continues.
If the recording condition is satisfied, step S27 is executed: the platform 10, through the image capture module 220, captures an image of the display field of view based on the aforementioned image browsing information, and the knowledge recording and processing module 22 records the image of the display field of view together with the knowledge information (such as the question-and-answer sets completed by the user, target information, and/or action information) and associates them.
In one embodiment, the database 12 includes a browse record 122. The knowledge recording and processing module 22 associates the view parameters of the display field of view with the knowledge information and records them in the browse record 122 as a browsing history (which may include the action history, and the question-and-answer sets and target information of all display fields of view).
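The recording conditions of step S26 (every view change, after every operation, or a timer such as 10 seconds) can be combined into one check. This sketch assumes all three conditions are active at once, whereas the text leaves the choice of condition open:

```python
class RecordTrigger:
    """Decide when to capture the current display field of view and its
    knowledge information, per the preset recording conditions."""
    def __init__(self, interval=10.0):
        self.interval = interval  # timer condition, in seconds
        self.last_time = 0.0
        self.last_view = None

    def should_record(self, now, view_code, operated=False):
        changed = view_code != self.last_view          # view switched
        timed_out = now - self.last_time >= self.interval
        if changed or operated or timed_out:
            self.last_time = now
            self.last_view = view_code
            return True
        return False  # condition not met: keep monitoring

t = RecordTrigger(interval=10.0)
print(t.should_record(0.0, "view-001"))   # True  (first view seen)
print(t.should_record(3.0, "view-001"))   # False (nothing changed)
print(t.should_record(3.0, "view-002"))   # True  (view changed)
print(t.should_record(14.0, "view-002"))  # True  (timer elapsed)
```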
Please also refer to Figure 8, a flow chart of training and automatic analysis of the method according to an embodiment of the present invention. The present invention further provides a training function (step S60) and an automatic analysis function (step S61).
Step S60: the AI module 13 trains the learning model 130 based on the training data. In one embodiment, step S60 includes steps S70-S73.
Step S70: the AI module 13 loads the learning model 130.
Step S71: the AI module 13 analyzes the training data through the training module 132.
In one embodiment, the training data includes images of display fields of view and their corresponding knowledge information (for example, question-and-answer sets, action information, target information, and so on). The training module 132 analyzes the image features of the display field of view images and generates text features based on the knowledge information (such as the questions of the question-and-answer sets and the target information input by the user), i.e., it converts the knowledge information into a semantic network.
In one embodiment, the AI module 13 may first convert the training data, through the data conversion module 131, into a format acceptable to the training module 132 and/or the learning model 130 before analysis and training.
Step S72: the training module 132 inputs the image features and text features into the learning model 130 to produce predicted knowledge information (for example, predicted answers to the questions, predicted target information, or predicted action information).
Step S73: the training module 132 trains the learning model by comparing the predicted knowledge information with the knowledge information of the training data, for example by comparing the predicted answers with the user's response answers, the predicted target information with the target information input by the user, or the predicted action information with the user's actual browsing actions, without limitation.
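Comparing a predicted answer with the user's response answer (step S73) can be sketched with a toy negative-log-likelihood score; the actual learning model 130 and its loss function are not specified in the text, so this is only a stand-in:

```python
import math

def answer_loss(predicted, actual):
    """Score the model's predicted answer distribution against the
    user's response answer: the negative log of the probability the
    model assigned to the option the user actually chose. A lower
    value means the prediction agrees with the user."""
    p = max(predicted.get(actual, 0.0), 1e-9)  # clamp to avoid log(0)
    return -math.log(p)

# Confident in "yes" and the user answered "yes": small loss.
good = answer_loss({"yes": 0.8, "no": 0.2}, "yes")
# Preferred "no" but the user answered "yes": larger loss.
bad = answer_loss({"yes": 0.2, "no": 0.8}, "yes")
print(good < bad)  # True
```

Minimizing such a score over many recorded question-and-answer sets is one simple way the comparison of step S73 could drive training.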
Step S61: the AI module 13 receives an image input to produce corresponding predicted knowledge information. In one embodiment, step S61 includes steps S80-S81.
Step S80: the AI module 13 accepts an operation (for example, from the client 11 or the platform 10) to select an image (another target image).
Step S81: the AI module 13 inputs this target image into the learning model 130 to obtain predicted knowledge information for this target image (which may include a predicted action history, predicted questions, and predicted target information for one or more display fields of view of this target image), and may store it in the knowledge base 121.
Please refer to Figure 12, a schematic diagram of an action history according to an embodiment of the present invention. After the history recording of the target image is completed, the platform 10 of the present invention may provide a playback function, displaying in the browsing interface the action history 70, display field of view 71, and thumbnail interface 72 of the target image. The client can later select any step of the action history 70 to view, or continuously play multiple steps of the action history 70, reproducing the browsing path taken during analysis.
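The playback function can be sketched as replaying a recorded action history from a starting display field of view; the action format follows the pan/zoom differences described earlier and is otherwise an assumption:

```python
def replay(action_history, start_view, step=None):
    """Replay a recorded action history: return either one intermediate
    display field of view (jump to a selected step) or the full list of
    views, reproducing the browsing path taken during analysis."""
    view = dict(start_view)
    states = [dict(view)]
    for act in action_history:
        if "pan" in act:
            view["ref"] = (view["ref"][0] + act["pan"][0],
                           view["ref"][1] + act["pan"][1])
        if "zoom" in act:
            view["zoom"] += act["zoom"]
        states.append(dict(view))
    return states[step] if step is not None else states

history = [{"pan": (-1.0, 1.0), "zoom": 1}, {"pan": (0.5, 0.0)}]
start = {"ref": (2.5, 3.5), "zoom": 2}
print(replay(history, start, step=1))  # {'ref': (1.5, 4.5), 'zoom': 3}
```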
Please refer to Figure 13, a schematic diagram of a heat map according to an embodiment of the present invention.
After the target image is analyzed, the platform 10 of the present invention may provide a heat map function, showing which parts of the target image were the regions the model focused on during the algorithm's analysis.
As shown in the figure, the platform 10 displays the display field of view interface 80 and the thumbnail interface 81 of the target image in the browsing interface. In the display field of view interface 80, regions in which the algorithm's analysis confidence is high are shown as light-colored hot regions 801, regions in which the confidence is low are shown as dark-colored hot regions 802, and the white region 800 indicates the background.
In this way, the user or other users can directly see the importance of each position of the target image: they can skip browsing the white region 800 and the dark hot regions 802 (unimportant regions) to save time, and focus on the light hot regions 801 (important regions) to improve accuracy.
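The three-way display of the heat map (white background region 800, light high-confidence regions 801, dark low-confidence regions 802) can be sketched as a simple classification; the 0.5 threshold is illustrative, not specified by the text:

```python
def heat_class(confidence, is_background=False):
    """Map a per-region model confidence to the three display classes
    of the heat map interface: background (white region 800), low
    confidence (dark hot region 802), high confidence (light hot
    region 801)."""
    if is_background:
        return "white-800"
    return "light-801" if confidence >= 0.5 else "dark-802"

regions = [(0.9, False), (0.2, False), (0.0, True)]
print([heat_class(c, bg) for c, bg in regions])
# ['light-801', 'dark-802', 'white-800']
```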
By prompting questions or knowledge information in a designated display field of view, the present invention lets an expert know that the current display field of view is a region of interest and focus more on analyzing it.
The present invention provides the browsing service of the target image through a web page (web module 104), allowing experts around the world to view the same target image through a web browser and perform browsing operations on it (including answering questions and setting target information), so that different knowledge information (i.e., training data) from different experts on the same target image can be collected.
Furthermore, by subsequently aggregating this knowledge information and providing it to the AI module 13 as training data for the learning model 130, an expert-validated and highly reliable learning model 130 can be generated. When other experts (such as pathologists) later interpret images, they can first refer to the results interpreted by the AI module 13 (such as predicted knowledge information) as a basis for writing subsequent pathology reports. Since the reference source is consistent, the consistency of the experts' final interpretations can be improved.
Providing the browsing service of the target image through a web page also reduces the time and damage risk of transporting physical slides. In addition, by prompting the knowledge information of the currently displayed field of view and accepting input of target information, an expert's thoughts during the image reading process can be completely recorded and shared with other experts, helping different experts understand each other's analysis ideas and strategies.
Of course, the present invention may have various other embodiments. Without departing from the spirit and essence of the present invention, those of ordinary skill in the art to which the present invention belongs may make various corresponding changes and modifications according to the present invention, and all such corresponding changes and modifications shall fall within the scope of the appended claims of the present invention.
S10-S13: First training data generation steps
Claims (20)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202062971236P | 2020-02-07 | 2020-02-07 | |
US62/971,236 | 2020-02-07 |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202131352A TW202131352A (en) | 2021-08-16 |
TWI832032B true TWI832032B (en) | 2024-02-11 |
Family
ID=77180828
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW110103801A TWI832032B (en) | 2020-02-07 | 2021-02-02 | System of generating training data by questions and answers and method thereof |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113254608A (en) |
TW (1) | TWI832032B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113254608A (en) * | 2020-02-07 | 2021-08-13 | 台达电子工业股份有限公司 | System and method for generating training data through question answering |
TWI793865B (en) * | 2021-11-18 | 2023-02-21 | 倍利科技股份有限公司 | System and method for AI automatic auxiliary labeling |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105393264A (en) * | 2013-07-12 | 2016-03-09 | 微软技术许可有限责任公司 | Interactive segment extraction in computer-human interactive learning |
CN107330238A (en) * | 2016-08-12 | 2017-11-07 | 中国科学院上海技术物理研究所 | Medical information collection, processing, storage and display methods and device |
CN109583440A (en) * | 2017-09-28 | 2019-04-05 | 北京西格码列顿信息技术有限公司 | It is identified in conjunction with image and reports the medical image aided diagnosis method edited and system |
TW201941218A (en) * | 2018-01-08 | 2019-10-16 | 美商普吉尼製藥公司 | Systems and methods for rapid neural network-based image segmentation and radiopharmaceutical uptake determination |
US20190313986A1 (en) * | 2016-11-16 | 2019-10-17 | The General Hospital Corporation | Systems and methods for automated detection of objects with medical imaging |
US10489736B2 (en) * | 2015-03-16 | 2019-11-26 | Swarm Vision, Inc | Behavioral profiling with actionable feedback methodologies and systems |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI571762B (en) * | 2012-11-08 | 2017-02-21 | 國立台灣科技大學 | Real time image cloud system and management method |
CN109830284A (en) * | 2017-11-23 | 2019-05-31 | 天启慧眼(北京)信息技术有限公司 | The analysis method and device of medical image |
CN108491421B (en) * | 2018-02-07 | 2021-04-16 | 北京百度网讯科技有限公司 | Method, device and equipment for generating question and answer and computing storage medium |
US20210240931A1 (en) * | 2018-04-30 | 2021-08-05 | Koninklijke Philips N.V. | Visual question answering using on-image annotations |
CN109711434B (en) * | 2018-11-30 | 2021-07-09 | 北京百度网讯科技有限公司 | Method, apparatus, device and medium for acquiring and evaluating VQA system training data |
CN113254608A (en) * | 2020-02-07 | 2021-08-13 | 台达电子工业股份有限公司 | System and method for generating training data through question answering |
- 2021-02-02: CN application CN202110143784.3A, published as CN113254608A (status: pending)
- 2021-02-02: TW application TW110103801A, published as TWI832032B (active)
Also Published As
Publication number | Publication date |
---|---|
TW202131352A (en) | 2021-08-16 |
CN113254608A (en) | 2021-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20170060828A1 (en) | Gesture based annotations | |
WO2018000519A1 (en) | Projection-based interaction control method and system for user interaction icon | |
TWI832032B (en) | System of generating training data by questions and answers and method thereof | |
CN103092432A (en) | Trigger control method and system of man-machine interaction operating instruction and laser emission device | |
CN110765827A (en) | Teaching quality monitoring system and method | |
WO2019223056A1 (en) | Gesture recognition-based teaching and learning method and apparatus | |
CN111814733A (en) | Concentration degree detection method and device based on head posture | |
CN104598027B (en) | A kind of motion sensing control multi-media Training System based on user behavior analysis | |
Liu et al. | An improved method of identifying learner's behaviors based on deep learning | |
WO2019214019A1 (en) | Online teaching method and apparatus based on convolutional neural network | |
CN114332927A (en) | Classroom hand-raising behavior detection method, system, computer equipment and storage medium | |
KR20190027287A (en) | The method of mimesis for keyboard and mouse function using finger movement and mouth shape | |
Soroni et al. | Hand Gesture Based Virtual Blackboard Using Webcam | |
Juang et al. | Application of character recognition to robot control on smartphone test system | |
Patil et al. | Gesture Recognition for Media Interaction: A Streamlit Implementation with OpenCV and MediaPipe | |
Hendricks | Filmmakers' Attitudes and Intentions toward Adoption of Virtual Camera Systems in Virtual Production | |
Ni et al. | Classroom Roll Call System Based on Face Detection Technology | |
CN114546311A (en) | Multi-display-terminal screen projection system and method for intelligent classroom | |
Acharya | Virtual Mouse using Hand Gestures | |
Wang et al. | Virtual piano system based on monocular camera | |
CN114510591B (en) | Digital image generating and typesetting system and method | |
Qiao et al. | A Review of Attention Detection in Online Learning | |
JP6978815B1 (en) | Information processing system, information processing method and program | |
Céspedes-Hernández et al. | A methodology for gestural interaction relying on user-defined gestures sets following a one-shot learning approach | |
Aydin | Leveraging Computer Vision Techniques for Video and Web Accessibility |