WO2022114823A1

WO2022114823A1 - Method for assisting artificial intelligence reading using visual information-based reference search technology

Info

Publication number: WO2022114823A1
Application number: PCT/KR2021/017561
Authority: WO
Inventors: 박보규; 이현규; 도신호; 최용준
Original assignee: 두에이아이(주)
Priority date: 2020-11-27
Filing date: 2021-11-25
Publication date: 2022-06-02
Also published as: KR102304340B1; KR20220074711A

Abstract

Disclosed in an embodiment is a method for assisting artificial intelligence reading using a visual information-based reference search technology, performed by one or more processors of a computing apparatus. The method may comprise the steps of: acquiring an inspection image including one or more objects; performing classification on the objects by using a pre-trained classification model; performing a search for one or more similar images for the objects by using a pre-trained search model, according to the result of the classification on the objects; and providing the result of the search for the similar images.

Description

Artificial intelligence reading assistance method using visual information-based reference search technology

The present disclosure relates to an artificial intelligence reading assistance method, apparatus, and computer program using visual information-based reference search technology.

In the recent paradigm shift of the Fourth Industrial Revolution, artificial intelligence is attracting attention as a key driver of a new era. Artificial intelligence is a software technology that implements human thinking processes, such as cognition, learning, reasoning, and judgment, through algorithm design. In particular, in the medical field, as the spread of ICT-convergence medical devices makes it easier to secure large-scale medical big data, AI-based businesses using such big data are gradually spreading. Such artificial intelligence is used as a diagnostic aid in the medical field and contributes to improving the efficiency of diagnosis or reading.

As a specific example, one method of diagnosing whether a patient has a disease is to collect and examine exfoliated cells from the patient's body. A sample of exfoliated cells is collected from the patient, and slides are prepared through Papanicolaou staining and slide mounting. Slides with abnormal findings in the primary microscopic examination are then read a second time by a pathologist to confirm the diagnosis of lesions.

However, it takes a very long time for a screener to manually inspect multiple slides one by one. Moreover, personnel are limited: the number of qualified screeners is quite small, and skilled pathologists are also in short supply.

In addition, since the pathologist relies on his or her own experience and ability when performing the microscopic examination, human error may occur depending on the pathologist's condition at the time of the examination. To address this problem, there have been attempts in the field to reduce errors by collecting the results of the primary microscopic examination and reviewing random samples, but this does not resolve the cause of the problem structurally. In this regard, Korean Laid-Open Patent Publication No. 2002-0084787 discloses a cervical cancer diagnosis system and method for performing diagnosis through cervical imaging information, and a cervical cancer imaging terminal suitable therefor.

In the background of these problems occurring in the field, there is a need for an electronic means capable of providing diagnostic results by consistently and reliably examining multiple slides.

Accordingly, a clinical decision support or auxiliary diagnosis system that detects and classifies a cell region using computer vision technology plays an essential role in automatic medical image analysis.

An object of the present disclosure is to solve the above problems, and to provide a computing device that performs an AI reading assistance method using a visual information-based reference search technology.

The problems to be solved by the present disclosure are not limited to the problems mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

Disclosed, according to various embodiments of the present disclosure for solving the above problems, is an artificial intelligence reading assistance method using a visual information-based reference search technology, performed by one or more processors of a computing device. The method may include: obtaining an inspection image including one or more objects; performing classification on the object by using a pre-trained classification model; performing a search for one or more similar images for the object by using a pre-trained search model according to the classification result for the object; and providing the similar image search result.

According to an alternative embodiment, the performing of the similar image search may include: obtaining a visual feature of the object by using a first model that extracts features based on content information included in an image; obtaining attribute information corresponding to the object by using a second model that calculates a specific attribute; obtaining characteristic information about the object by using the visual feature of the object and the attribute information of the object; and searching for a similar image corresponding to the object by using the characteristic information of the object.
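As a non-limiting illustration of how the visual feature and the attribute information might be combined, the following Python sketch concatenates a normalized visual feature vector with a probability value to form characteristic information. The function names, the concatenation scheme, and the `attribute_weight` parameter are assumptions made for illustration; the disclosure does not specify how the two kinds of information are combined.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def fuse_features(visual_feature, attribute_probability, attribute_weight=1.0):
    """Build characteristic information from two model outputs.

    visual_feature: output of a hypothetical first model (feature extractor).
    attribute_probability: output of a hypothetical second model, in [0, 1].
    attribute_weight: assumed knob for how strongly the attribute counts.
    """
    if not 0.0 <= attribute_probability <= 1.0:
        raise ValueError("attribute_probability must lie in [0, 1]")
    # Assumption: simple concatenation of the two kinds of information.
    return l2_normalize(visual_feature) + [attribute_weight * attribute_probability]


# Example: a 2-dimensional visual feature and a probability value of 0.9.
characteristic = fuse_features([3.0, 4.0], 0.9)
```

The resulting vector can then be compared against vectors built the same way for the images stored in the image database.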

According to an alternative embodiment, the second model is a model for calculating probability information of a specific event corresponding to an image, and the obtaining of the attribute information may include obtaining a probability value corresponding to the object by using the second model.

According to an alternative embodiment, the search model is a neural network model based on proxy-based metric learning and is trained so as to increase the similarity between a target vector and a positive proxy and to decrease the similarity between the target vector and a negative proxy. The proxy may be a representative vector of the embedding vectors used to compare the similarity between the object and images pre-stored in an image database.
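The training direction described above can be sketched, under assumptions, as a softmax cross-entropy over scaled cosine similarities between an embedding and one proxy per category. This is written in the spirit of proxy-based losses and is not the loss of the disclosure, which does not give a formula; the `scale` parameter and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def proxy_loss(embedding, proxies, positive_index, scale=10.0):
    """Loss that is small when the embedding is close to its positive proxy.

    proxies: one representative vector per category (assumption).
    positive_index: index of the proxy for the embedding's own category.
    """
    sims = [scale * cosine_similarity(embedding, p) for p in proxies]
    # Numerically stable log-sum-exp over all proxies.
    max_sim = max(sims)
    log_sum = max_sim + math.log(sum(math.exp(s - max_sim) for s in sims))
    # Minimizing this raises the positive similarity and lowers the others.
    return log_sum - sims[positive_index]


proxies = [[1.0, 0.0], [0.0, 1.0]]
loss_near_positive = proxy_loss([1.0, 0.0], proxies, positive_index=0)
loss_near_negative = proxy_loss([0.0, 1.0], proxies, positive_index=0)
```

In this sketch an embedding aligned with its positive proxy yields a lower loss than one aligned with a negative proxy, which is the behavior the embodiment describes.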

According to an alternative embodiment, the method further comprises constructing a training data set for training the classification model based on a plurality of image data and examination information for each image data. The constructing of the training data set may include: classifying the examination information for each image data into one or more predetermined categories; generating a training input data set based on the plurality of image data; generating a training output data set based on the one or more categories corresponding to the respective image data; and matching and labeling the training output data set corresponding to each training input data set.
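A minimal sketch of such a training data set construction is shown below. The keyword table, the category names, and the rule of skipping records that match no category are hypothetical choices for illustration only and do not reflect any clinical standard or the actual labeling rules of the disclosure.

```python
# Hypothetical mapping from category to keywords found in examination text.
CATEGORY_KEYWORDS = {
    "negative": ("negative",),
    "low_risk": ("low-grade",),
    "high_risk": ("high-grade",),
}


def categorize(examination_info, keyword_table=CATEGORY_KEYWORDS):
    """Map free-text examination information to a predetermined category."""
    text = examination_info.lower()
    for category, keywords in keyword_table.items():
        if any(keyword in text for keyword in keywords):
            return category
    return None  # no category matched


def build_training_set(images, examination_infos):
    """Pair each image (training input) with its category (training output)."""
    labeled_pairs = []
    for image, info in zip(images, examination_infos):
        category = categorize(info)
        if category is None:
            continue  # assumption: unmatched records are left out
        labeled_pairs.append((image, category))
    return labeled_pairs


training_set = build_training_set(
    ["img_a", "img_b", "img_c"],
    ["Negative for lesion", "High-grade finding", "Unreadable slide"],
)
```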

According to an alternative embodiment, the providing of the similar image search result may include: selecting and providing an image having a high similarity to the object; and selecting and providing an image that has a high similarity to the object but is classified into a different category from the object.
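The two kinds of selection described above might be sketched as follows. The candidate tuple layout `(image_id, category, similarity)` and the `top_k` parameter are assumptions for illustration; similarity scores are taken as already computed by a search model.

```python
def select_references(query_category, candidates, top_k=1):
    """Pick the most similar images within and outside the query's category.

    candidates: list of (image_id, category, similarity) tuples, where a
    larger similarity means a closer match (assumed layout).
    """
    ranked = sorted(candidates, key=lambda item: item[2], reverse=True)
    same_category = [c for c in ranked if c[1] == query_category][:top_k]
    other_category = [c for c in ranked if c[1] != query_category][:top_k]
    return same_category, other_category


candidates = [
    ("ref_1", "high_risk", 0.95),
    ("ref_2", "low_risk", 0.91),
    ("ref_3", "high_risk", 0.80),
    ("ref_4", "negative", 0.40),
]
same, different = select_references("high_risk", candidates)
```

Showing a close match from a different category alongside the closest same-category match lets the reader of the result compare visually similar cases that received different classifications.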

According to an alternative embodiment, the examination image includes a plurality of cell images, and the performing of the classification may include: classifying each of the plurality of cell images into one or more categories; and generating diagnostic information corresponding to the examination image based on the classification result. The performing of the similar image search may include performing a similar image search for at least some of the plurality of cell images.

According to an alternative embodiment, the one or more categories may include at least one of a negative state, a low risk state, and a high risk state.

According to an alternative embodiment, the generating of the diagnostic information corresponding to the examination image based on the classification result for each of the plurality of cell images may include generating the diagnostic information based on the number of cell images classified into each of the one or more categories. A different weight may be assigned to each of the one or more categories.
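As an illustration of counting cell images per category with different weights, the sketch below computes a weighted score. The weight values and category names are hypothetical; the disclosure states only that each category may carry a different weight.

```python
from collections import Counter

# Hypothetical weights; the disclosure does not give concrete values.
CATEGORY_WEIGHTS = {"negative": 0.0, "low_risk": 1.0, "high_risk": 3.0}


def weighted_score(cell_categories, weights=CATEGORY_WEIGHTS):
    """Sum the per-category cell counts, each multiplied by its weight."""
    counts = Counter(cell_categories)
    return sum(weights.get(category, 0.0) * n for category, n in counts.items())


cells = ["negative"] * 5 + ["low_risk"] * 2 + ["high_risk"]
score = weighted_score(cells)  # 0.0 * 5 + 1.0 * 2 + 3.0 * 1 = 5.0
```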

According to an alternative embodiment, the generating of the diagnostic information may include updating the diagnostic information based on examination result information matched to each of the found similar images.

According to another embodiment of the present disclosure, a computer program stored in a computer-readable storage medium is disclosed. The computer program, when executed by one or more processors, causes the one or more processors to perform the following operations for performing an artificial intelligence reading assistance method using a visual information-based reference search technology, the operations comprising: an operation of acquiring an inspection image including one or more objects; an operation of performing classification on the object by using a pre-trained classification model; an operation of performing one or more similar image searches for the object by using a pre-trained search model according to the classification result for the object; and an operation of providing the similar image search result.

According to another embodiment of the present disclosure, a computing device for performing an artificial intelligence reading assistance method using a visual information-based reference search technology is disclosed. The computing device includes a processor including one or more cores, a memory for storing program codes executable in the processor, and a network unit for transmitting and receiving data to and from a user terminal, wherein the processor may obtain an inspection image including one or more objects, classify the object by using a pre-trained classification model, perform one or more similar image searches for the object by using a pre-trained search model according to the classification result for the object, and provide the similar image search result.

According to various embodiments of the present disclosure, there is an effect of enabling diagnosis of various pathological symptoms through image search and analysis using an artificial intelligence model even in an environment in which pathologists are scarce.

In addition, it is possible to prevent human errors that may occur during the diagnosis process and to provide a diagnosis method with consistent accuracy.

In addition, by considering the AI reading result of the image and the reading opinion (medical record) stored in the image database, there is an effect of guaranteeing improved reliability even for modalities and reading names where there are many diagnostic discrepancies between doctors.

Effects of the present disclosure are not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

Various aspects are now described with reference to the drawings, wherein like reference numbers are used to refer to like elements collectively. In the following example, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more aspects. It will be evident, however, that such aspect(s) may be practiced without these specific details.

FIG. 1 is a conceptual diagram illustrating a system in which various aspects of a computing device for performing an artificial intelligence reading assistance method using a visual information-based reference search technology related to an embodiment of the present disclosure can be implemented.

FIG. 2 is a block diagram of a computing device for performing an AI reading assistance method using a visual information-based reference search technology according to an embodiment of the present disclosure.

FIG. 3 is an exemplary diagram for explaining one or more similar image search processes related to an embodiment of the present disclosure.

FIG. 4 is an exemplary diagram for explaining one or more similar image search processes related to another embodiment of the present disclosure.

FIG. 5 is an exemplary diagram for explaining one or more similar image search processes related to another embodiment of the present disclosure.

FIG. 6 is an exemplary diagram for explaining a process of providing one or more similar images in response to an examination image related to an embodiment of the present disclosure.

FIG. 7 is a flowchart exemplarily illustrating steps for performing an AI reading assistance method using a visual information-based reference search technology related to an embodiment of the present disclosure.

FIG. 8 is a schematic diagram illustrating one or more network functions related to an embodiment of the present disclosure.

Various embodiments are now described with reference to the drawings. In this specification, various descriptions are presented to provide an understanding of the present disclosure. However, it is apparent that these embodiments may be practiced without these specific descriptions.

The terms “component,” “module,” “system,” and the like, as used herein, refer to a computer-related entity, hardware, firmware, software, a combination of software and hardware, or software in execution. For example, a component can be, but is not limited to being, a process running on a processor, a processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a computing device and the computing device may be a component. One or more components may reside within a processor and/or thread of execution. A component may be localized within one computer. A component may be distributed between two or more computers. In addition, these components can execute from various computer-readable media having various data structures stored therein. Components may communicate via local and/or remote processes, for example in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or distributed system, and/or data transmitted to another system via the signal over a network such as the Internet).

In addition, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or.” That is, unless otherwise specified or clear from context, "X employs A or B" is intended to mean any of the natural inclusive permutations. That is, if X employs A, X employs B, or X employs both A and B, then "X employs A or B" is satisfied in any of these cases. It should also be understood that the term “and/or” as used herein refers to and includes all possible combinations of one or more of the listed related items.

Also, the terms "comprises" and/or "comprising" should be understood to mean that the feature and/or element in question is present. However, it should be understood that the terms "comprises" and/or "comprising" do not exclude the presence or addition of one or more other features, elements and/or groups thereof. Also, unless otherwise specified or unless it is clear from context to refer to a singular form, the singular in the specification and claims should generally be construed to mean “one or more”.

Those skilled in the art will further appreciate that the various illustrative logical blocks, configurations, modules, circuits, means, logics, and algorithm steps described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, configurations, means, logics, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application. However, such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.

Descriptions of the presented embodiments are provided to enable those of ordinary skill in the art to use or practice the present disclosure. Various modifications to these embodiments will be apparent to those skilled in the art of the present disclosure. The generic principles defined herein may be applied to other embodiments without departing from the scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments presented herein. This disclosure is to be interpreted in the widest scope consistent with the principles and novel features presented herein.

In this specification, a computer refers to all types of hardware devices including at least one processor, and may be understood as encompassing software configurations operating in the corresponding hardware device according to embodiments. For example, a computer may be understood to include, but is not limited to, smart phones, tablet PCs, desktops, notebooks, and user clients and applications running on each device.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.

Each step described in this specification is described as being performed by a computer, but the subject of each step is not limited thereto, and at least a portion of each step may be performed in different devices according to embodiments.

A system according to embodiments of the present disclosure may include a computing device 100 , a user terminal 10 , an external server 20 , and a network. The components illustrated in FIG. 1 are exemplary, and additional components may be present or some of the components illustrated in FIG. 1 may be omitted. The computing device 100 , the user terminal 10 and the external server 20 according to embodiments of the present disclosure may mutually transmit/receive data for the system according to embodiments of the present disclosure through a network.

Networks according to embodiments of the present disclosure may use a variety of wired communication systems, such as Public Switched Telephone Network (PSTN), x Digital Subscriber Line (xDSL), Rate Adaptive DSL (RADSL), Multi Rate DSL (MDSL), Very High Speed DSL (VDSL), Universal Asymmetric DSL (UADSL), High Bit Rate DSL (HDSL), and Local Area Network (LAN).

In addition, the networks presented herein may use various wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single Carrier-FDMA (SC-FDMA), and other systems.

The network according to the embodiments of the present disclosure may be configured regardless of its communication mode, such as wired or wireless, and may be composed of various communication networks such as a personal area network (PAN) and a wide area network (WAN). In addition, the network may be the well-known World Wide Web (WWW), and may use a wireless transmission technology used for short-range communication, such as Infrared Data Association (IrDA) or Bluetooth. The techniques described herein may be used in the networks mentioned above, as well as in other networks.

According to an embodiment of the present disclosure, the user terminal 10 may be a terminal related to a user who accesses the computing device 100 to obtain one or more similar images related to an examination image including one or more objects, together with diagnostic information corresponding to each similar image. In this case, the examination image may mean medical-related image data obtained from the examinee for medical diagnosis, and the diagnosis information may mean medical diagnosis information read by a specialist through the examination image. As a specific example, the diagnosis information may include prediction information related to the onset of cervical cancer, and the examination image may be an image of cervical cells for predicting the onset of cervical cancer. The detailed description of the above-described diagnostic information and examination image is only an example, and the present disclosure is not limited thereto.

The user terminal 10 may be a terminal related to an examiner (e.g., a specialist) who provides a checkup result to a user (e.g., the examinee). When the user terminal 10 is a terminal related to an examiner who provides examination results to the examinee, the diagnosis information corresponding to the examination image received from the computing device 100 may be used as medical assistance information for reading the examination result of the examinee. The user terminal 10 has a display, so it can receive a user's input and provide an output of any type to the user.

A user of the user terminal 10 is a medical professional, which may mean a doctor, a nurse, a clinical pathologist, a medical imaging specialist, or the like, and may be a technician repairing a medical device, but is not limited thereto. For example, the user may mean an administrator or a patient who performs a checkup using the system according to the disclosed embodiment in a medically vulnerable area.

The user terminal 10 may refer to any type of entity(s) in a system having a mechanism for communication with the computing device 100. For example, the user terminal 10 may include a personal computer (PC), a notebook, a mobile terminal, a smartphone, a tablet PC, a wearable device, and the like, and may include all types of terminals capable of accessing a wired/wireless network. In addition, the user terminal 10 may include an arbitrary server implemented by at least one of an agent, an application programming interface (API), and a plug-in. In addition, the user terminal 10 may include an application source and/or a client application.

According to an embodiment of the present disclosure, the external server 20 may be a server that stores an examination image including one or more objects, an image related to each object, and medical diagnosis or reading information related to each object image. For example, the external server 20 may be at least one of a hospital server and a government server that stores such examination images, object images, and medical diagnosis or reading information. Information stored in the external server 20 may be utilized as training data, verification data, and test data for training the neural network in the present disclosure. That is, the external server 20 may be a server that stores information about a data set for training the neural network model of the present disclosure.

The computing device 100 of the present disclosure may build a training data set based on a plurality of object images received from the external server 20 and reading information about each object image, and may train a neural network model including one or more network functions with the training data set, thereby generating a classification model for classifying each of one or more objects included in the examination image into one or more predetermined categories.

The external server 20 is a digital device, and may be a digital device equipped with a processor, such as a laptop computer, a notebook computer, a desktop computer, a web pad, and a mobile phone, and having a computing capability with a memory. The external server 20 may be a web server that processes a service. The above-described types of servers are merely examples, and the present disclosure is not limited thereto.

According to an embodiment of the present disclosure, the computing device 100 may acquire an examination image. The examination image may be medical-related image data and may include one or more objects. Here, the medical related image data may refer to image data obtained from a user (ie, an examinee) for medical diagnosis. For example, the medical-related image data may include X-ray, CT, or MRI image data, karyotype image data, blood vessel image data, and genome image data. The one or more objects refer to objects included in medical-related image data, and may relate to a part of the examinee's body for medical diagnosis or reading. For example, the one or more objects may refer to organs, blood vessels, or cells such as liver, heart, uterus, brain, breast, lung, and abdomen.

According to an embodiment of the present disclosure, the computing device 100 may classify an object using a pre-trained classification model. Specifically, the computing device 100 may classify one or more objects by providing an examination image including the one or more objects as an input to the pre-trained classification model. In this case, the pre-trained classification model may be a neural network model that, when an examination image is input, classifies one or more objects included in the examination image into one or more categories. The one or more categories may include, but are not limited to, at least one of, for example, a normal state, a low-risk state, and a high-risk state. This classification model may be pre-trained by the processor 130 through the training data. That is, the pre-trained classification model may be a neural network model that detects an object in an examination image and classifies the detected object into a specific category.

As a specific example, when the first examination image is cell image data including a plurality of cervical cells (i.e., image data related to a Pap smear), the pre-trained classification model may detect each of the plurality of cervical cells in the first examination image and classify each detected cervical cell into at least one of a category related to normal and a category related to abnormal. In this case, the category related to abnormal may mean a category for identifying objects affecting medical diagnosis or reading. That is, an object classified into the abnormal category among the one or more objects included in the examination image may mean an object that serves as a criterion for determining the presence or absence of a disease (or whether an additional examination is to be performed) for the examinee who is the subject of the examination image. The detailed description of the above-described first examination image and of the classification performed by the classification model is only an example, and the present disclosure is not limited thereto.

That is, upon acquiring an examination image including one or more objects, the computing device 100 may detect each of the one or more objects and classify each object into one of one or more categories, in order to identify objects in the examination image that affect the determination of whether the examinee has a disease.

According to an embodiment of the present disclosure, the computing device 100 may provide diagnostic information corresponding to the examination image. In the present disclosure, the diagnostic information corresponding to the examination image means information for reading the examination result of the examinee, and may include at least one of diagnostic information regarding the presence or absence of a disease and prediction information regarding the incidence rate. For example, when the examination image is cell image data related to the diagnosis of cervical cancer, the diagnosis information may include diagnosis information related to whether the examinee has cervical cancer. For another example, when the examination image is X-ray image data related to a chest X-ray, the diagnostic information may include diagnostic information related to whether the examinee has a lung tumor. As another example, when the test image is karyotype image data for karyotype analysis, the diagnostic information may include diagnostic information related to whether the examinee has leukemia. The detailed description of the above-described diagnostic information is only an example, and the present disclosure is not limited thereto.

Specifically, the computing device 100 may generate diagnostic information corresponding to the examination image based on a classification result of each of one or more objects performed through a pre-trained classification model. The computing device 100 may provide an examination image including one or more objects as an input to the pre-trained classification model. In this case, the pre-trained classification model may classify each of the one or more objects included in the examination image into one or more categories, and the one or more categories may include a category related to normal and a category related to abnormal. The computing device 100 may generate diagnostic information based on the number of objects classified into the category related to abnormal. For example, the computing device 100 may generate diagnostic information based on whether the number of objects classified into the abnormality-related category exceeds a predetermined threshold. In this case, the predetermined threshold may be a reference number of abnormal objects that serves as a criterion for determining the presence or absence of a disease. As a specific example, if the number of objects classified into the category related to abnormal through the pre-trained classification model is 10 and the predetermined threshold is 15, the computing device 100 may generate, in response to the examination image, diagnostic information including information that no disease has occurred and information that the incidence rate within 3 years is 60%. The above-described number of classified objects, predetermined threshold, and detailed description of diagnostic information are merely examples, and the present disclosure is not limited thereto.

That is, the computing device 100 may obtain an examination image related to medical-related image data of the examinee, and may provide, in response to the obtained examination image, diagnostic information including diagnostic information related to the presence or absence of a disease and prediction information related to the incidence rate.
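The threshold comparison in the example above can be sketched as follows. The function name, the label strings, and the returned dictionary layout are assumptions for illustration; how the incidence rate is estimated is not described in this passage and is therefore left out of the sketch.

```python
def diagnose_by_threshold(object_categories, threshold, abnormal_label="abnormal"):
    """Flag a suspected disease when abnormal objects exceed the threshold."""
    abnormal_count = sum(1 for c in object_categories if c == abnormal_label)
    return {
        "abnormal_count": abnormal_count,
        # "Exceeds" is read as strictly greater than the threshold.
        "disease_suspected": abnormal_count > threshold,
    }


# Mirrors the example in the text: 10 abnormal objects, threshold of 15.
categories = ["abnormal"] * 10 + ["normal"] * 40
result = diagnose_by_threshold(categories, threshold=15)
```

With 10 abnormal objects and a threshold of 15, `result["disease_suspected"]` is `False`, matching the case in which the diagnostic information indicates that no disease has occurred.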

According to an embodiment of the present disclosure, the computing device 100 may perform one or more similar image searches for an object using a pre-trained search model. Specifically, the computing device 100 may perform one or more similar image searches for an object according to a classification result of the object using the pre-trained search model. In this case, the pre-trained search model may be a neural network model that receives, as an input, an object classified into a specific category (e.g., a category related to abnormal) and retrieves from the image database one or more similar images having similarity to that object. Such a search model may be pre-trained by the processor 130 through training data. That is, the pre-trained search model may be a neural network model that searches for one or more similar images similar to the corresponding object based on the object classified into a specific category.

The one or more similar image searches may be performed through a similarity determination process for images previously stored in an image database. In this case, the image database may store a plurality of object images and medical diagnosis information related to each object image.

As a specific example, the computing device 100 may classify a first object (i.e., one cervical cell) among a plurality of cervical cells included in the first examination image into an abnormal category by using a pre-trained classification model. In this case, the computing device 100 may search for one or more similar images corresponding to the first object by processing the first object classified into the abnormal category as an input of the pre-trained search model. That is, when the pre-trained search model receives the first object as an input, it may search for one or more similar images by determining the similarity between the first object and each of a plurality of objects included in the image database.

That is, the computing device 100 may perform one or more similar image searches for the object according to the classification of the object. In other words, the computing device 100 may search the image database for one or more similar images similar to an object classified into a specific category (i.e., an object influencing the examinee's disease determination).
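The similarity determination against images stored in the image database can be sketched as a nearest-neighbor lookup over feature vectors. This is only an illustration under stated assumptions: cosine similarity is assumed as the similarity measure, every embedding is assumed to be non-zero, and the record fields (image_id, embedding, diagnosis) and their values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_similar(query_vec, database, top_k=2):
    """Return the top_k database records most similar to the query embedding."""
    scored = [
        (cosine_similarity(query_vec, record["embedding"]), record)
        for record in database
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]


# Hypothetical database: each record pairs an embedding with diagnosis information.
image_db = [
    {"image_id": "db-001", "embedding": [0.9, 0.1, 0.0], "diagnosis": "HSIL"},
    {"image_id": "db-002", "embedding": [0.0, 1.0, 0.0], "diagnosis": "normal"},
    {"image_id": "db-003", "embedding": [0.8, 0.2, 0.1], "diagnosis": "LSIL"},
]

# Hypothetical embedding of an object classified into the abnormal category.
query = [1.0, 0.0, 0.0]
results = search_similar(query, image_db, top_k=2)
for record in results:
    print(record["image_id"], record["diagnosis"])
```

Each returned record carries both the similar image identifier and its associated diagnosis information, which mirrors the pairing of object images and medical diagnosis information stored in the image database.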

According to an embodiment of the present disclosure, the computing device 100 may provide a similar image search result. Specifically, the computing device 100 may classify one or more objects into one or more categories by processing an examination image including the one or more objects as an input of the pre-trained classification model, and may perform one or more similar image searches for an object using the pre-trained search model according to the classification result of the objects. In other words, the computing device 100 may use the pre-trained neural network models to detect, from an examination image including one or more objects, a specific object classified into a category related to abnormality, and may provide a similar image search result by searching the image database for one or more similar images having similarity to the detected object. In this case, the search result provided by the computing device 100 may include one or more similar images and examination information corresponding to each similar image.

That is, the computing device 100 may identify objects affecting medical diagnosis or reading from the examination image related to medical-related image data, and may aid the medical diagnosis or reading of a user (e.g., a specialist) by providing one or more similar images similar to each of the corresponding objects together with a diagnosis record corresponding to each similar image.

In an embodiment, the computing device 100 may be a terminal or a server, and may include any type of device. The computing device 100 may be a digital device equipped with a processor and a memory and having computing capability, such as a laptop computer, a notebook computer, a desktop computer, a web pad, or a mobile phone. The computing device 100 may be a web server that processes a service. The types of computing devices described above are merely examples, and the present disclosure is not limited thereto.

According to an embodiment of the present disclosure, the computing device 100 may be a server that provides a cloud computing service. More specifically, the computing device 100 may be a server that provides a cloud computing service, a type of Internet-based computing in which information is processed not by the user's computer but by another computer connected to the Internet. The cloud computing service may be a service that stores data on the Internet and allows the user to use it anytime and anywhere through Internet access, without installing the necessary data or programs on his or her computer, and that makes the stored data easy to share and deliver with a click. In addition, the cloud computing service may be a service that not only stores data on a server on the Internet, but also allows users to perform desired tasks using the functions of applications provided on the web without installing a separate program, and allows multiple people to work on a document while sharing it simultaneously. In addition, the cloud computing service may be implemented in the form of at least one of Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), a virtual machine-based cloud server, and a container-based cloud server. That is, the computing device 100 of the present disclosure may be implemented in the form of at least one of the above-described cloud computing services. The detailed description of the above-described cloud computing service is merely an example, and any platform for building the cloud computing environment of the present disclosure may be included.

The specific configuration and effects of the neural network training method and training process, the method of providing one or more similar images related to the examination image, and the artificial intelligence reading assistance method using the visual information-based reference search technology of the present disclosure will be described later with reference to FIG. 2.

FIG. 2 is a block diagram of a computing device for performing an artificial intelligence reading assistance method using a visual information-based reference search technology according to an embodiment of the present disclosure.

As shown in FIG. 2, the computing device 100 may include a network unit 110, a memory 120, and a processor 130. Components included in the aforementioned computing device 100 are exemplary and the scope of the present disclosure is not limited to the aforementioned components. That is, additional components may be included or some of the above-described components may be omitted according to implementation aspects for the embodiments of the present disclosure.

According to an embodiment of the present disclosure, the computing device 100 may include the network unit 110 for transmitting and receiving data to and from the user terminal 10 and the external server 20. The network unit 110 may transmit and receive, to and from other computing devices, servers, and the like, data for performing the artificial intelligence reading assistance method using a visual information-based reference search technology according to an embodiment of the present disclosure and a training data set for training a neural network model. That is, the network unit 110 may provide a communication function between the computing device 100, the user terminal 10, and the external server 20. For example, the network unit 110 may receive the examination image data from the user terminal 10. As another example, the network unit 110 may receive a training data set for training the classification model or the search model of the present disclosure from the external server 20. Additionally, the network unit 110 may allow information transfer between the computing device 100, the user terminal 10, and the external server 20 by calling a procedure to the computing device 100.

The network unit 110 according to an embodiment of the present disclosure may use a variety of wired communication systems such as Public Switched Telephone Network (PSTN), x Digital Subscriber Line (xDSL), Rate Adaptive DSL (RADSL), Multi Rate DSL (MDSL), Very High Speed DSL (VDSL), Universal Asymmetric DSL (UADSL), High Bit Rate DSL (HDSL), and Local Area Network (LAN).

In addition, the network unit 110 presented herein may use a variety of wireless communication systems such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single Carrier-FDMA (SC-FDMA), and other systems.

In the present disclosure, the network unit 110 may be configured regardless of its communication mode, such as wired or wireless, and may be composed of various communication networks such as a personal area network (PAN) and a wide area network (WAN). In addition, the network may be the well-known World Wide Web (WWW), and may use a wireless transmission technology used for short-range communication, such as Infrared Data Association (IrDA) or Bluetooth. The techniques described herein may be used in the networks mentioned above, as well as in other networks.

According to an embodiment of the present disclosure, the memory 120 may store a computer program for performing the artificial intelligence reading assistance method using the visual information-based reference search technology according to an embodiment of the present disclosure, and the stored computer program may be read and driven by the processor 130. In addition, the memory 120 may store any type of information generated or determined by the processor 130 and any type of information received by the network unit 110. Also, the memory 120 may store information on an examination image including one or more objects. For example, the memory 120 may temporarily or permanently store input/output data (e.g., an examination image, one or more objects included in the examination image, diagnostic information corresponding to each of the one or more objects, and analysis information generated in response to the examination image).

According to an embodiment of the present disclosure, the memory 120 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (e.g., SD or XD memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, and an optical disk. The computing device 100 may operate in relation to a web storage that performs the storage function of the memory 120 on the Internet. The description of the above-described memory is only an example, and the present disclosure is not limited thereto.

According to an embodiment of the present disclosure, the processor 130 may be configured with one or more cores, and may include a processor for data analysis and deep learning, such as a central processing unit (CPU), a general purpose graphics processing unit (GPGPU), or a tensor processing unit (TPU) of a computing device.

The processor 130 may read a computer program stored in the memory 120 to perform data processing for deep learning according to an embodiment of the present disclosure. According to an embodiment of the present disclosure, the processor 130 may perform operations for training the neural network. The processor 130 may perform calculations for training the neural network, such as processing input data for learning in deep learning (DL), extracting features from the input data, calculating an error, and updating the weights of the neural network using backpropagation.

Also, in the processor 130, at least one of a CPU, a GPGPU, and a TPU may process learning of a network function. For example, the CPU and the GPGPU can process learning of a network function and data classification using the network function. Also, in an embodiment of the present disclosure, learning of a network function and data classification using the network function may be processed by using the processors of a plurality of computing devices together. In addition, the computer program executed in the computing device according to an embodiment of the present disclosure may be a CPU, GPGPU or TPU executable program.

In the present specification, a network function may be used interchangeably with an artificial neural network and a neural network. In the present specification, a network function may include one or more neural networks, and in this case, an output of the network function may be an ensemble of outputs of one or more neural networks.

The processor 130 may read a computer program stored in the memory 120 to provide a classification model according to an embodiment of the present disclosure. According to an embodiment of the present disclosure, the processor 130 may generate analysis information corresponding to image data. According to an embodiment of the present disclosure, the processor 130 may perform calculation for training the classification model.

According to an embodiment of the present disclosure, the processor 130 may typically process the overall operation of the computing device 100. The processor 130 may provide or process appropriate information or functions to the user or the user terminal by processing signals, data, information, etc. input or output through the above-described components or by running an application program stored in the memory 120.

According to an embodiment of the present disclosure, the processor 130 may acquire an examination image including one or more objects. Acquisition of the examination image according to an embodiment of the present disclosure may include receiving or loading image data stored in the memory 120 . Also, the image data acquisition may be receiving or loading data from another storage medium, another computing device, or a separate processing module in the same computing device based on a wired/wireless communication means.

The examination image of the present disclosure may refer to medical-related image data obtained from an examinee for medical diagnosis. For example, the medical-related image data may include X-ray, CT, or MRI image data, karyotype image data, blood vessel image data, and genome image data. The one or more objects refer to objects included in medical-related image data, and may relate to a part of the examinee's body for medical diagnosis or reading. For example, the one or more objects may refer to organs, blood vessels, or cells such as liver, heart, uterus, brain, breast, lung, and abdomen.

For a more specific example, the examination image may include a photographed image of the result obtained by smearing the cervical cells of the examinee on a slide and performing necessary processing such as staining for the diagnosis of cervical cancer, and the one or more objects may mean each of a plurality of cells included in the photographed image. The detailed description of the above-described examination image and one or more objects is only an example, and the present disclosure is not limited thereto. That is, according to various embodiments of the present disclosure, the examination image may further include various medical images (e.g., a chest X-ray image, a karyotype-related image, a chromosome image, etc.) obtained from the examinee, and the one or more objects may further include various objects included in each examination image.

In an embodiment, a separate camera module for acquiring an examination image may be provided in the computing device of the present disclosure. In an additional embodiment, auxiliary equipment such as a magnifying glass, a lens, and a microscope attached to or provided in the camera module may be used, and the camera module may take an enlarged image through this.

According to an embodiment of the present disclosure, the processor 130 may perform image pre-processing on the inspection image. The processor 130 may perform the step of resizing the image for each image included in the training data. In an embodiment, the processor 130 may downsize the image after upscaling, and the scaling method and order for the image are not limited thereto. In an embodiment, the processor 130 may obtain images of different resolutions for images through extended convolution in a network-based learning process, and may upscale them to transform them to be the same as the original resolution.

Also, the processor 130 may perform image pre-processing by adjusting the color of the inspection image. In one embodiment, one or more objects included in the image may be stained after smearing. Accordingly, the color of the image can be adjusted so that the colors of the stained cell nucleus, cytoplasm, and cell membrane and other regions can be clearly distinguished. The method of adjusting the color of the image is not limited, but color adjustment using a filter that adjusts brightness or saturation may be performed.
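As a minimal sketch of the color adjustment described above, the following example scales the brightness and saturation of each pixel in HSV space. The text does not specify a particular filter, so the HSV-based scaling, the function names, and the representation of an image as nested lists of RGB tuples are all assumptions made for illustration.

```python
import colorsys


def adjust_pixel(rgb, brightness=1.0, saturation=1.0):
    """Scale brightness (HSV value) and saturation of one RGB pixel in 0..255."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    s = min(1.0, s * saturation)  # clamp so the result stays a valid color
    v = min(1.0, v * brightness)
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(channel * 255) for channel in (r, g, b))


def adjust_image(image, brightness=1.0, saturation=1.0):
    """Apply the same adjustment to every pixel of a row-major image."""
    return [[adjust_pixel(px, brightness, saturation) for px in row] for row in image]


# Hypothetical 1x2 image: one reddish (stained) pixel and one gray pixel.
image = [[(200, 50, 50), (100, 100, 100)]]
brighter = adjust_image(image, brightness=1.2)
print(brighter)
```

Raising saturation widens the color gap between stained regions and the background, while gray pixels (zero saturation) are affected only by the brightness factor.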

According to an embodiment of the present disclosure, the processor 130 may classify each of one or more objects included in the examination image. Specifically, the processor 130 may classify one or more objects included in the examination image by using a pre-learned classification model. The processor 130 may process the inspection image as an input to the pre-learned classification model so that each of one or more objects is classified into one or more categories, respectively.

The classification model may be a neural network model trained to detect one or more objects included in the examination image and classify each object into at least one of one or more categories. The one or more categories may include at least one of a category related to normal and a category related to abnormality. In this case, the category related to the abnormality may mean a category for identifying objects affecting medical diagnosis or reading. That is, an object classified into a category related to abnormality among one or more objects included in the examination image may mean an object that serves as a criterion for determining the presence or absence of a disease related to the subject of the examination image. In an additional embodiment, the category related to the abnormality may be subdivided into at least two categories according to the degree of risk. For example, the category related to the abnormality may be subdivided into a low-risk state, a high-risk state, etc. according to diagnosis accuracy or a degree of predicting the likelihood of an onset.

According to an embodiment of the present disclosure, the processor 130 may perform pre-learning on the classification model through a training data set including a plurality of training data.

To this end, the processor 130 may receive medical-related data from the external server 20 , and may build a learning data set based on the data. Specifically, the processor 130 may build a learning data set based on a plurality of cell image data received from the external server 20 and examination information for each cell image data. In this case, the training data set may include a training input data set and a training output data set. In addition, the processor 130 may build a learning input data set through a plurality of object images related to the inspection image, and may build a learning output data set through reading information for each of the plurality of object images.

In the process of building the learning output data set, the processor 130 may reclassify the reading information for each of the plurality of object images into at least one of one or more predetermined categories. In this case, the number of the one or more categories may be smaller than the number of types of the reading information.

As a specific example, the examination image may be an examination image related to cervical cells, and a plurality of object images included in the examination image may be a plurality of cell images. In general, the reading information related to each of a plurality of cells in a cervical cancer reading process may classify each cell image according to five classification criteria. For example, the five classification criteria for reading cell images may be normal, Atypical Squamous Cells of Undetermined Significance (ASC-US), Low-grade Squamous Intraepithelial Lesion (LSIL), High-grade Squamous Intraepithelial Lesion (HSIL), and Carcinoma. However, a plurality of cell images related to cervical cells may be difficult to use as training data for a neural network because the data is scarce and the images belonging to each classification are unbalanced. That is, because the training data is insufficient or lacks diversity across classifications, the accuracy of the trained neural network may be lowered, or training of the neural network itself may not be possible.

Accordingly, the processor 130 may construct the learning output data by reclassifying the read information for each of the cell images into at least one of one or more predetermined categories.

For example, the processor 130 may reclassify each piece of reading information into at least one of one or more categories based on the number of pieces of reading information belonging to each of the five classifications.

As a specific example, the number of cell images classified as normal may be 2000, the number of cell images classified as ASC-US may be 400, the number of cell images classified as LSIL may be 1700, the number of cell images classified as HSIL may be 1200, and the number of cell images classified as Carcinoma may be 1000. In this case, since the training data classified into the five detailed units is not balanced, the efficiency of learning may be reduced.

The processor 130 of the present disclosure may integrate data related to ASC-US and LSIL and reclassify it into a low-risk category, and may integrate data related to HSIL and Carcinoma and reclassify it into a high-risk category. That is, the processor 130 may reclassify the existing five detailed units into three categories (normal, low risk, and high risk). With the example counts above, this yields 2000 normal, 2100 low-risk, and 2200 high-risk images. In other words, the processor 130 may perform reclassification so that the number of data in each category is balanced. Accordingly, learning for the various classifications is performed in a balanced manner, so that the learning efficiency of the neural network can be improved, and the accuracy of the trained neural network can be improved.
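The effect of the reclassification on class balance can be checked with a short worked example using the counts given above. The mapping dictionary follows the grouping stated in the text; the category keys (low_risk, high_risk) and the imbalance_ratio helper are hypothetical names introduced only for this sketch.

```python
from collections import Counter

# Grouping stated in the text: five reading classes -> three categories.
RECLASSIFY = {
    "normal": "normal",
    "ASC-US": "low_risk",
    "LSIL": "low_risk",
    "HSIL": "high_risk",
    "Carcinoma": "high_risk",
}

# Per-class image counts taken from the example in the text.
original_counts = {
    "normal": 2000,
    "ASC-US": 400,
    "LSIL": 1700,
    "HSIL": 1200,
    "Carcinoma": 1000,
}


def reclassify_counts(counts):
    """Sum the per-class counts into the reclassified categories."""
    merged = Counter()
    for label, n in counts.items():
        merged[RECLASSIFY[label]] += n
    return dict(merged)


def imbalance_ratio(counts):
    """Largest class size divided by smallest class size (1.0 means balanced)."""
    return max(counts.values()) / min(counts.values())


merged_counts = reclassify_counts(original_counts)
print(merged_counts)                     # {'normal': 2000, 'low_risk': 2100, 'high_risk': 2200}
print(imbalance_ratio(original_counts))  # 5.0
print(imbalance_ratio(merged_counts))    # 1.1
```

The ratio between the largest and smallest class drops from 5.0 to 1.1, which is one simple way to quantify the balancing described in the text.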

In addition, because the reclassification performed by the processor 130 in the present disclosure is based on relatively fewer categories than the existing detailed classification, it may make it possible to construct highly useful training data in the medical field, where data is scarce and training data is unbalanced. That is, by reclassifying each sub-unit into one or more categories, which form a relatively coarser classification, in consideration of the number of data included in each sub-unit of the learning output data related to the correct answer, and thereby generating the learning output data, the trained neural network may perform the classification operation with improved accuracy.

In addition, the processor 130 may match and label the training output data set corresponding to each of the training input data sets. That is, through the above-described process, the processor 130 may build a training data set for training the classification model.

The classification model of the present disclosure may include a dimensionality reduction submodel (eg, an encoder) and a dimensionality reconstruction submodel (eg, a decoder). The processor 130 may use the training input data as an input of the dimension reduction sub-model to train the dimension restoration sub-model to output training output data associated with the label of the training input data.

The processor 130 may provide the learning input data related to the object image as an input to the dimension reduction sub-model to output a feature corresponding to the learning input data, and may process the output feature as an input of the dimension restoration sub-model to classify the object image into at least one of one or more categories. The processor 130 may derive an error by comparing the classification result, which is the output of the dimension restoration sub-model, with the learning reclassification information (i.e., the classification related to the correct answer), and may adjust the weights of each model by backpropagation based on the derived error. That is, based on the error between the operation result of the dimension restoration sub-model for the training input data and the learning output data, the processor 130 may adjust the weights of one or more network functions so that the classification result, which is the output of the dimension restoration sub-model, approaches the learning output data.

That is, the dimension reduction sub-model may receive the learning input data related to the object image from the processor 130 and designate a feature related to a specific vector of the learning input data as an output, thereby learning the intermediate process in which the input data is converted into a feature.

In addition, the processor 130 may transfer the embedding (ie, object image feature) related to examination information (ie, information about reclassification) related to the object image from the dimension reduction sub-model to the dimension restoration sub-model. The dimension restoration sub-model may classify the object image into at least one of one or more categories by inputting the features of the object image.
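The training flow described above (input, feature, classification, error, backpropagation) can be sketched on toy data. This is a deliberately small illustration and not the disclosed architecture: a single tanh layer stands in for the dimension reduction sub-model, a single softmax layer stands in for the dimension restoration sub-model, cross-entropy is assumed as the error, and the two-dimensional inputs, layer sizes, and learning rate are hypothetical.

```python
import math
import random


def forward(x, w_enc, w_head):
    """Encoder stand-in (tanh features) followed by a softmax classification head."""
    feat = [math.tanh(sum(x[i] * w_enc[i][j] for i in range(len(x))))
            for j in range(len(w_enc[0]))]
    logits = [sum(feat[j] * w_head[j][k] for j in range(len(feat)))
              for k in range(len(w_head[0]))]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return feat, [e / total for e in exps]


def train_step(x, label, w_enc, w_head, lr):
    """One backpropagation update; returns the cross-entropy loss before the update."""
    feat, probs = forward(x, w_enc, w_head)
    loss = -math.log(probs[label])
    d_logits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    # The feature gradient must use the head weights before they are updated.
    d_feat = [sum(w_head[j][k] * d_logits[k] for k in range(len(d_logits)))
              for j in range(len(feat))]
    for j in range(len(feat)):
        for k in range(len(d_logits)):
            w_head[j][k] -= lr * feat[j] * d_logits[k]
    for j in range(len(feat)):
        d_pre = d_feat[j] * (1.0 - feat[j] ** 2)  # derivative of tanh
        for i in range(len(x)):
            w_enc[i][j] -= lr * x[i] * d_pre
    return loss


def mean_loss(data, w_enc, w_head):
    """Average cross-entropy of the current weights over the data set."""
    return sum(-math.log(forward(x, w_enc, w_head)[1][y]) for x, y in data) / len(data)


rng = random.Random(0)
w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]
w_head = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)]
# Toy inputs standing in for object images; labels 0, 1, 2 stand in for the
# reclassified categories (normal, low risk, high risk).
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([-1.0, -1.0], 2)]

loss_before = mean_loss(data, w_enc, w_head)
for _ in range(200):
    for x, y in data:
        train_step(x, y, w_enc, w_head, lr=0.5)
loss_after = mean_loss(data, w_enc, w_head)
print(loss_before, loss_after)
```

The intermediate list feat plays the role of the embedding that is passed from the first sub-model to the second, and the decreasing loss reflects the classification result approaching the learning output data.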

According to an additional embodiment, when the plurality of objects related to the examination image are cell images, the processor 130 may recognize the cell nucleus and the cytoplasm in each cell image, and may adjust the weights of the classification model based on the area ratio of the recognized cell nucleus and cytoplasm. As a specific example, the processor 130 may calculate the areas of the cell nucleus and the cytoplasm from a cell image included in the examination image, and may adjust the weights of the classification model so that the smaller the difference between the two areas, the higher the probability that the classification model classifies the cell image into a category related to abnormality.
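The area comparison above can be illustrated with a small sketch that counts labeled pixels in a segmentation mask and turns the area difference into a score. The mask encoding (0 for background, 1 for cytoplasm, 2 for nucleus), the function names, and the normalized score are assumptions; the text does not specify how the areas are measured or how the result is applied to the model weights, so that step is not shown.

```python
def count_area(mask, label):
    """Count pixels carrying the given label in a 2D segmentation mask."""
    return sum(row.count(label) for row in mask)


def area_closeness(nucleus_area, cytoplasm_area):
    """Return a score in [0, 1]; 1.0 means the two areas are equal."""
    total = nucleus_area + cytoplasm_area
    if total <= 0:
        raise ValueError("areas must sum to a positive value")
    return 1.0 - abs(nucleus_area - cytoplasm_area) / total


# Hypothetical mask: 0 = background, 1 = cytoplasm, 2 = nucleus.
mask = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 2, 2, 1],
    [0, 1, 1, 0],
]
nucleus = count_area(mask, 2)
cytoplasm = count_area(mask, 1)
score = area_closeness(nucleus, cytoplasm)
print(nucleus, cytoplasm, score)
```

A score closer to 1.0 corresponds to a smaller difference between the two areas, which the text associates with a higher probability of classification into the abnormality-related category.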

Through the above-described learning process, the classification model learned by the processor 130 may detect one or more objects included in the examination image, and classify each detected object into at least one of one or more categories.

According to an embodiment of the present disclosure, the processor 130 may generate diagnostic information corresponding to the examination image based on the classification result of each of the plurality of object images. In the present disclosure, the diagnostic information corresponding to the examination image means information for reading the examination result of the examinee, and may include at least one of diagnostic information regarding the presence or absence of a disease and prediction information regarding the incidence rate. For example, when the examination image is cell image data related to the diagnosis of cervical cancer, the diagnosis information may include diagnosis information related to whether the examinee has cervical cancer. For another example, when the examination image is X-ray image data related to a chest X-ray, the diagnostic information may include diagnostic information related to whether the examinee has a lung tumor. As another example, when the test image is karyotype image data for karyotype analysis, the diagnostic information may include diagnostic information related to whether the examinee has leukemia. The detailed description of the above-described diagnostic information is only an example, and the present disclosure is not limited thereto.

Specifically, the processor 130 may generate diagnostic information corresponding to the examination image based on a classification result of each of one or more objects performed through a pre-trained classification model. The processor 130 may generate diagnostic information based on the number of object images classified into each of one or more categories. In more detail, the processor 130 may process an examination image including one or more objects as an input to the pre-trained classification model. In this case, the pre-trained classification model may classify each of the one or more objects included in the examination image into one or more categories. The one or more categories may include a category related to normal and a category related to abnormality. The processor 130 may generate diagnostic information based on the number of objects classified into the category related to abnormality. For example, the processor 130 may generate diagnostic information based on whether the number of objects classified into an abnormality-related category exceeds a predetermined threshold. In this case, the predetermined threshold may be a reference number of abnormal objects that serves as a criterion for determining the presence or absence of a disease. As a specific example, if the number of objects classified into a category related to abnormality through the pre-trained classification model is 10 and the predetermined threshold is 15, the processor 130 may generate, in response to the examination image, diagnostic information including information that no disease has occurred and information that the incidence rate within 3 years is 30%. The above-described number of classified objects, predetermined threshold, and detailed description of the diagnostic information are merely examples, and the present disclosure is not limited thereto.

That is, the processor 130 may obtain an examination image related to the medical-related image data of the examinee and, in response to the acquired examination image, provide diagnostic information including diagnostic information related to the presence or absence of a disease and prediction information related to the incidence rate.

According to an embodiment of the present disclosure, the processor 130 may perform one or more similar image searches for an object using a pre-trained search model. Specifically, the processor 130 may perform one or more similar image searches for the object according to the classification result of the object using the pre-trained search model. In this case, the pre-trained search model may be a neural network model that takes as input an object classified into a specific category (e.g., a category related to abnormality) and retrieves, from the image database, one or more similar images having similarity to the corresponding object. Such a search model may be pre-trained by the processor 130 through training data.

The search model is a neural network model based on proxy-based metric learning, and may be trained such that the similarity between the target vector and the positive proxy is increased and the similarity between the target vector and the negative proxy is lowered. The proxy may be a vector representative of the embedding vectors used for comparing the similarity between the object and images previously stored in the image database.

For a specific example, the processor 130 may train the search model using training data that includes a feature vector related to a first object as a target vector, a vector similar to the first object as a target similar vector, and a vector related to a second object as a target dissimilar vector. When the search model is trained using such training data, the search model may learn to classify the target vector and the target similar vector into the same group (or cluster), and to classify the target vector and the target dissimilar vector into different groups.

More specifically, the search model is trained to form clusters among similar data in a solution space. The search model is trained such that the target vector is included in one cluster with the target similar vector, and the target dissimilar vector is included in a different cluster from the target vector and target similar vector. Each cluster may be positioned to have a certain distance margin on the solution space of the learned search model.

The search model may receive training data including a target vector, a target similar vector, and a target dissimilar vector, map each item to the solution space, and update the weights of one or more network functions included in the search model so that the data are clustered in the solution space according to the labeled cluster information. That is, the search model may be trained so that the target vector and the target similar vector are close to each other in the solution space, and the target vector and the target dissimilar vector are far apart from each other in the solution space. The search model may be trained using a proxy-based metric cost function. The proxy-based metric cost function aims to separate input data of the same class from negative proxies related to different classes. It requires that the difference between a first distance, from the target vector to a positive proxy representing the input data of the same class, and a second distance, from the positive proxy to the negative proxy, be at least a distance margin, and the method for training the search model may include reducing the first distance to less than or equal to a predetermined ratio of the distance margin. Here, the distance margin may always be positive. To reach the distance margin, the weights of one or more network functions included in the search model may be updated, and the weight update may be performed every iteration or every epoch.
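The two conditions on the first and second distances can be written as a hinge-style penalty, sketched below. This is one possible reading of the cost function rather than the disclosed formula: Euclidean distance, the sign convention (second distance minus first distance compared with the margin), the additive combination of the two penalties, and the parameter names margin and ratio are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def proxy_margin_loss(target, pos_proxy, neg_proxy, margin=1.0, ratio=0.5):
    """Hinge-style penalty for one reading of the two distance conditions."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    first = euclidean(target, pos_proxy)      # target vector to positive proxy
    second = euclidean(pos_proxy, neg_proxy)  # positive proxy to negative proxy
    # Condition 1 (assumed sign): second - first should be at least the margin.
    separation_penalty = max(0.0, margin - (second - first))
    # Condition 2: first should not exceed ratio * margin.
    compactness_penalty = max(0.0, first - ratio * margin)
    return separation_penalty + compactness_penalty


# Target close to its positive proxy and far from the negative proxy: no penalty.
satisfied = proxy_margin_loss([0.0, 0.0], [0.1, 0.0], [3.0, 0.0])
# Target far from its positive proxy, proxies close together: both terms are active.
violated = proxy_margin_loss([0.0, 0.0], [1.0, 0.0], [1.5, 0.0])
print(satisfied, violated)
```

In a training loop, a penalty of this kind would be minimized with respect to the embedding network weights and the proxies, so that the loss reaches zero once both conditions hold.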

A search model trained by proxy-based metric learning may be provided through the above process, and as input data of the search model is classified into clusters, a search for one or more similar images having similarity to a specific object may be performed. Proxy-based metric learning, unlike pair-based or triplet-based learning, compares the embedding of each data item with embeddings that represent a specific class, so the amount of computation in the mini-batch sampling process for learning can be significantly reduced, and thus learning speed and efficiency can be improved.

In addition, since the search model is trained based on representativeness-related information (eg, global information), the influence of outlier data in the learning process (ie, data classified into the same class but tied to a specific sample) can be minimized. This may contribute to the improvement of similar image search ability in medical fields where there are many object images of specific types (eg, cell images of specific types).

According to an embodiment of the present disclosure, the processor 130 may perform one or more similar image searches using characteristic information of the object. In this case, the characteristic information of the object is information that considers the visual characteristics and the medical characteristics of the object together, and may be generated from the visual characteristics of the object and attribute information of the object. For example, the characteristic information may mean a feature vector output by a neural network model in response to input data. That is, the characteristic information may mean an embedding in a vector space related to a specific input. In this case, the neural network model that outputs a feature vector corresponding to a specific input may include one or more neural network models trained to output various feature vectors. For example, the first model may be a neural network model that takes first input data as an input and outputs a visual feature vector corresponding to the first input data, and the second model may be a neural network model that takes the first input data as an input and outputs a medical feature vector corresponding to the first input data. That is, the characteristic information in the present disclosure may include a feature vector related to each of a visual feature and a medical feature.

In more detail, the processor 130 may acquire the visual characteristics of the object by using the first model, which extracts features based on content information included in the image. In this case, the first model may be implemented through a dimension reduction sub-model (eg, an encoder) of the trained classification model. Specifically, the trained classification model may be a model that performs classification on each of one or more objects included in the inspection image, and the dimension reduction sub-model constituting it may receive each of the one or more objects as an input and output a vector corresponding to each object. That is, the dimension reduction sub-model may be a model that outputs a vector related to the visual feature of each object image. In other words, the first model may be implemented through the dimension reduction sub-model of the trained classification model, and may receive an image as an input and output a vector related to a visual feature of the image. For example, when the distance between feature vectors output by the first model is short, it may mean that the corresponding input images are visually similar, and when the distance between the feature vectors is long, it may mean that the corresponding input images are not visually similar. In other words, the processor 130 may acquire a visual feature corresponding to the object image by using the first model. In this case, since the first model for extracting visual features from the image is implemented through a part of the pre-trained classification model, a separate training data construction and training process for implementing the neural network model may be omitted. However, the first model of the present disclosure is not limited to being implemented through the dimension reduction sub-model. That is, according to an embodiment of the present disclosure, a first model for extracting visual features corresponding to an image may be provided through various learning methods.

Also, the processor 130 may obtain attribute information corresponding to the object by using the second model, which calculates a specific attribute corresponding to the image. The attribute information of the object may be information related to the reading result of the object. For example, the attribute information of the object may include information related to a reading result of pneumonia, such as a 94% probability that a specific object (eg, a lung) is abnormal.

In this case, the second model may be a model for calculating probability information of a specific event corresponding to the image. That is, the processor 130 may obtain the attribute information of the object by obtaining a probability value corresponding to the object using the second model. In other words, the attribute information of the object may be obtained based on the probability value calculated through the second model.

According to an embodiment, the second model may be implemented including a pre-trained classification model. The classification model of the present disclosure may be a neural network model trained to detect one or more objects included in the examination image by the processor 130 and to classify each detected object. In this case, the second model may be characterized in that the probability value is calculated based on the classification result. For example, when the number of objects classified into an abnormality-related category exceeds a predetermined threshold, a high probability value corresponding to a specific disease may be calculated. That is, the probability value related to the classification result may be related to the diagnosis result of the examination image including the corresponding objects.

As a specific example, the second model may output a probability value related to each of one or more diagnosis names by receiving a specific examination image as an input. For example, in response to the examination image, the second model may output a first probability value related to pneumonia as 80% and a second probability value related to lung cancer as 6%. In this case, the processor 130 may acquire attribute information related to a reading of pneumonia, which corresponds to the highest value (80%) among the probability values output for the one or more diagnosis names. The detailed description of the above-described diagnosis names and the probability value corresponding to each diagnosis name is merely an example, and the present disclosure is not limited thereto.
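The selection of the attribute information from the per-diagnosis probabilities can be sketched as follows. The function name `top_diagnosis` and the dictionary layout are assumptions made for illustration; the probability values are the example values given above.

```python
def top_diagnosis(probabilities):
    """Return the diagnosis name with the highest probability and that probability."""
    name = max(probabilities, key=probabilities.get)
    return name, probabilities[name]

# Example values from the description: pneumonia 80%, lung cancer 6%.
attribute = top_diagnosis({"pneumonia": 0.80, "lung cancer": 0.06})
```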

Also, the processor 130 may acquire characteristic information about the object by using the visual characteristics of the object and attribute information of the object. Specifically, the processor 130 may acquire characteristic information about the object based on the visual characteristics of the object acquired through the first model and attribute information of the object acquired with the second model. That is, the acquired characteristic information may be information in which the visual characteristics of the object and the medical characteristics of the object are considered together.

Also, the processor 130 may search for a similar image corresponding to the object by using the characteristic information of the object. That is, the processor 130 may search for a similar image corresponding to the object by using the characteristic information of the object in which the visual characteristics of the object and the medical characteristics of the object are considered together.

According to an embodiment, in the process of acquiring the characteristic information, the processor 130 may assign a weight to at least one of the visual characteristics of the object and the attribute information of the object. That is, the processor 130 may perform a search in which at least one of a visual characteristic or a medical characteristic is further reflected in the search process using the characteristic information of the object.

For example, when the processor 130 adds weight to the visual characteristics of the object, the similar image search using the characteristic information may be a search in which the visual characteristics of the object are more reflected than the medical characteristics. As another example, when the processor 130 adds weight to the medical characteristics of the object, the similar image search using the characteristic information may be a search in which the medical characteristics of the object are more reflected than the visual characteristics.

In an embodiment, the processor 130 may determine whether to assign a weight to the visual characteristics of the object or to the attribute information of the object based on the probability value output by the second model. Specifically, the processor 130 may assign a weight to at least one of the visual characteristics and the attribute information based on whether the probability value output by the second model exceeds a predetermined threshold probability value, thereby determining whether the similar image search focuses on visual characteristics or on medical characteristics. When the probability value output by the second model exceeds the predetermined threshold probability value, the processor 130 may determine that the reliability of the attribute information is equal to or greater than a reference value, and assign a weight to the attribute information related to the medical characteristics. In addition, when the probability value output by the second model is less than or equal to the predetermined threshold probability value, the processor 130 may determine that the reliability of the attribute information is somewhat low, and assign a weight to the visual characteristics.
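As a rough illustration of threshold-based weighting, the sketch below scales whichever feature vector is treated as more reliable before concatenating the two. Concatenation as the way of combining the vectors, the `emphasis` factor, the default threshold, and the name `fuse_features` are all assumptions; the disclosure does not specify how the weight is applied.

```python
def fuse_features(visual, medical, probability, threshold=0.5, emphasis=2.0):
    """Concatenate visual and medical feature vectors, emphasizing one of them.

    If the second model's probability exceeds the threshold, the medical
    (attribute) vector is emphasized; otherwise the visual vector is.
    """
    if probability > threshold:
        w_visual, w_medical = 1.0, emphasis
    else:
        w_visual, w_medical = emphasis, 1.0
    return [w_visual * v for v in visual] + [w_medical * m for m in medical]

confident = fuse_features([0.2, 0.4], [0.8], probability=0.94)  # medical emphasized
uncertain = fuse_features([0.2, 0.4], [0.8], probability=0.30)  # visual emphasized
```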

As described above, the similar image search performed in the present disclosure does not simply consider the visual similarity of the image. It is performed using characteristic information that also reflects the attribute information of the object, which relates to the diagnostic information of the object image (that is, characteristic information that enables a search among images related to a specific risk), so it is possible to search for a similar image that is medically closer.

Additionally, accuracy and reliability of similar image search may be improved by assigning weights to further reflect relatively reliable features among visual features or medical features based on the calculated value of attribute information related to medical features.

According to an embodiment of the present disclosure, the processor 130 may provide a similar image search result. Specifically, the processor 130 may classify one or more objects into one or more categories by processing an inspection image including the one or more objects as an input of a pre-trained classification model, and may perform one or more similar image searches for an object using the search model according to the classification result of the object. In other words, the processor 130 may detect a specific object classified into a category related to abnormality from an examination image including one or more objects by using the pre-trained neural network model, and may search the image database for one or more similar images having similarity to the detected object, thereby providing a similar image search result. For example, as shown in FIG. 4, when the examination image is an X-ray image of the lung, one or more similar images 320 may be provided as a similar image search result for the lung image 310, which is the object.

In addition, the search result provided by the processor 130 may include one or more similar images and a diagnostic record corresponding to each similar image.

According to an embodiment, the processor 130 may select and provide an image having a high similarity to the object.

According to an additional embodiment, the processor 130 may select and provide images with high or low similarity to the object. For example, as shown in FIG. 5, when the object image is a chromosome image 410, the processor 130 may provide one or more similar images 420 and one or more dissimilar images 430 corresponding to the chromosome image 410.

In addition, the processor 130 may select and provide an image that has a high similarity to the object but is classified into a category different from that of the object. In this case, the category different from that of the object may mean a category related to a reading different from the reading of the corresponding object. For example, when the first object includes diagnostic information of pneumonia, the processor 130 may perform one or more similar image searches corresponding to the first object among images whose diagnostic information is other than 'pneumonia'.
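A search restricted to categories other than the object's own can be sketched as a filter followed by a sort. The record layout (`id`, `diagnosis`, `similarity`), the sample values, and the function name are hypothetical and serve only to illustrate the idea.

```python
def similar_from_other_categories(query_diagnosis, candidates, top_k=2):
    """Keep the most similar candidates whose diagnosis differs from the query's."""
    others = [c for c in candidates if c["diagnosis"] != query_diagnosis]
    return sorted(others, key=lambda c: c["similarity"], reverse=True)[:top_k]

# Hypothetical candidates with precomputed similarity scores.
candidates = [
    {"id": "img-1", "diagnosis": "pneumonia", "similarity": 0.95},
    {"id": "img-2", "diagnosis": "pulmonary tuberculosis", "similarity": 0.90},
    {"id": "img-3", "diagnosis": "lung cancer", "similarity": 0.40},
    {"id": "img-4", "diagnosis": "pulmonary tuberculosis", "similarity": 0.85},
]
diverse = similar_from_other_categories("pneumonia", candidates)
```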

That is, the processor 130 may prevent the provided information from becoming uniform, as it would if the similar image search were confined to a specific category. In other words, it is possible to secure the diversity of similar images provided in response to the object.

Accordingly, the processor 130 identifies objects affecting medical diagnosis or reading in the examination image related to the medical image data, and provides one or more similar images similar to each of the corresponding objects together with a diagnosis record corresponding to each similar image, thereby assisting a user (eg, a specialist) in medical diagnosis or reading.

In other words, by presenting the AI reading results of similar images together with the reading findings stored in the DB, it is possible to provide information that improves reading reliability even for modalities and diagnosis names where there are many diagnostic discrepancies between specialists.

According to an embodiment, the processor 130 may update the diagnosis information based on the examination information matched to each of one or more similar images. Specifically, the processor 130 may generate diagnostic information corresponding to the examination image. The diagnostic information may be generated based on the number of objects classified into categories related to abnormalities, and may include diagnostic information related to the presence or absence of a disease and prediction information related to an incidence rate.

In addition, the processor 130 detects a specific object classified into an abnormality-related category from an examination image including one or more objects by using the pre-trained neural network model, and detects one or more similarities having similarity to the object detected from the image database. By searching for an image, it is possible to provide a similar image search result. In this case, the search result provided by the processor 130 may include one or more similar images and examination information corresponding to each similar image.

In this case, the processor 130 may update the diagnosis information based on the examination information corresponding to one or more similar images. Updating the diagnosis information may mean, for example, reflecting at least a part of the information included in the examination information in the diagnosis information. Specifically, updating the diagnostic information may mean that, when the examination information corresponding to one or more similar images includes first read information whose content differs from the diagnostic information, the first read information is reflected in the diagnostic information. As a specific example, suppose that the diagnostic information generated by the processor 130 in response to the examination image includes read information corresponding to pneumonia, the examination information corresponding to the first similar image includes read information corresponding to pulmonary tuberculosis, and the examination information corresponding to the second similar image also includes read information corresponding to pulmonary tuberculosis. In that case, the processor 130 may, based on 'pulmonary tuberculosis', which is the examination information matched to the first similar image and the second similar image, update the diagnostic information that included only read information related to 'pneumonia' to 'Pneumonia or pulmonary tuberculosis is suspected'. The detailed description of the update of the diagnostic information described above is only an example, and the present disclosure is not limited thereto.
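The update in the pneumonia and pulmonary tuberculosis example can be sketched as merging findings that are not yet present. Representing the diagnostic information as a list of finding names, and the helper names `update_diagnosis` and `describe`, are assumptions for illustration.

```python
def update_diagnosis(findings, similar_image_findings):
    """Add findings from similar images that the current diagnosis does not contain."""
    updated = list(findings)
    for finding in similar_image_findings:
        if finding not in updated:
            updated.append(finding)
    return updated

def describe(findings):
    """Render the findings as a short reading note."""
    text = " or ".join(findings) + " is suspected"
    return text[0].upper() + text[1:]

# Diagnosis from the classification model, plus findings matched to two similar images.
updated = update_diagnosis(
    ["pneumonia"], ["pulmonary tuberculosis", "pulmonary tuberculosis"]
)
note = describe(updated)
```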

That is, the processor 130 may update the diagnosis information based on the examination information matched to the similar image related to the object. When the diagnostic information is updated, since additional information not included in the existing diagnostic information may be provided, medical assistance may be performed through various types of read information.

According to an embodiment of the present disclosure, the examination image may include a plurality of cell images. For example, the cell image may be a cervical cell image. Also, the processor 130 may classify each of the plurality of cell images into one or more categories. In this case, the one or more categories may include at least one of a normal state, a low-risk state, and a high-risk state. Specifically, the processor 130 may process the examination image as an input to the classification model to detect a plurality of cell images, and classify each detected cell image into at least one of a normal state, a low risk state, and a high risk state. That is, as shown in FIG. 3 , the processor 130 may detect each of a plurality of cell images included in the examination image and classify each cell image into at least one of three categories.

Also, the processor 130 may generate diagnostic information corresponding to the examination image based on the classification result for each of the plurality of cell images. The diagnostic information may be generated based on the number of cell images classified into categories related to abnormalities, and may include diagnostic information related to the presence or absence of a disease and predictive information related to an incidence rate.
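A count-based summary of the per-cell classification results might look like the following sketch. The category labels, the `abnormal_threshold` value, and the shape of the returned report are assumptions; the disclosure only states that the diagnostic information is based on the number of cell images classified into abnormality-related categories.

```python
from collections import Counter

# Assumed labels for the abnormality-related categories.
ABNORMAL = ("low-risk", "high-risk")

def summarize_cells(labels, abnormal_threshold=2):
    """Count abnormal cell images and flag the examination image if the count is high."""
    counts = Counter(labels)
    abnormal = sum(counts[category] for category in ABNORMAL)
    return {
        "total_cells": len(labels),
        "abnormal_cells": abnormal,
        "abnormality_suspected": abnormal > abnormal_threshold,
    }

report = summarize_cells(["normal", "low-risk", "high-risk", "normal", "high-risk"])
```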

Also, the processor 130 may perform a similar image search for at least some of the plurality of cell images. Specifically, the processor 130 detects each cell image from the examination image including one or more cell images by using the pre-trained neural network model, and searches for one or more similar images having similarity to the detected cell image from the image database. By doing so, it is possible to provide similar image search results. In this case, the search result provided by the processor 130 may include one or more similar images and examination information corresponding to each similar image. As a specific example, referring to FIG. 3 , the processor 130 may detect one or more cell images 210 from the examination image. Also, the processor 130 may provide the search result 220 for one or more similar images corresponding to each of the one or more cell images 210 . That is, as shown in FIG. 3 , cells having a high degree of similarity may be sorted and displayed corresponding to each cell image. In this case, the degree of similarity between each cell image and one or more similar images may be displayed together.

According to an embodiment of the present disclosure, the computing device 100 may provide one or more similar images in response to the examination image 501. According to an embodiment, the provision of one or more similar images may be for verification of, or medical assistance with, a result (ie, diagnostic information) obtained through a classification model. For example, the provision of one or more similar images may be for verifying whether diagnostic information corresponding to an examination image obtained through a classification model is appropriate. That is, by providing one or more similar images so that an existing medical record from a similar situation can be obtained, it is possible to medically verify whether the diagnostic information obtained through the classification model of the present disclosure is appropriate, or to assist a diagnosis corresponding to the diagnostic information.

Specifically, the computing device 100 may search for one or more similar images by using characteristic information of each object included in the examination image 501. The characteristic information of the object may be information generated based on the visual characteristic 511 of the object and the attribute information 521 of the object. For example, the characteristic information may mean a feature vector output by a neural network model in response to input data. That is, the characteristic information may mean an embedding in a vector space related to a specific input. In this case, the neural network model that outputs a feature vector corresponding to a specific input may include one or more neural network models trained to output various feature vectors. For example, the first model may be a neural network model that takes first input data as an input and outputs a visual feature vector corresponding to the first input data, and the second model may be a neural network model that takes the first input data as an input and outputs a medical feature vector corresponding to the first input data. That is, the characteristic information in the present disclosure may include a feature vector related to each of a visual feature and a medical feature.

In more detail, the computing device 100 may acquire the visual feature 511 by processing the examination image 501 as an input of the first model 510. In this case, the first model 510 may be implemented through a dimension reduction sub-model (eg, an encoder) of the trained classification model. The trained classification model may be a model that performs classification on each of one or more objects included in the inspection image, and the dimension reduction sub-model constituting it may receive each of the one or more objects as an input and output a vector corresponding to each of the one or more objects. That is, the dimension reduction sub-model may be a model that outputs a vector related to the visual feature of each object image. In other words, the first model 510 may be implemented through the dimension reduction sub-model of the trained classification model, and may receive an image as an input and output a vector related to a visual feature of the image. That is, the computing device 100 may acquire the visual feature 511 corresponding to the object by using the first model 510. In this case, as the first model 510 for extracting the visual feature 511 from the image is implemented through a part of the pre-trained classification model, a separate training data construction and training process for implementing the neural network model may be omitted.

Also, the computing device 100 may acquire attribute information 521 corresponding to the object by using the second model 520, which outputs a specific attribute corresponding to the image. The attribute information of the object may be information related to the reading result of the object. In this case, the second model 520 may be a model for calculating probability information of a specific event corresponding to the image. The computing device 100 may obtain the attribute information 521 of the object by obtaining a probability value corresponding to the object by using the second model 520. In this case, the second model 520 may be implemented including a pre-trained classification model. The classification model of the present disclosure is a neural network model trained to detect one or more objects included in an examination image and to classify each detected object, and the computing device 100 may use it to calculate a probability value based on the classification result. For example, when the number of objects classified into an abnormality-related category exceeds a predetermined threshold, a high probability value corresponding to a specific disease may be calculated. That is, the probability value related to the classification result may be related to the diagnosis result of the examination image including the corresponding objects. Accordingly, the second model 520 implemented through the classification model may be characterized in that a probability value is calculated based on the classification result.

As a specific example, the second model 520 may output a probability value related to each of one or more diagnosis names by receiving the examination image 501 as an input. For example, in response to the examination image 501 , the second model 520 may output a first probability value related to pneumonia as 80% and calculate a second probability value related to lung cancer as 6%. In this case, the computing device 100 may acquire the attribute information 521 related to the reading of pneumonia corresponding to the highest 80% of the probability values output in response to each of one or more diagnosis names. The detailed description of the above-described diagnosis name and probability value corresponding to each diagnosis name is merely an example, and the present disclosure is not limited thereto.

Also, the computing device 100 may obtain the characteristic information 530 of the object by using the visual characteristic 511 of the object and the attribute information 521 of the object. Specifically, the computing device 100 determines the characteristic of the object based on the visual feature 511 of the object obtained through the first model 510 and the attribute information 521 of the object obtained through the second model 520 . Information 530 may be obtained. That is, the acquired characteristic information 530 may be information in which the visual characteristics of the object and the medical characteristics of the object are considered together.

Also, the computing device 100 may search for one or more similar images by using the characteristic information 530 of the object. The computing device 100 may search for one or more similar images having a similarity to the characteristic information 530. In this case, the computing device 100 may calculate (540) the similarity based on the cosine similarity between the feature vectors of the respective pieces of information. The cosine similarity may mean a similarity between two vectors obtained from the cosine of the angle between the two vectors. For example, it may have a value of 1 if the directions of the two vectors are exactly the same, 0 if they form an angle of 90 degrees, and -1 if they point in opposite directions at an angle of 180 degrees. That is, the cosine similarity may have a value of -1 or more and 1 or less, and it may be determined that the similarity is higher as the value is closer to 1.

As a specific example, as shown in FIG. 6, the computing device 100 may calculate the degrees of similarity between the characteristic information 530 and the first image 541, the second image 542, and the n-th image 54n as 0.937, 0.265, and 0.717, respectively. In this case, the computing device 100 may sort the images by the calculated similarity and provide them as one or more similar images, or may provide only the images whose similarity is at or above a certain reference value (eg, excluding the second image, which has a similarity of 0.265) as the one or more similar images.
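The cosine-similarity ranking and the reference-value cut-off can be sketched in plain Python as follows. The function names, the `min_similarity` parameter, and the toy vectors are assumptions for illustration; they are not the vectors behind the values shown in FIG. 6.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_similar(query, database, min_similarity=0.5):
    """Score every stored vector, drop low scores, and sort from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    kept = [item for item in scored if item[1] >= min_similarity]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Toy feature vectors standing in for stored image embeddings.
database = {"first": [1.0, 0.1], "second": [-0.2, 1.0], "third": [0.7, 0.7]}
ranking = rank_similar([1.0, 0.0], database)
```

Here the image named `second` falls below the reference value and is dropped, while the remaining images are returned in descending order of similarity.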

That is, the computing device 100 may provide one or more similar images by performing one or more similar image searches using the characteristic information 530 of the object, in which the visual and medical characteristics corresponding to the examination image are considered together. In this case, examination information may be matched to each of the one or more similar images. That is, by acquiring an existing medical record for an image similar to the examination image, it is possible to verify whether the diagnostic information obtained through the classification model of the present disclosure is appropriate.

In other words, for verification of, or medical assistance with, a result obtained through the classification model (ie, diagnostic information), a search for one or more similar images corresponding to the examination image may be performed on the image database. In this case, the first model 510 related to extraction of visual features and the second model 520 related to extraction of medical features may be utilized to search for one or more similar images. In an embodiment, as the first model 510 and the second model 520 may be implemented through a trained classification model, a separate training data construction and training process for implementing each model may be omitted.

FIG. 7 is a flowchart exemplarily showing steps for performing an artificial intelligence reading assistance method using a visual information-based reference search technology related to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the method may include acquiring an examination image including one or more objects ( 610 ).

According to an embodiment of the present disclosure, the method may include performing classification on an object using a pre-learned classification model ( 620 ).

According to an embodiment of the present disclosure, the method may include performing a search for one or more similar images for an object using a pre-learned search model according to the classification result of the object ( 630 ).

According to an embodiment of the present disclosure, the method may include providing a similar image search result ( 640 ).

The order of the steps illustrated in FIG. 7 described above may be changed if necessary, and at least one or more steps may be omitted or added. That is, the above-described steps are merely an embodiment of the present disclosure, and the scope of the present disclosure is not limited thereto.
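The flow of steps 610 to 640 can be sketched as a small function that accepts the classification and search models as callables. The function name `assist_reading`, the category labels, the choice to search only for objects in abnormality-related categories, and the stand-in lambdas are all assumptions for illustration; object detection is assumed to have already produced the list of objects.

```python
def assist_reading(objects, classify, search,
                   abnormal_categories=("low-risk", "high-risk")):
    """Classify each object and search similar images according to the result."""
    results = []
    for obj in objects:                       # 610: objects from the acquired image
        category = classify(obj)              # 620: classification model
        if category in abnormal_categories:   # 630: search depends on the class
            similar = search(obj)
        else:
            similar = []
        results.append(
            {"object": obj, "category": category, "similar_images": similar}
        )
    return results                            # 640: search result to be provided

# Stand-in callables, used here only to make the sketch runnable.
demo = assist_reading(
    ["cell-a", "cell-b"],
    classify=lambda obj: "high-risk" if obj == "cell-b" else "normal",
    search=lambda obj: ["ref-1", "ref-2"],
)
```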

Throughout this specification, computational model, neural network, and network function may be used interchangeably. A neural network may be composed of a set of interconnected computational units, which may generally be referred to as “nodes”. These “nodes” may also be referred to as “neurons”. A neural network is configured to include at least one node. Nodes (or neurons) constituting neural networks may be interconnected by one or more “links”.

In the neural network, one or more nodes connected through a link may relatively form a relationship between an input node and an output node. The concepts of an input node and an output node are relative, and any node in an output node relationship with respect to one node may be in an input node relationship in a relationship with another node, and vice versa. As described above, an input node-to-output node relationship may be created around a link. One or more output nodes may be connected to one input node through a link, and vice versa.

In the relationship between the input node and the output node connected through one link, the value of the output node may be determined based on data input to the input node. Here, a link interconnecting the input node and the output node may have a weight. The weight may be variable, and may be changed by a user or an algorithm in order for the neural network to perform a desired function. For example, when one or more input nodes are interconnected to one output node by respective links, the value of the output node may be determined based on the values input to the input nodes connected to the output node and the weights set on the links corresponding to the respective input nodes.
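A common way to determine an output node value from input values and link weights is a weighted sum, optionally followed by an activation function. The sketch below assumes that form; the function name and the optional `activation` parameter are illustrative and not taken from the disclosure.

```python
def output_node_value(inputs, weights, activation=None):
    """Weighted sum of input node values, optionally passed through an activation."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return activation(total) if activation is not None else total

raw = output_node_value([1.0, 2.0], [0.5, -1.0])
rectified = output_node_value([1.0, 2.0], [0.5, -1.0], activation=lambda t: max(0.0, t))
```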

As described above, in a neural network, one or more nodes are interconnected through one or more links to form input node and output node relationships within the neural network. The characteristics of the neural network may be determined according to the number of nodes and links in the neural network, the relationships between the nodes and the links, and the value of the weight assigned to each of the links. For example, when two neural networks have the same number of nodes and links but different weight values on the links, the two neural networks may be recognized as different from each other.

A neural network may include one or more nodes. Some of the nodes constituting the neural network may form one layer based on their distance from the initial input node. For example, the set of nodes at a distance of n from the initial input node may constitute the n-th layer. The distance from the initial input node may be defined as the minimum number of links that must be traversed to reach the corresponding node from the initial input node. However, this definition of a layer is arbitrary and given for purposes of description, and the order of a layer in the neural network may be defined in a different way from the above. For example, a layer of nodes may be defined by the distance from the final output node.
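As a minimal sketch of the layer definition above, the following groups nodes by the minimum number of links traversed from the initial input nodes, using a breadth-first search. The adjacency-list representation, the node names, and the function name `layers_by_distance` are hypothetical and are not taken from the disclosure.

```python
from collections import deque
from typing import Dict, List


def layers_by_distance(links: Dict[str, List[str]],
                       initial_inputs: List[str]) -> Dict[int, List[str]]:
    """Group nodes into layers by minimum link count from the initial input nodes."""
    distance = {node: 0 for node in initial_inputs}
    queue = deque(initial_inputs)
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, []):
            # Breadth-first order guarantees the first visit is the shortest path.
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    layers: Dict[int, List[str]] = {}
    for node, d in distance.items():
        layers.setdefault(d, []).append(node)
    return layers


# Hypothetical network: two input nodes, two hidden nodes, one output node.
links = {"x1": ["h1", "h2"], "x2": ["h1", "h2"], "h1": ["y"], "h2": ["y"]}
layers = layers_by_distance(links, ["x1", "x2"])
```

Defining layers by distance from the final output node instead would amount to running the same search over the reversed links.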

The initial input node may mean one or more nodes, among the nodes in the neural network, to which data is directly input without passing through a link in a relationship with other nodes. Alternatively, in the link-based relationships between nodes in a neural network, it may mean nodes that have no other input nodes connected to them by a link. Similarly, the final output node may mean one or more nodes, among the nodes in the neural network, that have no output node in relation to other nodes. A hidden node may mean a node constituting the neural network other than the initial input node and the final output node. The neural network according to an embodiment of the present disclosure may be a neural network in which the number of nodes in the input layer is the same as the number of nodes in the output layer, and the number of nodes decreases and then increases again as the layers progress from the input layer to the hidden layer. In a neural network according to another embodiment of the present disclosure, the number of nodes in the input layer may be less than the number of nodes in the output layer, and the number of nodes may decrease as the layers progress from the input layer to the hidden layer. In a neural network according to still another embodiment of the present disclosure, the number of nodes in the input layer may be greater than the number of nodes in the output layer, and the number of nodes may increase as the layers progress from the input layer to the hidden layer. A neural network according to yet another embodiment of the present disclosure may be a combination of the aforementioned neural networks.

A deep neural network (DNN) may refer to a neural network that includes a plurality of hidden layers in addition to an input layer and an output layer. Deep neural networks can be used to identify the latent structures of data. In other words, they can identify the latent structure of photos, text, video, voice, and music (e.g., what objects are in a photo, what the content and emotion of a text are, etc.). Deep neural networks may include convolutional neural networks (CNNs), recurrent neural networks (RNNs), autoencoders, generative adversarial networks (GANs), restricted Boltzmann machines (RBMs), deep belief networks (DBNs), Q networks, U networks, Siamese networks, and the like. The description of the deep neural network above is only an example, and the present disclosure is not limited thereto.

The neural network may be trained by at least one of supervised learning, unsupervised learning, and semi-supervised learning. The training of the neural network aims to minimize the error in its output. Training is a process of iteratively inputting training data into the neural network, calculating the error between the output of the neural network and the target for the training data, and backpropagating that error from the output layer toward the input layer of the neural network so as to update the weight of each node in a direction that reduces the error. In supervised learning, training data in which each item is labeled with the correct answer (i.e., labeled training data) is used, whereas in unsupervised learning the correct answer may not be labeled in each item of training data. For example, training data for supervised learning related to data classification may be data in which each item is labeled with a category. The labeled training data is input to the neural network, and an error can be calculated by comparing the output (category) of the neural network with the label of the training data. As another example, in unsupervised learning related to data classification, an error may be calculated by comparing the input training data with the output of the neural network. The calculated error is backpropagated in the reverse direction (i.e., from the output layer toward the input layer) in the neural network, and the connection weight of each node of each layer of the neural network may be updated according to the backpropagation. The amount by which each connection weight is updated may be determined according to a learning rate. One computation of the neural network on the input data together with the backpropagation of the error can constitute a learning cycle (epoch).
The learning rate may be applied differently according to the number of repetitions of the learning cycle of the neural network. For example, in the early stage of learning of a neural network, a high learning rate can be used to enable the neural network to quickly obtain a certain level of performance, thereby increasing efficiency, and using a low learning rate at a later stage of learning can increase accuracy.
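The learning cycle and the learning-rate schedule described above can be sketched, under simplifying assumptions, with a single-weight model. The model `y = w * x`, the mean squared error, the exponential decay factor, and all names below are assumptions chosen for brevity; the analytically computed gradient stands in for backpropagation through a multi-layer network.

```python
from typing import List, Tuple


def train(xs: List[float], ys: List[float], epochs: int = 50,
          initial_lr: float = 0.1, decay: float = 0.95) -> Tuple[float, List[float]]:
    """Fit y = w * x by gradient descent with a learning rate that decays per epoch."""
    w = 0.0
    history = []
    for epoch in range(epochs):
        # High learning rate early for fast progress, lower later for accuracy.
        lr = initial_lr * (decay ** epoch)
        # Forward computation: outputs and their errors against the targets.
        errors = [w * x - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / len(xs)
        # Backward computation: gradient of the mean squared error w.r.t. w.
        grad = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        # Weight update in the direction that reduces the error.
        w -= lr * grad
        history.append(loss)
    return w, history


# Labeled training data generated from the target relation y = 2x.
w, history = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Each pass through the loop corresponds to one learning cycle (epoch), and `history` records the error before each update so that its decrease can be inspected.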

In the training of a neural network, the training data is generally a subset of real data (that is, the data to be processed using the trained neural network), so there may be learning cycles in which the error on the training data decreases while the error on the real data increases. Overfitting is a phenomenon in which errors on real data increase because of such excessive learning on the training data. For example, a neural network that has learned what a cat is from yellow cats and then fails to recognize a cat of another color as a cat exhibits a type of overfitting. Overfitting can increase the error of a machine learning algorithm. Various optimization methods can be used to prevent overfitting, such as increasing the training data, regularization, or dropout, in which some of the nodes of the network are omitted during training.
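Of the methods named above, dropout can be sketched as follows. This is an illustrative example only: it assumes the common "inverted dropout" variant, in which the retained activations are rescaled during training, and the function name, the drop probability, and the fixed random seed are assumptions not stated in the disclosure.

```python
import random
from typing import List


def dropout(activations: List[float], drop_prob: float,
            training: bool, rng: random.Random) -> List[float]:
    """Zero out a random subset of node activations during training."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training:
        # At inference time every node is kept unchanged.
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # Kept activations are scaled by 1 / keep_prob so the expected sum is preserved.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the example is repeatable
hidden = [1.0] * 1000
dropped = dropout(hidden, 0.5, True, rng)
unchanged = dropout(hidden, 0.5, False, rng)
```

Because a different subset of nodes is omitted on each training pass, the network cannot rely on any single node, which is one reason dropout tends to reduce overfitting.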

Throughout this specification, the terms computational model, neural network, and network function may be used interchangeably (hereinafter, they are referred to uniformly as a neural network). The data structure may include a neural network, and the data structure including the neural network may be stored in a computer-readable medium. The data structure including the neural network may also include data input to the neural network, weights of the neural network, hyperparameters of the neural network, data obtained from the neural network, an activation function associated with each node or layer of the neural network, and a loss function for training the neural network. The data structure including the neural network may include any of the components disclosed above. That is, it may be configured to include all of, or any combination of, the data input to the neural network, the weights of the neural network, the hyperparameters of the neural network, the data obtained from the neural network, the activation function associated with each node or layer of the neural network, and the loss function for training the neural network. In addition to the above-described components, the data structure including the neural network may include any other information that determines a characteristic of the neural network. The data structure may also include all types of data used or generated in the operation of the neural network, and is not limited to the above. Computer-readable media may include computer-readable recording media and/or computer-readable transmission media. A neural network may be composed of a set of interconnected computational units, which may generally be referred to as nodes. These nodes may also be referred to as neurons. A neural network includes at least one node.

The data structure may include data input to the neural network. A data structure including data input to the neural network may be stored in a computer-readable medium. The data input to the neural network may include training data input during training of the neural network and/or input data input to the trained neural network. The data input to the neural network may include data that has undergone pre-processing and/or data to be pre-processed. The pre-processing may include a data processing process for inputting data into the neural network. Accordingly, the data structure may include data to be pre-processed and data generated by pre-processing. The above-described data structure is merely an example, and the present disclosure is not limited thereto.

The data structure may include the weights of the neural network (in this specification, weight and parameter may be used interchangeably). The data structure including the weights of the neural network may be stored in a computer-readable medium. The neural network may include a plurality of weights. The weights may be variable, and may be changed by a user or an algorithm in order for the neural network to perform a desired function. For example, when one or more input nodes are each connected to one output node by a respective link, the value of the output node may be determined based on the values input to the input nodes connected to the output node and the parameters set on the links corresponding to the respective input nodes. The above-described data structure is merely an example, and the present disclosure is not limited thereto.

By way of example and not limitation, the weights may include weights that vary during the neural network training process and/or weights for which training has been completed. The weights that vary during training may include the weights at the time a learning cycle starts and/or the weights that vary during the learning cycle. The weights for which training has been completed may include the weights at the completion of a learning cycle. Accordingly, the data structure including the weights of the neural network may include a data structure including the weights that vary during training and/or the weights for which training has been completed. Therefore, the above-described weights and/or combinations of weights are understood to be included in the data structure including the weights of the neural network. The above-described data structure is merely an example, and the present disclosure is not limited thereto.

The data structure including the weights of the neural network may be stored in a computer-readable storage medium (e.g., memory, hard disk) after being serialized. Serialization may be the process of converting a data structure into a form that can be stored on the same or a different computing device and later reconstructed and used. The computing device may serialize the data structure to send and receive data over a network. The serialized data structure including the weights of the neural network may be reconstructed on the same computing device or on another computing device through deserialization. The data structure including the weights of the neural network is not limited to serialization. Furthermore, the data structure including the weights of the neural network may include a data structure for increasing computational efficiency while minimizing the use of computing device resources (e.g., a B-Tree, Trie, m-way search tree, AVL tree, or Red-Black Tree). The foregoing is merely an example, and the present disclosure is not limited thereto.
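As a minimal sketch of the serialization and deserialization described above, the following converts a weight structure to bytes and reconstructs it. JSON is used here only as one possible serialization format; the disclosure does not name a format, and the layer names and weight values are hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical layout: one weight matrix (list of rows) per named layer.
Weights = Dict[str, List[List[float]]]


def serialize_weights(weights: Weights) -> bytes:
    """Convert the weight structure into bytes that can be stored or transmitted."""
    return json.dumps(weights, sort_keys=True).encode("utf-8")


def deserialize_weights(payload: bytes) -> Weights:
    """Reconstruct the weight structure on the same or another computing device."""
    return json.loads(payload.decode("utf-8"))


weights = {"layer1": [[0.1, -0.2], [0.3, 0.4]], "layer2": [[0.5], [-0.6]]}
payload = serialize_weights(weights)
restored = deserialize_weights(payload)
```

The byte string `payload` can be written to a storage medium or sent over a network, and `restored` is equal to the original structure after deserialization.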

The data structure may include hyperparameters of the neural network. In addition, the data structure including the hyperparameters of the neural network may be stored in a computer-readable medium. A hyperparameter may be a variable that can be changed by a user. Hyperparameters may include, for example, the learning rate, the cost function, the number of iterations of the learning cycle, weight initialization (e.g., setting the range of weight values to be initialized), and the number of hidden units (e.g., the number of hidden layers, the number of nodes in each hidden layer). The above-described data structure is merely an example, and the present disclosure is not limited thereto.
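The hyperparameters listed above could, for example, be held in a structure such as the following. The field names and every default value are hypothetical placeholders chosen for this sketch and are not values given in the disclosure.

```python
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Hyperparameters:
    """User-adjustable settings that are fixed before training starts."""
    learning_rate: float = 0.01
    cost_function: str = "cross_entropy"
    num_epochs: int = 100                                   # learning-cycle iterations
    weight_init_range: Tuple[float, float] = (-0.05, 0.05)  # range for initial weights
    hidden_layer_sizes: Tuple[int, ...] = (64, 32)          # nodes per hidden layer

    @property
    def num_hidden_layers(self) -> int:
        return len(self.hidden_layer_sizes)


# A user overrides only the settings they want to change.
params = Hyperparameters(learning_rate=0.001, hidden_layer_sizes=(128, 64, 32))
as_record = asdict(params)  # plain dict, convenient for storing with the weights
```

Keeping the hyperparameters in one immutable record makes it straightforward to store them in the same data structure as the weights they were used to train.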

Steps of a method or algorithm described in relation to an embodiment of the present disclosure may be implemented directly in hardware, as a software module executed by hardware, or by a combination thereof. A software module may reside in random access memory (RAM), read-only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other type of computer-readable recording medium well known in the art to which the present disclosure pertains.

Components of the present disclosure may be implemented as a program (or application) to be executed in combination with a computer, which is hardware, and stored in a medium. Components of the present disclosure may be implemented as software programming or software components; similarly, embodiments may include various algorithms implemented as any combination of data structures, processes, routines, or other programming constructs, and may be implemented in a programming or scripting language such as C, C++, Java, or assembler. Functional aspects may be implemented as an algorithm running on one or more processors.

Those of ordinary skill in the art of the present disclosure will recognize that the various illustrative logical blocks, modules, processors, means, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, as various forms of program or design code (referred to herein, for convenience, as "software"), or as a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the overall system. A person skilled in the art of the present disclosure may implement the described functionality in various ways for each specific application, but such implementation decisions should not be interpreted as a departure from the scope of the present disclosure.

The various embodiments presented herein may be implemented as methods, apparatus, or articles of manufacture using standard programming and/or engineering techniques. The term “article of manufacture” includes a computer program, carrier, or media accessible from any computer-readable device. For example, computer-readable media include magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips, etc.), optical disks (e.g., CDs, DVDs, etc.), smart cards, and flash memory devices (e.g., EEPROMs, cards, sticks, key drives, etc.). Also, various storage media presented herein include one or more devices and/or other machine-readable media for storing information. The term “machine-readable medium” includes, but is not limited to, wireless channels and various other media that can store, hold, and/or convey instruction(s) and/or data.

It is to be understood that the specific order or hierarchy of steps in the presented processes is an example of exemplary approaches. Based on design priorities, it is to be understood that the specific order or hierarchy of steps in the processes may be rearranged within the scope of the present disclosure. The appended method claims present elements of the various steps in a sample order, but are not meant to be limited to the specific order or hierarchy presented.

The description of the presented embodiments is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the scope of the present disclosure. Thus, the present disclosure is not intended to be limited to the embodiments presented herein, but is to be construed in the widest scope consistent with the principles and novel features presented herein.

The relevant content has been described in the best mode for carrying out the invention as described above.

The present invention can be utilized in the field of visual information-based reference search technology using artificial intelligence.

Claims

1. An artificial intelligence reading assistance method using visual information-based reference search technology, the method being performed on one or more processors of a computing device and comprising:

acquiring an inspection image including one or more objects;

performing classification on the object by using a pre-trained classification model;

performing a search for one or more similar images for the object by using a pre-trained search model, according to the classification result for the object; and

providing the similar image search result.
2. The method of claim 1,

wherein performing the similar image search comprises:

obtaining a visual feature of the object by using a first model that extracts features based on content information included in the image;

obtaining attribute information corresponding to the object by using a second model that calculates a specific attribute corresponding to the image;

obtaining characteristic information about the object by using the visual feature of the object and the attribute information of the object; and

searching for a similar image corresponding to the object by using the characteristic information of the object.
3. The method of claim 2,

wherein the second model is a model that calculates probability information of a specific event corresponding to the image, and

wherein obtaining the attribute information comprises:

obtaining a probability value corresponding to the object by using the second model.
4. The method of claim 1,

wherein the search model is a neural network model based on proxy-based metric learning, trained to increase the similarity between a target vector and a positive proxy and to decrease the similarity between the target vector and a negative proxy, and

wherein the proxy is a vector representing embedding vectors for comparing the degree of similarity between the object and images previously stored in an image database.
5. The method of claim 1, further comprising:

constructing a training data set for training the classification model based on a plurality of image data and examination information for each image data,

wherein constructing the training data set comprises:

classifying the examination information for each image data into one or more predetermined categories;

generating a training input data set based on the plurality of image data, and generating a training output data set based on the one or more categories corresponding to the respective image data; and

matching and labeling the training output data set corresponding to each training input data set.
6. The method of claim 1,

wherein providing the similar image search result comprises:

selecting and providing an image having a high similarity to the object; and

providing an image that has a high similarity to the object but is classified into a different category from the object.
7. The method of claim 1,

wherein the inspection image includes a plurality of cell images,

wherein performing the classification comprises:

classifying each of the plurality of cell images into one or more categories; and

generating diagnostic information corresponding to the inspection image based on a classification result for each of the plurality of cell images, and

wherein performing the similar image search comprises:

performing a similar image search for at least some of the plurality of cell images.
8. The method of claim 7,

wherein the one or more categories comprise at least one of a negative condition, a low-risk condition, and a high-risk condition.
9. The method of claim 7,

wherein generating the diagnostic information corresponding to the inspection image based on the classification result for each of the plurality of cell images comprises:

generating the diagnostic information based on the number of cell images classified into each of the one or more categories, and

wherein the one or more categories are given weights that differ from one another.
10. The method of claim 7,

wherein generating the diagnostic information comprises:

updating the diagnostic information based on examination result information matched to each of the found similar images.
11. A computer program stored in a computer-readable storage medium, wherein the computer program, when executed by one or more processors, causes the one or more processors to perform the following operations of an artificial intelligence reading assistance method using visual information-based reference search technology, the operations comprising:

acquiring an inspection image including one or more objects;

performing classification on the object by using a pre-trained classification model;

performing, according to the classification result for the object, a search for one or more similar images for the object by using a pre-trained search model; and

providing the similar image search result.
12. A computing device that performs an artificial intelligence reading assistance method using visual information-based reference search technology, the computing device comprising:

a processor including one or more cores;

a memory storing program code executable by the processor; and

a network unit for transmitting and receiving data to and from a user terminal,

wherein the processor is configured to:

acquire an inspection image including one or more objects;

perform classification on the object by using a pre-trained classification model;

perform a search for one or more similar images for the object by using a pre-trained search model, according to the classification result for the object; and

provide the similar image search result.
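
By way of illustration only, and without limiting the claims, the proxy-based metric learning recited above (increasing the similarity between a target vector and a positive proxy while decreasing its similarity to negative proxies) can be sketched as a loss function. The sketch assumes cosine similarity and a softmax over proxies, which is one common formulation and is not stated to be the loss used in the disclosure; the proxy values, the `scale` factor, and all names are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def proxy_loss(embedding: Sequence[float], label: str,
               proxies: Dict[str, Sequence[float]], scale: float = 10.0) -> float:
    """Softmax cross-entropy over scaled similarities to one proxy per category.

    Minimizing it raises similarity to the positive proxy (the one for `label`)
    and lowers similarity to every other (negative) proxy.
    """
    sims = {name: scale * cosine_similarity(embedding, p)
            for name, p in proxies.items()}
    log_denominator = math.log(sum(math.exp(s) for s in sims.values()))
    return log_denominator - sims[label]


# Hypothetical two-dimensional proxies, one per category.
proxies = {"negative": [1.0, 0.0], "high_risk": [0.0, 1.0]}
near = proxy_loss([0.1, 0.9], "high_risk", proxies)  # close to the positive proxy
far = proxy_loss([0.9, 0.1], "high_risk", proxies)   # close to a negative proxy
```

An embedding near its positive proxy yields a smaller loss than one near a negative proxy, so gradient-based training on this loss moves embeddings in the direction recited above.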