WO2023027248A1

WO2023027248A1 - Data generation method, and training method and apparatus using same

Info

Publication number: WO2023027248A1
Application number: PCT/KR2021/016322
Authority: WO
Inventors: 조경재; 최재우; 신윤섭; 태윤원
Original assignee: 주식회사 뷰노
Priority date: 2021-08-26
Filing date: 2021-11-10
Publication date: 2023-03-02
Also published as: KR102591355B1; KR20230030810A

Abstract

Disclosed are a method by which various computing apparatuses provide data for an artificial neural network, and an apparatus therefor. Disclosed are a method and an apparatus therefor, the method comprising the steps of: training a generative adversarial network (GAN) so that a loss function of the GAN including a term for a first noise is minimized on the basis of real medical data including real normal medical data; and generating similar event medical data by adding a second noise generated using the trained GAN to real event medical data.

Description

Data generation method and learning method and device using the same

The present invention relates to a method for generating data and an apparatus using the same.

In general, classification learning means training a model to predict the class that corresponds to given input data. However, if the data used for classification learning is severely imbalanced, classification performance degrades.

For example, assume that the data of 100 patients consists of 95 cancer-negative patients and 5 cancer-positive patients. Data in which the number of samples per class is unequal in this way is referred to as imbalanced data. When classification learning is performed using such imbalanced data, most general classification models predict that none of the 100 people has cancer. That is, all of them may be predicted to be negative. In this case, the overall classification accuracy is as high as 95%, but the model fails to identify any of the cancer patients, who are the cases that matter.
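The arithmetic in this example can be checked with a short sketch. The label encoding (1 for cancer-positive, 0 for cancer-negative) and the always-negative classifier are illustrative assumptions, not part of the disclosure.

```python
# Hypothetical labels: 1 = cancer-positive, 0 = cancer-negative.
labels = [1] * 5 + [0] * 95

# A degenerate classifier that predicts "negative" for every patient.
predictions = [0] * len(labels)

# Accuracy: fraction of patients whose predicted class matches the label.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: fraction of cancer-positive patients that were actually found.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy = {accuracy:.2f}")
print(f"recall   = {recall:.2f}")
```

Accuracy alone hides the failure: the score is high only because the majority class dominates, while recall on the minority class is zero.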

As such, data imbalance is known to degrade the performance of machine learning classification models. Therefore, in order to produce excellent results with only a small amount of training data, data augmentation technology that transforms and increases data so that various environments or characteristics can be reflected in the training data is important. In particular, augmentation technology is absolutely necessary for medical data, in which it is difficult to obtain the data itself or it is very difficult to label the data. For example, events such as cardiac arrest are rare, making it difficult to collect sufficient event data for deep learning. This may cause imbalance between event data and normal data and performance degradation.

An object of the present invention is to provide an effective data augmentation method for resolving data imbalance during artificial neural network learning.

More specifically, the present invention is to provide an effective event data augmentation method when learning a medical artificial neural network.

More specifically, the present invention is to provide an effective event data providing method using a generative adversarial network (GAN).

The technical problems are not limited to the above-mentioned technical problems, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

The characteristic configuration of the present invention for achieving the object of the present invention as described above and realizing the characteristic effects of the present invention described later is as follows.

According to one aspect of the present invention, a method for providing data for an artificial neural network by a computing device includes: training a generative adversarial network (GAN) such that a loss function of the GAN, which includes a term for a first noise, is minimized based on actual medical data including actual normal medical data; and generating similar event medical data by adding second noise generated using the trained GAN to actual event medical data.

According to another aspect of the present invention, there is provided a computer program stored in a medium, including instructions implemented to cause a computing device to perform the following method. Here, the method includes: training a generative adversarial network (GAN) such that a loss function of the GAN including a term for a first noise is minimized based on actual medical data including actual normal medical data; and generating similar event medical data by adding second noise generated using the trained GAN to actual event medical data.

According to another aspect of the present invention, a computing device providing data for an artificial neural network includes: a communication unit for obtaining medical data; and a processor connected to the communication unit, wherein the processor may train a generative adversarial network (GAN) such that a loss function of the GAN including a term for a first noise is minimized based on actual medical data including actual normal medical data, and may generate similar event medical data by adding second noise generated using the trained GAN to actual event medical data.

Alternatively, the actual normal medical data may occupy most of the actual medical data.

Alternatively, the actual event medical data may occupy a minority of the actual medical data.

Alternatively, the second noise may include, as a main characteristic, a characteristic of the actual normal medical data, which occupies most of the actual medical data.

Alternatively, the term for the first noise may include a term defined so that noise generated by the generator of the GAN has a non-zero significant value.

Alternatively, the GAN may further include a GAN discriminator that distinguishes the real medical data from the similar medical data to which the first noise is added.

Various embodiments can effectively augment data for resolving data imbalance during artificial neural network learning. In addition, it is possible to effectively augment event data during learning of a medical artificial neural network.

In addition, effective event data may be provided using a generative adversarial network (GAN).

Effects obtainable in various embodiments are not limited to the effects mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

The drawings accompanying this specification are intended to provide an understanding of the present invention, show various embodiments of the present invention, and explain the principles of the present invention together with the description of the specification.

FIG. 1 illustrates the structure of a GAN.

FIG. 2 illustrates a data augmentation method using a GAN.

FIG. 3 illustrates an artificial neural network training process using a data augmentation method.

FIG. 4 illustrates the structure of a GAN according to an example of the present invention.

FIG. 5 illustrates an artificial neural network training process according to an example of the present invention.

FIGS. 6 and 7 illustrate a noise generator according to an example of the present invention.

FIG. 8 is a diagram for explaining a method of generating similar event medical data by a computing device.

FIG. 9 illustrates a computing device that may be applied to the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The following detailed description of the present invention refers to the accompanying drawings, which illustrate specific embodiments in which the present invention may be practiced, in order to make the objects, technical solutions, and advantages of the present invention clear. These embodiments are described in sufficient detail to enable a person skilled in the art to practice the present invention.

As used throughout the description and claims herein, the terms "image" or "image data" refer to multidimensional data composed of discrete image elements (e.g., pixels for a two-dimensional image or voxels for a three-dimensional image).

For example, an "image" may mean a two-dimensional image corresponding to a slide of a predetermined tissue observed using a microscope, but is not limited thereto; it may be a medical image of a subject collected by (cone-beam) computed tomography, magnetic resonance imaging (MRI), ultrasound, or any other medical imaging system known in the art. Images may also be provided in a non-medical context, for example, by a remote sensing system, electron microscopy, and the like.

Throughout the description and claims herein, 'image' is a term that refers to a visible image (e.g., one displayed on a screen) or a digital representation of an image.

In the drawings presented for convenience of explanation, slide image data is illustrated as an exemplary image format. However, those skilled in the art will appreciate that the image formats used in various embodiments of the present invention include, but are not limited to, X-ray images, MRI, CT, PET (positron emission tomography), PET-CT, SPECT, SPECT-CT, MR-PET, 3D ultrasound images, and the like.

The medical images described throughout the detailed description and claims of this specification may conform to the 'Digital Imaging and Communications in Medicine (DICOM)' standard. The DICOM standard collectively refers to various standards used for digital image representation and communication in medical devices, and is published by a joint committee formed by the American College of Radiology (ACR) and the National Electrical Manufacturers Association (NEMA).

In addition, the medical images described throughout the detailed description and claims of this specification may be stored or transmitted through a 'Picture Archiving and Communication System (PACS)', which may be a system that stores, processes, and transmits medical images in accordance with the DICOM standard. Medical images obtained using digital medical imaging equipment such as X-ray, CT, and MRI are stored in DICOM format and can be transmitted over a network to terminals inside and outside the hospital, and observation results and medical records can be added to them.

Throughout the detailed description and claims of this specification, 'training' or 'learning' is a term that refers to performing machine learning through procedural computing, and is not intended to refer to mental operations such as human educational activities; training is used in the generally accepted sense of machine learning. For example, 'deep learning' refers to machine learning using deep artificial neural networks. A deep neural network is a machine learning model that, in a structure composed of multi-layer artificial neural networks, automatically learns the characteristics of data from a large amount of data, and that is trained so as to minimize the error of an objective/loss function, that is, the error in classification accuracy. It can extract and classify features at various levels, from low-level features such as points, lines, and planes to complex and meaningful high-level features.

Throughout the description and claims herein, the word 'comprise' and variations thereof are not intended to exclude other technical features, additions, components, or steps. In addition, 'a' or 'an' is used to mean one or more, and 'another' is limited to at least a second or more.

Other objects, advantages, and characteristics of the present invention will become apparent to those skilled in the art, in part from this specification and in part from practice of the invention. The examples and drawings below are provided as examples and are not intended to limit the invention. Accordingly, details disclosed herein with respect to a particular structure or function are not to be construed in a limiting sense, but should be interpreted merely as representative basic material that provides guidance for those skilled in the art to practice the present invention in various ways with virtually any suitable detailed structure.

Moreover, the present invention covers all possible combinations of the embodiments presented herein. It should be understood that the various embodiments of the present invention are different from each other but are not necessarily mutually exclusive. For example, specific shapes, structures, and characteristics described herein in connection with one embodiment may be implemented in another embodiment without departing from the spirit and scope of the invention. Additionally, it should be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. Accordingly, the detailed description set forth below is not to be taken in a limiting sense, and the scope of the present invention, if properly described, is limited only by the appended claims, along with all equivalents to which those claims are entitled. Like reference numbers in the drawings indicate the same or similar functions throughout the various aspects.

In this specification, unless otherwise indicated or clearly contradicted by context, terms referred to in the singular encompass the plural. In addition, in describing the present invention, if it is determined that a detailed description of a related known configuration or function may obscure the gist of the present invention, the detailed description will be omitted.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily practice the present invention.

FIG. 1 illustrates the structure of a GAN. A GAN makes it possible to generate data with a distribution similar to that of real data. Referring to FIG. 1, the GAN includes a generator (G) and a discriminator (D). D's role is to determine whether its input is real data. Given data x as input, the output D(x) of D is the probability that x is real data. G's role is to create fake data that D cannot distinguish from real data. For example, G samples a noise/random vector z from a standard normal distribution and then uses z as input to generate similar data G(z). G(z) is used as input to D, and D(G(z)) is the probability that G(z) is real data.
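The data flow of FIG. 1 can be sketched in a few lines. This is a toy illustration only: single linear functions with hypothetical fixed weights stand in for the generator and discriminator networks, and scalars stand in for data vectors.

```python
import math
import random

random.seed(0)

def generator(z, weight=2.0, bias=1.0):
    """Toy generator G: maps a noise sample z to a similar data point G(z).
    The weight and bias are hypothetical placeholders for a trained network."""
    return weight * z + bias

def discriminator(x, weight=1.5, bias=-0.5):
    """Toy discriminator D: returns a probability in (0, 1) that x is real data."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

z = random.gauss(0.0, 1.0)    # noise sampled from a standard normal distribution
fake = generator(z)           # similar data G(z)
p_real = discriminator(fake)  # D(G(z)): probability that G(z) is real data

print(f"z = {z:.3f}, G(z) = {fake:.3f}, D(G(z)) = {p_real:.3f}")
```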

A GAN is trained by updating the weights of D and G in turn. For D, with G fixed, the weights are updated so that D returns a high probability for real data (x ~ Pdata(x)) and a low probability for similar data generated from noise (z ~ Pz(z)). For G, the weights are updated so that the previously trained D returns a high probability when G(z) is input. By training G and D alternately in this way, G learns to create similar data good enough that D cannot distinguish it from real data, and D learns to distinguish well whatever similar data G creates.

Equation 1 represents the objective function of the GAN. A GAN has two networks that need to be trained, and because their training objectives conflict with each other, each network is optimized separately.

Equations 2 and 3 represent the objective function of D and the objective function of G, respectively. In the case of D, the weights are updated in the direction of maximizing V(D,G) with respect to D. In the case of G, the weights are updated in the direction of minimizing V(D,G) with respect to G.
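The opposing update directions can be made concrete with a minibatch estimate. The sketch below assumes the standard GAN value function (log D(x) for real samples plus log(1 - D(G(z))) for generated samples); the function names and the example probabilities are hypothetical.

```python
import math

def value_function(d_real, d_fake):
    """Minibatch estimate of V(D, G), assuming the standard GAN form.
    d_real[i] is D(x_i) for a real sample, d_fake[i] is D(G(z_i))."""
    m = len(d_real)
    return sum(math.log(dr) + math.log(1.0 - df)
               for dr, df in zip(d_real, d_fake)) / m

def generator_objective(d_fake):
    """The part of V(D, G) that depends on G; G is updated to minimize it."""
    return sum(math.log(1.0 - df) for df in d_fake) / len(d_fake)

# D maximizes V: a discriminator that separates real from generated data scores higher.
confident = value_function([0.9, 0.8], [0.1, 0.2])
confused = value_function([0.5, 0.5], [0.5, 0.5])

# G minimizes its objective: generated data that D rates as real scores lower.
fooled = generator_objective([0.9, 0.8])
caught = generator_objective([0.1, 0.2])

print(confident > confused, fooled < caught)
```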

In Equations 1 to 3, each symbol may refer to the following.

- E: represents the expected value (expectation, i.e., average),

- D(xi): represents the probability that input xi is real data (i = 1 to m),

- G(zi): represents similar data generated from input zi (i = 1 to m).
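Equations 1 to 3 are not reproduced in this text. Assuming they follow the standard GAN formulation, which matches the symbols listed above (V(D,G), Pdata, Pz, and a minibatch of m samples), they would read approximately as follows. This is a reconstruction under that assumption, not the original notation.

```latex
% Equation 1 (assumed standard form): GAN objective
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim P_{data}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim P_z(z)}\left[\log\left(1 - D(G(z))\right)\right]

% Equation 2 (assumed): objective of D over a minibatch of size m
\max_D \; \frac{1}{m} \sum_{i=1}^{m}
  \left[ \log D(x_i) + \log\left(1 - D(G(z_i))\right) \right]

% Equation 3 (assumed): objective of G over a minibatch of size m
\min_G \; \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z_i))\right)
```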

FIG. 2 illustrates a data augmentation method using a conventional GAN. As described with reference to FIG. 1, the generator aims to generate similar data from a noise vector z and to trick the discriminator into classifying the similar data as real data, whereas the discriminator aims to discriminate between real data and similar data. Therefore, similar data generated using the trained GAN can be used for data augmentation in artificial neural network training. In this case, similar normal data may be generated from a GAN trained using normal data, and similar event data may be generated from a GAN trained using event data. Here, the artificial neural network includes a medical artificial neural network, and the medical artificial neural network may provide analysis/diagnosis (auxiliary) information on a disease of interest. The normal data means normal (medical) data with respect to the disease of interest, and the event data means (medical) data for a predetermined symptom of the disease of interest. By using the GAN to generate enough similar event data to match the normal data, it is possible to resolve data imbalance and obtain a desired sample ratio for training on normal data and event data simultaneously. Here, data includes various types of data used for medical purposes. For example, the data includes image data and bio-signal data (e.g., an electrocardiogram).

FIG. 3 shows an example of training a medical artificial neural network. Referring to FIG. 3, a medical artificial neural network may be trained using normal data and event data. Since the amount of normal data is significantly greater than the amount of event data, using only real data may degrade the performance of the trained artificial neural network due to data imbalance. In particular, events such as cardiac arrest occur infrequently, making it difficult to collect sufficient event data for deep learning. Accordingly, the artificial neural network may be trained using the similar event data generated as in FIG. 2 together with the real data.

On the other hand, according to the method of FIG. 3, in a situation where the total amount of data is large but the amount of event data is small, model performance (accuracy, AUROC (area under the receiver operating characteristic curve), etc.) can be improved by increasing the amount of event data used for deep learning. However, according to the conventional method illustrated in FIG. 2, noise can be generated from the standard deviation of the data, but there is a limitation in that the standard deviation can be calculated only when pure normal data has been separated out.

To solve the above problem, it is proposed to use the relatively abundant normal data to estimate the distribution of the measurement error that appears in the event data. This is because, since the same machine is used, measurement errors in normal data and measurement errors in event data can be assumed to follow the same distribution.

FIG. 4 illustrates the structure of another GAN according to an example of the present invention. Referring to FIG. 4, the GAN includes a generator (G, 402), a discriminator (D, 410), and a multiplexer (408). G creates the noise needed to generate similar data from real data. The noise/random vector z can be generated from the standard deviation of the real data. For example, the noise/random vector z can be sampled from a standard normal distribution of the real data. The generator (G, 402) uses z as an input to generate noise 404a. The multiplexer 408 may combine real data 406 and noise 404a to generate similar data 404b. Here, combining includes an addition operation. For example, the real data 406 and the noise 404a have the same vector size, and similar data 404b corresponding to the real data 406 may be generated by adding corresponding elements to each other. The multiplexer 408 may be implemented in any device that supports an addition operation, and may be referred to as an adder, an overlapper, or the like. In addition, the multiplexer 408 need not be configured separately and may be implemented as part of an existing component (e.g., G in FIGS. 4, 6, 7, and 8, or the noise generator in FIGS. 6, 7, and 8). D (410) determines whether its input is real data. Given data x as input, the output D(x) of D (410) is the probability that x is real data.
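The combining step performed by the multiplexer 408 amounts to an element-wise addition of two equal-length vectors. A minimal sketch follows; `generate_noise` is a hypothetical stand-in for the trained generator (a simple scaling of z), and the sample values are invented for illustration.

```python
import random

random.seed(0)

def generate_noise(z, scale=0.1):
    """Stand-in for generator G (402): maps a random vector z to noise 404a.
    The scaling is a placeholder assumption for a trained network."""
    return [scale * value for value in z]

def multiplex(real, noise):
    """Stand-in for multiplexer 408: adds corresponding elements."""
    if len(real) != len(noise):
        raise ValueError("real data and noise must have the same vector size")
    return [r + n for r, n in zip(real, noise)]

real_data = [0.8, 1.2, 0.5, 0.9]                 # hypothetical real data 406
z = [random.gauss(0.0, 1.0) for _ in real_data]  # noise/random vector z
noise = generate_noise(z)                        # noise 404a
similar_data = multiplex(real_data, noise)       # similar data 404b

print(similar_data)
```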

That is, G (402) of the GAN generates noise, and D (410) of the GAN is trained to distinguish between (1) real data 406 and (2) similar data 404b generated by adding the noise to real data. Both actual normal data and actual event data may be used for training the GAN. Meanwhile, because the GAN trains G (402) and D (410) alternately, G (402) learns to create similar data good enough that D (410) cannot distinguish it, and D (410) learns to distinguish well whatever similar data G (402) creates. Under the conventional training scheme, therefore, the noise 404a generated by G (402) can converge to zero, because similar data identical to the real data is the hardest for D (410) to distinguish. In this case, a GAN trained using normal data can generate only similar normal data. Therefore, in order to prevent the noise 404a from converging to 0, a loss on the noise 404a (e.g., an L1 loss) may be added to the training loss of the GAN.
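The two ingredients of such a training loss can be computed separately, as in the sketch below. It assumes a standard adversarial term and an L1 norm averaged over the batch; how the two are weighted and signed in the patent's loss function is not recoverable from this text, so the sketch deliberately returns them uncombined. All names are hypothetical.

```python
import math

def l1_norm(vector):
    """L1 norm: sum of the absolute values of the vector's elements."""
    return sum(abs(v) for v in vector)

def loss_components(d_real, d_fake, noise_batch):
    """Returns the adversarial term and the mean L1 norm of the generated noise.
    d_real[i] is D(x_i); d_fake[i] is D(G(z_i) + x_i); noise_batch[i] is G(z_i).
    Assumption: the adversarial term uses the standard GAN form."""
    adversarial = (sum(math.log(p) for p in d_real) / len(d_real)
                   + sum(math.log(1.0 - p) for p in d_fake) / len(d_fake))
    noise_l1 = sum(l1_norm(n) for n in noise_batch) / len(noise_batch)
    return {"adversarial": adversarial, "noise_l1": noise_l1}

components = loss_components(
    d_real=[0.9, 0.8],
    d_fake=[0.4, 0.3],
    noise_batch=[[0.5, -0.5], [1.0, 0.0]],
)

# If the generator collapses to zero noise, the noise term vanishes,
# which is the situation the added loss term is meant to prevent.
collapsed = loss_components([0.9, 0.8], [0.4, 0.3], [[0.0, 0.0], [0.0, 0.0]])

print(components["noise_l1"], collapsed["noise_l1"])
```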

Equation 4 represents a loss function according to an example of the present invention. Training of the GAN proceeds in the direction of minimizing the loss of Equation 4. In Equation 4, each symbol may refer to the following.

- x represents real data 406,

- D(x) represents the probability that input x is real data,

- G(z) represents the noise 404a generated from the input z,

- G(z)+x represents similar data 404b,

- [...] represents the loss of the noise 404a (e.g., L1 loss),

- [...] denotes a tunable hyperparameter for training stabilization,

- ‖·‖₁ represents the L1 norm and is defined by Equation 5.

Here, i represents the index of an element of the vector constituting G(z), and n represents the number of elements of the vector constituting G(z).
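Equation 5 is not reproduced in this text. The L1 norm has a standard definition, so with i as the element index and n as the number of elements of G(z), it can be written as below; the subscript notation G(z)_i for the i-th element is an assumption.

```latex
% Equation 5 (standard definition of the L1 norm applied to G(z))
\left\lVert G(z) \right\rVert_1 = \sum_{i=1}^{n} \left\lvert G(z)_i \right\rvert
```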

By including the loss term for the noise 404a (e.g., an L1 loss) in the loss function of the GAN, it is possible to prevent G(z) from converging to zero. Accordingly, when generating noise through the standard deviation of data, there is no need to separate out only the normal data, and an optimal noise intensity for each feature can be determined automatically through training. That is, unlike the prior art (see FIG. 2), it is possible to train a GAN using both normal data and event data. Similar data (sets) can be generated from normal data (sets) using the GAN training results, and the generated similar data (sets) can be used as event data (sets) in deep learning. For example, similar medical data (sets) can be generated from normal medical data (sets) using the GAN training results, and the generated similar medical data (sets) can be used as event medical data (sets) in deep learning (e.g., training of a medical artificial neural network). To this end, a process of labeling the similar data (sets) as event data (sets) may be included. Here, the normal data (sets) used to generate the training data (sets) include actual/similar normal data (sets), and may preferably include actual normal data (sets). As a result, according to the present invention, model performance (accuracy, AUROC, etc.) can be improved by increasing the amount of event data used for deep learning.

Here, labeling includes generating labeling data corresponding to each data. Labeling data is data generated corresponding to each data, and may refer to data including at least one of characteristic information of medical data. Characteristic information of medical data (eg, images) includes information on the existence and location of a tissue area or lesion area, time information corresponding to the elapsed time at which the tissue area or lesion area was observed (eg, 30 seconds), and electrocardiogram information. However, it is not limited thereto, and may include arbitrary information capable of reflecting characteristics of medical data.

FIG. 5 shows an example of training a medical artificial neural network according to an example of the present invention. Referring to FIG. 5, a medical artificial neural network may be trained using normal medical data (sets) and event medical data (sets). The trained artificial neural network can provide information about a disease/biological condition based on input medical data (sets). The basic contents of FIG. 5 are as described with reference to FIG. 3; the difference from FIG. 3 lies in the method of providing similar data. In FIG. 3, similar event data is provided using a GAN trained using event data (see FIG. 2), whereas in the present invention, similar event data may be provided using the GAN of FIG. 4. The GAN of FIG. 4 can be trained using both normal data and event data by taking the L1 loss for the noise into account. As a result, by using the GAN of FIG. 4, similar event data (sets) for training the medical artificial neural network can be provided from normal data (sets), which are far more numerous than event data. In this case, in order to use the similar data (sets) generated from normal data (sets) as event data (sets), a process of labeling the similar data as event data may be included.
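The labeling step described for FIG. 5 can be sketched as follows. The label encoding, the sample values, and `trained_noise` (Gaussian noise standing in for the output of a trained generator) are all illustrative assumptions.

```python
import random

NORMAL, EVENT = 0, 1  # hypothetical label encoding
rng = random.Random(0)

def trained_noise(size):
    """Placeholder for noise produced by a trained generator (assumption)."""
    return [rng.gauss(0.0, 0.1) for _ in range(size)]

def build_training_set(normal_set, event_set):
    """Combines real data with similar data generated from normal data,
    labeling the similar data as event data."""
    dataset = [(sample, NORMAL) for sample in normal_set]
    dataset += [(sample, EVENT) for sample in event_set]
    for sample in normal_set:
        noise = trained_noise(len(sample))
        similar = [s + n for s, n in zip(sample, noise)]
        dataset.append((similar, EVENT))  # labeled as event data
    return dataset

normal_set = [[0.8, 1.2], [0.9, 1.1], [1.0, 1.0]]  # majority class
event_set = [[2.5, 0.3]]                           # minority class
training_set = build_training_set(normal_set, event_set)
event_count = sum(1 for _, label in training_set if label == EVENT)

print(len(training_set), event_count)
```

In this toy setup the event class grows from one sample to four, while the normal class stays at three.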

FIG. 6 illustrates a noise generator according to one example of the present invention. FIG. 6 illustrates a case where the noise generator 600 is configured as part of the GAN of FIG. 4. Reference numerals 602, 604a, 604b, 606, and 608 in FIG. 6 correspond to 402, 404a, 404b, 406, and 408 in FIG. 4, and the description of each may refer to FIG. 4. When training of the GAN of FIG. 4 is completed, each layer/node of the neural network constituting G of the GAN has been updated with the training result (e.g., a weight set). Thereafter, the GAN of FIG. 4 can be reused as a noise generator 600 for generating similar (event) data. For example, the noise generator 600 of FIG. 6 can be understood/defined as the set of elements (e.g., G) for generating noise in the GAN of FIG. 4, or as the set of elements (e.g., G and the multiplexer) for generating similar (event) data. The noise generator 600 of FIG. 6 may generate noise 604a based on the training result of FIG. 4. Specifically, the noise generator 600 of FIG. 6 includes a generator (G, 602) trained according to the proposal of the present invention (see FIG. 4 and Equation 4), and noise 604a may be generated from a random value (z) through the generator (G, 602). The generated noise 604a can be used to transform real data into similar (event) data. For example, the generated noise 604a is added to the actual data 606, whereby similar (event) data 604b may be provided. Here, the actual data 606 to which the noise 604a may be added according to an example of the present invention includes both normal and event data, and may preferably be limited to normal data. Unlike in FIG. 4, the similar (event) data 604b converted from the real data 606 using the noise 604a is not input to the discriminator D. Instead, the similar (event) data (set) 604b converted from the real data (set) 606 using the noise 604a can be used as event data (set) when training the medical artificial neural network 506, regardless of whether the real data is normal data or event data. Here, training the medical artificial neural network 506 may include a process of labeling the similar (event) data 604b as event data. Here, the actual data (set) 606 used to generate the training data (set) may be replaced with similar data (set). According to the present invention, unlike the prior art (see FIG. 2), similar event data (sets) can be provided using actual normal data (sets), thereby resolving data imbalance during training of a medical neural network.

FIG. 7 illustrates a noise generator according to another example of the present invention. FIG. 7 illustrates a case where the noise generator 700 is configured separately from the GAN of FIG. 4. For example, the noise generator 700 may include a neural network for generating noise. Although not limited thereto, the neural network for noise generation may include G of the GAN of FIG. 4. In this case, the noise generator 700 may be configured by excluding D from the GAN of FIG. 4 or by including only G of the GAN of FIG. 4. The neural network (e.g., G) 702 for noise generation of the noise generator 700 may then generate noise 704a from z using the training result (e.g., weight set) of the GAN of FIG. 4. That is, each layer/node of the neural network (e.g., G) 702 for noise generation is updated using the training result (e.g., weight set) of FIG. 4. Thereafter, the neural network (e.g., G) 702 may generate noise 704a from the random value z, and the generated noise 704a can be used to convert real data into similar (event) data. For example, the noise 704a generated by the noise generator 700 is added to the actual data 706 through the multiplexer 708, whereby similar (event) data 704b may be provided. Here, the actual data 706 to which the noise 704a may be added according to an example of the present invention includes both normal and event data, and may preferably be limited to normal data. The similar (event) data (set) 704b generated by adding the noise 704a can be used as event data (set) when training the medical artificial neural network 506, regardless of whether the actual data is normal data or event data. Here, training the medical artificial neural network 506 may include a process of labeling the similar (event) data 704b as event data. Here, the actual data (set) 706 used to generate the training data (set) can be replaced with similar data (set). According to the present invention, unlike the prior art (see FIG. 2), similar event data can be provided using actual normal data, so that data imbalance can be resolved during training of a medical neural network.
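A separately configured noise generator of this kind needs only the generator and its trained weight set; no discriminator is involved at generation time. The sketch below assumes a single linear layer as the generator and uses invented weights in place of a real training result.

```python
import random

class NoiseGenerator:
    """Keeps only the generator part of a trained GAN (no discriminator).
    A single linear layer stands in for the generator network (assumption)."""

    def __init__(self, weights, biases):
        self.weights = weights  # one row of weights per output element
        self.biases = biases

    def noise(self, rng):
        # Sample the random vector z, then map it through the linear layer.
        z = [rng.gauss(0.0, 1.0) for _ in self.weights[0]]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.weights, self.biases)]

    def similar(self, real, rng):
        # Multiplexer step: add the generated noise to the real data.
        return [r + n for r, n in zip(real, self.noise(rng))]

rng = random.Random(0)
# Hypothetical weight set standing in for the result of GAN training.
trained_weights = [[0.10, 0.00], [0.00, 0.20], [0.05, 0.05]]
trained_biases = [0.0, 0.0, 0.0]

noise_generator = NoiseGenerator(trained_weights, trained_biases)
real_sample = [0.8, 1.2, 0.5]
similar_sample = noise_generator.similar(real_sample, rng)

print(similar_sample)
```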

Referring to FIG. 8, the computing device may generate similar event medical data (or similar event data) from input real event medical data (or actual event data) using the noise generator 800. To generate the similar event medical data, the noise generator 800 may be configured separately from the GAN of FIG. 4. For example, the noise generator 800 may include a neural network for generating noise. Although not limited thereto, the neural network for noise generation may include G of the GAN of FIG. 4. In this case, the noise generator 800 may be configured by excluding D from the GAN of FIG. 4 or by including only G of the GAN of FIG. 4.

In this case, in the computing device, the neural network (e.g., G) 802 for noise generation of the noise generator 800 can generate noise 804a from z using the training result (e.g., weight set) of the GAN of FIG. 4. That is, each layer/node of the neural network (e.g., G) 802 for generating noise is updated using the training result (e.g., weight set) of FIG. 4. Then, the neural network (e.g., G) 802 for noise generation may generate noise 804a from the random value z. Here, the noise 804a generated by the trained noise generator 800 may be defined as second noise.

Specifically, with the loss function defined to include a term for the first noise as in Equation 4 above, and through training based on the actual medical data, the noise generator 800 can generate significant noise 804a based on the actual medical data. Here, the significant noise 804a is noise that, through the training, includes characteristics related to the actual medical data.

Meanwhile, as described above, the actual medical data may include actual normal medical data and actual event medical data. In particular, the actual normal medical data is major class data occupying most of the actual medical data, and the actual event medical data is minor class data occupying a part or minority of the actual medical data. In other words, the actual medical data may be biased toward the actual normal medical data. In this case, since the noise generator 800 is trained based on actual medical data that is highly biased toward the actual normal medical data, the noise generator 800 generates noise 804a having characteristics of the actual normal medical data. That is, through the above-described training, the noise generator 800 may generate the noise 804a including the characteristics of the actual normal medical data as its main characteristics.

The noise 804a can be used to transform real event medical data 806 into similar event medical data 804b. For example, the noise 804a generated by the noise generator 800 is added to the actual event medical data 806 through the multiplexer 808, whereby similar event data 804b may be generated. Here, the similar event data (set) 804b generated by adding the noise 804a according to the present invention may be used as an event data (set) when training the medical artificial neural network 506.

Specifically, the computing device may generate similar event medical data 804b by adding the noise 804a generated using the noise generator 800 to actual event medical data 806. With the addition of the noise 804a, which carries characteristics of the actual normal medical data, the actual event medical data 806 may be converted into similar event medical data 804b. That is, the computing device can augment the actual event medical data 806 by adding, during the learning process, noise 804a including characteristics related to actual normal medical data to actual event medical data 806, which is data of a different type. For example, the computing device may generate various similar event medical data 804b from one piece of real event medical data by adding various noises 804a to it.
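The augmentation described above can be sketched as follows, under stated assumptions: the data are treated as fixed-length numeric sequences, addition is element-wise, and `stand_in_noise` is a hypothetical placeholder for the trained noise generator (it draws small Gaussian values rather than calling a trained G). The function name `augment_event` and the sample values are illustrative only.

```python
import random


def augment_event(event_sample, noise_fn, n_variants):
    """Return n_variants similar event samples.

    Each variant is the real event sample plus a freshly generated
    noise vector, added element by element.
    """
    variants = []
    for _ in range(n_variants):
        noise = noise_fn(len(event_sample))
        variants.append([x + n for x, n in zip(event_sample, noise)])
    return variants


rng = random.Random(0)


def stand_in_noise(length):
    # Hypothetical placeholder for the trained generator G(z)
    return [rng.gauss(0.0, 0.05) for _ in range(length)]


real_event = [0.9, 1.4, 0.2, -0.3, 1.1]   # one real event sample (made-up values)
similar_events = augment_event(real_event, stand_in_noise, n_variants=3)
```

Each call to the noise function produces a different noise vector, so one real event sample yields several distinct similar event samples.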

In this way, the computing device adds characteristics related to the normal medical data, through the noise 804a, to the real event medical data 806, which is minor class data in the real medical data, and can thereby generate or augment various similar event medical data from the real event medical data 806. Through this augmentation or generation, the data bias toward the actual normal medical data in the actual medical data input when training the medical artificial neural network 506 can be minimized, or the imbalance can be resolved.
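To make the effect on class balance concrete, the following sketch computes how many similar event samples would have to be generated per real event sample for the event class to match the normal class in size. The 95-to-5 split is an illustrative figure, and the balancing rule (equal class sizes) and the function name `variants_needed` are assumptions; the description does not prescribe a target ratio.

```python
import math


def variants_needed(n_normal, n_event):
    """Similar samples to generate per real event sample so that real plus
    similar event samples are at least as many as the normal samples."""
    if n_event <= 0:
        raise ValueError("at least one real event sample is required")
    shortfall = max(n_normal - n_event, 0)
    return math.ceil(shortfall / n_event)


n_normal, n_event = 95, 5                         # illustrative imbalanced data set
per_event = variants_needed(n_normal, n_event)    # similar samples per real event sample
n_event_total = n_event + per_event * n_event     # real + similar event samples

event_fraction_before = n_event / (n_normal + n_event)
event_fraction_after = n_event_total / (n_normal + n_event_total)
```

With these numbers the event class grows from 5% of the training data to half of it, which is one way the bias toward normal data could be reduced.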

FIG. 9 illustrates a computing device according to one example of the present invention. The computing device 200 according to an example of the present invention includes a communication unit 210 and a processor 220, and may directly or indirectly communicate with an external computing device (not shown) through the communication unit 210.

Specifically, the computing device 200 may achieve desired system performance by using a combination of typical computer hardware (e.g., devices that may include computer processors, memory, storage, input and output devices, and other components of conventional computing devices; electronic communication devices such as routers and switches; electronic information storage systems such as network-attached storage (NAS) and storage area networks (SAN)) and computer software (i.e., instructions that cause a computing device to function in a particular way).

The communication unit 210 of the computing device may transmit and receive requests and responses to and from other interworking computing devices. As an example, such requests and responses may be made through the same transmission control protocol (TCP) session, but are not limited thereto, and may be transmitted and received as, for example, user datagram protocol (UDP) datagrams. In addition, in a broad sense, the communication unit 210 may include a keyboard, a pointing device such as a mouse, and other external input devices for receiving commands or instructions, as well as printers, displays, and other external output devices.

In addition, the processor 220 of the computing device may include hardware configurations such as a micro processing unit (MPU), a central processing unit (CPU), a graphics processing unit (GPU) or a tensor processing unit (TPU), a cache memory, and a data bus. In addition, it may further include software configurations such as an operating system and an application that performs a specific purpose.

The present invention illustrated with reference to FIGS. 4 to 8 may be configured based on hardware/software, and the processor 220 of the computing device may be configured to perform/control the operation of the present invention according to FIGS. 4 to 8 .

Based on the description of the above embodiments, a person skilled in the art can clearly understand that the methods and/or processes of the present invention, and the steps thereof, can be realized with hardware, software, or any combination of hardware and software suitable for a particular application. The hardware may include a general purpose computer and/or a dedicated computing device, or a specific computing device or a particular aspect or component of a specific computing device. The processes may be realized by one or more microprocessors, microcontrollers, embedded microcontrollers, programmable digital signal processors or other programmable devices having internal and/or external memory. Additionally, or alternatively, the processes may be implemented with an application specific integrated circuit (ASIC), a programmable gate array, programmable array logic (PAL), or any other device or combination of devices that can be configured to process electronic signals. Furthermore, the objects of the technical solution of the present invention, or the parts contributing over the prior art, may be implemented in the form of program instructions that can be executed through various computer components and recorded on a machine-readable recording medium. The machine-readable recording medium may include program instructions, data files, data structures, etc., alone or in combination. Program instructions recorded on the machine-readable recording medium may be specially designed and configured for the present invention, or may be known and usable to those skilled in the art in the field of computer software. Examples of machine-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROM, DVD, and Blu-ray; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, flash memory, and the like. Examples of program instructions include not only machine code and bytecode, but also high-level language code that can be executed by a computer using an interpreter or the like, which may be stored and compiled or interpreted for execution on any one of the foregoing devices, as well as on heterogeneous combinations of processors, processor architectures, or different combinations of hardware and software, or on any other machine capable of executing program instructions.

Accordingly, in one aspect according to the present specification, when the methods and combinations thereof described above are performed by one or more computing devices, the methods and combinations of methods may be implemented as executable code that performs each step. In another aspect, the methods may be implemented as systems that perform the steps; the methods may be distributed in several ways across devices, or all functions may be integrated into one dedicated, stand-alone device or other hardware. In another aspect, the means for performing the steps associated with the processes described above may include any of the hardware and/or software described above. All such permutations and combinations are intended to fall within the scope of this disclosure.

For example, the hardware device may be configured to act as one or more software modules to perform processing according to the present disclosure, and vice versa. The hardware device may include a processor, such as an MPU, CPU, GPU, or TPU, that is coupled to a memory such as ROM/RAM for storing program instructions and is configured to execute the instructions stored in the memory, and may include a communication unit capable of sending and receiving signals to and from external devices. In addition, the hardware device may include a keyboard, a mouse, and other external input devices for receiving commands written by developers.

In the above, the present invention has been described with reference to specific details such as specific components, limited embodiments, and drawings, but these are provided only to help a more general understanding of the present invention, and the present invention is not limited to the above embodiments. Various modifications and variations can be made from these descriptions by those of ordinary skill in the art to which the present invention belongs.

Therefore, the spirit of the present invention should not be determined as being limited to the above-described embodiments, and not only the claims to be described later but also all modifications equal or equivalent to these claims fall within the scope of the spirit of the present invention.

Such equal or equivalent modifications include, for example, logically equivalent methods that can produce the same results as those obtained by performing the method according to the present specification. The spirit and scope of the present invention should not be limited by the above examples, and should be understood in the broadest sense permitted by law.

Embodiments of the present invention as described above can be applied to various computing devices and the like.

Claims

1. A method in which a computing device provides data for an artificial neural network, the method comprising:

training a generative adversarial network (GAN), based on actual medical data including actual normal medical data, such that a loss function of the GAN including a term for a first noise is minimized; and

generating similar event medical data by adding a second noise, generated using the trained GAN, to actual event medical data.

2. The method of claim 1,

wherein the actual normal medical data accounts for a majority of the actual medical data.

3. The method of claim 1,

wherein the actual event medical data occupies a minority of the actual medical data.

4. The method of claim 1,

wherein the second noise includes, as a main characteristic, a characteristic of the actual normal medical data, which occupies most of the actual medical data.

5. The method of claim 1,

wherein the similar event medical data is used as an event medical data set for training the artificial neural network.

6. The method of claim 1,

wherein the term for the first noise includes a term defined so that noise generated by the generator of the GAN has a non-zero significant value.

7. The method of claim 1,

wherein the GAN further includes a discriminator that distinguishes between the actual medical data and similar medical data to which the first noise is added.

8. A computer program stored on a medium, comprising instructions implemented to cause a computing device to perform the method of any one of claims 1 to 7.

9. A computing device providing data for an artificial neural network, comprising:

a communication unit for obtaining medical data; and

a processor connected to the communication unit,

wherein the processor is configured to train a generative adversarial network (GAN), based on actual medical data including actual normal medical data, such that a loss function of the GAN including a term for a first noise is minimized, and to add a second noise, generated using the training result for the generator of the GAN, to actual event medical data to generate similar event medical data.

10. The computing device of claim 9,

wherein the computing device is configured to use the similar event medical data as an event medical data set for training the artificial neural network.