WO2020091337A1 - Image analysis apparatus and method - Google Patents

Image analysis apparatus and method

Info

Publication number
WO2020091337A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
information
deep learning
analysis
sensitivity
Prior art date
Application number
PCT/KR2019/014265
Other languages
English (en)
Korean (ko)
Inventor
김원태
강신욱
이명재
김동민
김필수
Original Assignee
(주)제이엘케이인스펙션
Priority date
Filing date
Publication date
Application filed by (주)제이엘케이인스펙션 filed Critical (주)제이엘케이인스펙션
Publication of WO2020091337A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • The present disclosure relates to an image analysis apparatus and method. More specifically, the present disclosure relates to an apparatus and method that analyze whether an input image includes an object to be searched using a pre-trained deep learning-based model, and that efficiently generate the learning data necessary for training the deep learning-based model.
  • An electronic customs clearance system is a computerized customs clearance service for imports and exports, through which the efficiency of customs administration tasks between the parties involved can be improved.
  • A security inspection system computerizes the security inspection task of determining whether a passenger's belongings pose a safety or security problem, thereby enhancing security in the inspection area.
  • In deep learning, characteristic factors (features) are found automatically in the course of learning, and thus attempts to utilize deep learning in the artificial intelligence field are increasing.
  • the technical problem of the present disclosure is to provide an article search system to which a deep learning technique is applied.
  • Another technical problem of the present disclosure is to provide an apparatus and method for analyzing an image acquired in an article search system using a pre-trained deep learning based model.
  • Another technical problem of the present disclosure is to provide an apparatus and method for generating training data necessary for training a model based on deep learning.
  • Another technical problem of the present disclosure is to provide a user interface for effectively utilizing an article search system to which a deep learning technique is applied.
  • In an embodiment, the control information includes operation mode information. When the operation mode information indicates a busy mode, the analysis target image is a streaming image of the article, and when the operation mode information indicates a non-busy mode, the analysis target image is a single picture of the article.
  • According to an aspect of the present disclosure, an image analysis apparatus may be provided that includes an input unit for receiving an analysis target image of an article including at least one object and obtaining control information for detecting the at least one object, an image analysis unit for analyzing the analysis target image based on the control information and a deep learning-based model, and an output unit for outputting the analyzed result image, wherein the control information includes operation mode information, the analysis target image is a streaming image of the article when the operation mode information indicates a busy mode, and the analysis target image is a single picture of the article when the operation mode information indicates a non-busy mode.
  • According to another aspect of the present disclosure, an image analysis method may be provided that includes receiving an analysis target image of an article including at least one object, obtaining control information for detecting the at least one object, analyzing the analysis target image using a deep learning-based model based on the control information, and outputting the analyzed result image, wherein the control information includes operation mode information, the analysis target image is a streaming image of the article when the operation mode information indicates a busy mode, and the analysis target image is a single picture of the article when the operation mode information indicates a non-busy mode.
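  • As a rough illustration of how such operation mode information could steer the choice of analysis input, the following Python sketch selects either a streaming source or a single still frame. The mode labels, the stream/still inputs, and the simple dispatch logic are hypothetical stand-ins for illustration, not taken from the patent.

```python
from typing import Iterable, Iterator

import numpy as np

BUSY_MODE = "busy"          # hypothetical label for the busy operation mode
NON_BUSY_MODE = "non_busy"  # hypothetical label for the non-busy operation mode


def select_analysis_images(operation_mode: str,
                           stream: Iterable[np.ndarray],
                           still_image: np.ndarray) -> Iterator[np.ndarray]:
    """Yield the analysis target image(s) according to the operation mode.

    In busy mode the analysis target is a streaming image sequence of the
    article; in non-busy mode it is a single picture of the article.
    """
    if operation_mode == BUSY_MODE:
        # Busy mode: analyze frames as they arrive from the X-ray stream.
        for frame in stream:
            yield frame
    elif operation_mode == NON_BUSY_MODE:
        # Non-busy mode: analyze one still picture of the article.
        yield still_image
    else:
        raise ValueError(f"unknown operation mode: {operation_mode!r}")
```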
  • According to another aspect of the present disclosure, a computer-readable recording medium may be provided.
  • an article search system to which a deep learning technique is applied may be provided.
  • an apparatus and method for analyzing an image acquired in an article search system using a pre-trained deep learning-based model may be provided.
  • According to the present disclosure, a deep learning-based model that uses information about an acquired image and a reading target may be constructed for each reading target or reading purpose, thereby providing an electronic customs clearance system that delivers higher-quality prediction and analysis results.
  • FIG. 1 is a view for explaining an article search system according to an embodiment of the present disclosure.
  • FIG. 2 is a block diagram showing the configuration of an image analysis apparatus 200 according to an embodiment of the present disclosure.
  • FIG. 3 is a view for explaining an image reading process.
  • FIG. 4 is a diagram for explaining an application range of artificial intelligence in an image reading process according to an embodiment of the present disclosure.
  • FIG. 5 is a diagram illustrating an embodiment of an image enhancement device that performs image enhancement according to the present disclosure.
  • FIG. 6 is a diagram for explaining a process of classifying an object and a background from an image including a single object and generating location information of the object, according to an embodiment of the present disclosure.
  • FIG. 7 is a diagram illustrating an image in which colors are expressed based on physical properties of an object according to an embodiment of the present disclosure.
  • FIG. 8 is a view for explaining a process of generating an output image based on color distribution information of an image according to an embodiment of the present disclosure.
  • FIG. 9 is a diagram for explaining a process of obtaining a final output image that combines an image obtained by using color distribution information and an image obtained by applying edge-based filtering or smoothing filtering according to an embodiment of the present disclosure.
  • FIG. 10 is a view for explaining a process of obtaining a final output image using a graphical model according to an embodiment of the present disclosure.
  • FIG. 11 is a view for explaining an image enhancement method according to an embodiment of the present disclosure.
  • FIG. 12 is a diagram for explaining context analysis according to an embodiment of the present disclosure.
  • FIG. 13 is a diagram illustrating a process of generating and analyzing context information of an image according to an embodiment of the present disclosure.
  • FIG. 14 is a diagram for explaining a process in which an image analysis apparatus according to an embodiment of the present disclosure analyzes an image to identify an object.
  • FIG. 15 is a view for explaining the operation of the image analysis apparatus according to an embodiment of the present disclosure.
  • FIG. 16 is a diagram for explaining an embodiment of a convolutional neural network generating a multi-channel feature map.
  • FIG. 17 is a view for explaining an embodiment of a pooling technique.
  • FIG. 18 is a block diagram showing the configuration of an image synthesizing apparatus according to an embodiment of the present disclosure.
  • FIG. 19 is a diagram illustrating a process of generating a multi-object image using two images including a single object according to an embodiment of the present disclosure.
  • FIG. 20 is a diagram illustrating a process of training a convolutional neural network using a multi-object image according to an embodiment of the present disclosure.
  • FIG. 21 is a view for explaining a process of analyzing an actual image using an image synthesizing apparatus according to an embodiment of the present disclosure.
  • FIG. 22 is a diagram for explaining a method for synthesizing an image according to an embodiment of the present disclosure.
  • FIG. 23 is a diagram for describing a user interface according to an embodiment of the present disclosure.
  • FIG. 24 is a diagram for describing a user interface according to some embodiments of the present disclosure.
  • FIG. 25 is a diagram illustrating an administrator user interface according to an embodiment of the present disclosure.
  • FIG. 26 is another block diagram illustrating a configuration of an image analysis apparatus according to an embodiment of the present disclosure.
  • FIG. 27 is a view for explaining a method of setting an operation mode according to an embodiment of the present disclosure.
  • first and second are used only for the purpose of distinguishing one component from other components, and do not limit the order or importance of components, etc., unless otherwise specified. Accordingly, within the scope of the present disclosure, the first component in one embodiment may be referred to as the second component in other embodiments, and likewise the second component in one embodiment may be the first component in another embodiment It can also be called.
  • the components that are distinguished from each other are for clarifying each feature, and the components are not necessarily separated. That is, a plurality of components may be integrated to be composed of one hardware or software unit, or one component may be distributed to be composed of a plurality of hardware or software units. Accordingly, such integrated or distributed embodiments are included within the scope of the present disclosure, unless otherwise stated.
  • components described in various embodiments are not necessarily essential components, and some may be optional components. Accordingly, an embodiment comprised of a subset of components described in one embodiment is also included in the scope of the present disclosure. Also, embodiments that include other elements in addition to the elements described in various embodiments are included in the scope of the present disclosure.
  • FIG. 1 is a view for explaining an article search system according to an embodiment of the present disclosure.
  • The article search system 100 may include a reading unit 110 and/or a learning unit 120.
  • the reading unit 110 may include an image analysis device 112 and / or an output device 114.
  • the learning unit 120 may include a database 122, a deep learning learning unit 124, an algorithm verification unit 126, and / or a trained model storage unit 128.
  • the reading unit 110 may function as a reading interface, and the learning unit 120 may function as a centrally managed artificial intelligence data center.
  • The article search system according to the present disclosure may be utilized, for example, in an electronic customs clearance system or a security search system.
  • However, the article search system according to the present disclosure is not limited to such applications.
  • The article search system according to the present disclosure may be utilized in any system that serves to identify a specific article for various purposes.
  • the input 130 of the article search system 100 may include images, article information and / or control information.
  • the image may be an image of an article including at least one object.
  • it may be an X-Ray image of an article photographed by an X-Ray reading device.
  • The image may be a raw image photographed by an X-Ray imaging device, or an image in an arbitrary format for storing or transmitting the raw image.
  • The image may also be obtained by capturing, as data, the image information that is captured by an X-Ray reading device and transmitted to an output device such as a monitor.
  • the image may be enhanced before being output to the output device 114 or before being input to the image analysis device 112. The method of enhancing the image will be described later.
  • the output device 114 may output an image or an enhanced image.
  • the image analysis device 112 may receive an image or an enhanced image and perform an operation of the image analysis device 112 described later.
  • the article information may be information about the article included in the corresponding image.
  • The article information may include import declaration information and/or customs inventory list information.
  • The article information may include passer identification information, the passer's security level, and/or the passer's authorized item information.
  • The article information may be subjected to a predetermined pre-processing process before being input to the image analysis device 112.
  • For example, a product-name refinement operation may be performed on the product list, import information, and the like included in the article information.
  • Product-name refinement may refer to the operation of unifying the various names entered for the same or similar items.
  • Input of article information may be optional.
  • That is, the article search system 100 of the present disclosure can operate by receiving only an image as input, even when no article information is entered.
  • the article may include all kinds of articles as objects to be inspected or read.
  • the article may be at least one of express cargo, postal cargo, container cargo, traveler transport cargo, and traveler himself.
  • When the image analysis apparatus according to the present disclosure is used in a security search system, the article may be at least one of a passenger's belongings and the passenger himself or herself.
  • For example, when the electronic customs clearance system reads a traveler and the traveler is a traveler of interest with a history of transporting anomalous or dangerous objects, the traveler's cargo may be subjected to a higher level of analysis and/or reading than that of other travelers.
  • For example, the reader may be provided with information indicating that a particular item is the cargo of a traveler of interest.
  • Likewise, when a passer has a high security level, a higher level of analysis and/or reading may be performed on the passer's belongings than on those of other passers. For example, the reader may be provided with information indicating that a specific item belongs to a passer with a high security level.
  • control information may be information for controlling image reading or controlling the read image.
  • The control information may be input by the reader 140.
  • control information may include source information, manager information, operation mode information, read sensitivity information, and / or user interface information. The detailed use of control information will be described later.
  • the article search system 100 may receive an image, article information, and / or control information 130 and transmit it to the output device 114 or transmit it to the image analysis device 112.
  • the image analysis device 112 may analyze the input image using a pre-trained deep learning-based model.
  • the image analysis device 112 may transmit the analyzed result to the output device 114.
  • The output device 114 outputs the input image, article information and/or control information 130, and the image analysis result and/or user interface received from the image analysis device 112, and the reader 140 can read the output result of the output device 114.
  • As described above, a refining operation may be performed on the article information 130, and image enhancement may be performed on the analysis target image before it is input to the image analysis device 112 and/or before it is output to the output device 114.
  • The output device 114 includes any device capable of outputting a signal that can be perceived by humans, such as a device that outputs visual information (e.g., a monitor or a warning light), a device that outputs sound information (e.g., a speaker), or a device that outputs tactile information (e.g., a vibrator).
  • a user interface may be provided through the output device 114, and a reader may control the operation of the article retrieval system 100 using the user interface.
  • For example, the reader 140 may control the operation of the image analysis device by inputting control information using the output user interface.
  • When the image analysis result of the image analysis device 112 indicates that the image includes an object to be detected, an object with an abnormality, or an object whose risk level is greater than or equal to a threshold, the related information is output through the output device 114 as the image analysis result, and the reader 140 can confirm it.
  • the image analysis device 112 may perform various processes of analyzing an image to be analyzed. For example, the image analysis device 112 may perform context analysis to more accurately analyze an analysis target image. Various processes and context analysis performed by the image analysis device 112 will be described later.
  • the reader 140 may determine whether to perform an additional test based on an image analysis result output through the output device 114.
  • the additional inspection may include an opening inspection to directly open an article related to the corresponding image and check an object included in the corresponding article.
  • the object to be searched may refer to an object with an abnormality or an object with a risk greater than or equal to a threshold as described above.
  • the present invention is not limited thereto, and may include various objects to be detected or searched by the system of the present disclosure.
  • The image analysis result of the image analysis device, the opening inspection result input by the reader after directly performing the opening inspection, and/or matching result information obtained by the image analysis device matching the image and the article information may be transmitted to the learning unit 120.
  • the learning unit 120 may store newly received information in the database 122, and the deep learning learning unit 124 may perform deep learning learning using the information stored in the database 122. Alternatively, without being stored in the database 122, the deep learning learning unit 124 may directly receive all or part of the learning data.
  • the results learned by the deep learning learning unit 124 are verified by the algorithm verification unit 126, and the verified models may be stored as updated models in the trained model storage unit 128.
  • the model stored in the trained model storage unit 128 is transmitted to the image analysis device 112 again, and the image analysis device 112 may update and use the received model as the above-described pre-trained deep learning-based model.
  • the learning unit 120 may generate a composite image by receiving and synthesizing a plurality of images.
  • Virtual image analysis results, opening inspection results, and/or matching result information corresponding to the composite image may be generated using the image analysis results, opening inspection results, and/or matching result information for each of the plurality of images.
  • the learning unit 120 may use the composite image and the generated virtual information as learning data.
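  • As a minimal sketch of this idea, the snippet below composites two single-object images and merges their bounding-box annotations into a synthetic multi-object training sample. Combining X-ray images by taking a pixel-wise minimum (darker regions win) is an assumption made for illustration; the patent does not specify the compositing rule, and the annotation format shown is hypothetical.

```python
import numpy as np


def composite_training_sample(img_a: np.ndarray,
                              boxes_a: list[tuple[int, int, int, int]],
                              img_b: np.ndarray,
                              boxes_b: list[tuple[int, int, int, int]]):
    """Synthesize a multi-object sample from two single-object X-ray images.

    Each box is (x, y, width, height). Images are assumed to be the same
    size; in X-ray imagery overlapping objects attenuate more, so the
    pixel-wise minimum is used here as a crude stand-in for that effect.
    """
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape for this sketch")
    composite = np.minimum(img_a, img_b)
    # The virtual label of the composite is simply the union of the labels
    # of the source images (analysis results, inspection results, etc.
    # could be merged in the same way).
    boxes = list(boxes_a) + list(boxes_b)
    return composite, boxes


# Usage (illustrative): two 256x256 grayscale images, each with one object box.
a = np.full((256, 256), 255, dtype=np.uint8)
b = np.full((256, 256), 255, dtype=np.uint8)
a[40:120, 40:120] = 80    # object in image a
b[150:220, 100:200] = 60  # object in image b
sample, labels = composite_training_sample(a, [(40, 40, 80, 80)],
                                            b, [(100, 150, 100, 70)])
```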
  • the reading unit 110 and the learning unit 120 may be implemented as separate devices or may be implemented within the same device. In addition, some or all of the components included in the reading unit 110 and the learning unit 120 may be configured by hardware or software.
  • Artificial intelligence technology allows computers to learn data and make decisions on their own as if they were humans.
  • An artificial neural network is a mathematical model inspired by biological neural networks: a model in which artificial neurons, by changing the strength of their synaptic connections through learning, collectively acquire problem-solving ability.
  • An artificial neural network is generally composed of an input layer, hidden layers, and an output layer. The neurons included in each layer are connected through weights, and through linear combinations of weights and neuron values followed by nonlinear activation functions, the artificial neural network can approximate complex functions.
  • the purpose of artificial neural network learning is to find a weight that minimizes the difference in value between the output calculated from the output layer and the actual output.
  • Deep neural network is an artificial neural network consisting of several hidden layers between the input layer and the output layer, and can model complex nonlinear relationships through many hidden layers.
  • Learning with such a structure is called deep learning. Deep learning learns from very large amounts of data, and when new data is input, it can operate adaptively because it selects the answer with the highest probability based on the learning result. In the process of learning, characteristic factors (features) can be found automatically.
  • The deep learning-based model may include at least one of a fully convolutional neural network, a convolutional neural network, a recurrent neural network, a restricted Boltzmann machine (RBM), and a deep belief network (DBN), but is not limited thereto.
  • a machine learning method other than deep learning may also be included.
  • a hybrid model combining deep learning and machine learning may be included. For example, a feature of an image based on deep learning may be extracted, and a machine learning based model may be applied when classifying or recognizing an image based on the extracted feature. Models based on machine learning may include, but are not limited to, Support Vector Machines (SVM), AdaBoost, and the like.
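  • The following sketch illustrates one way such a hybrid could be wired together: a small convolutional network (defined here from scratch, assuming PyTorch) produces deep features, and a scikit-learn SVM classifies images based on those features. The network architecture, feature size, and class labels are placeholders for illustration, not the patent's model.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

# A tiny CNN used only as a deep feature extractor (architecture is illustrative).
feature_extractor = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),   # global average pooling -> 32-dim feature
    nn.Flatten(),
)


def extract_features(images: np.ndarray) -> np.ndarray:
    """images: (N, 1, H, W) float array -> (N, 32) deep features."""
    with torch.no_grad():
        return feature_extractor(torch.from_numpy(images).float()).numpy()


# Hypothetical training data: 20 single-channel images with binary labels
# (e.g., 1 = contains a search-target object, 0 = does not).
rng = np.random.default_rng(0)
train_images = rng.random((20, 1, 64, 64), dtype=np.float32)
train_labels = rng.integers(0, 2, size=20)

# Machine-learning classifier (SVM) on top of the deep features.
clf = SVC(kernel="rbf").fit(extract_features(train_images), train_labels)
pred = clf.predict(extract_features(rng.random((2, 1, 64, 64), dtype=np.float32)))
```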
  • a method of learning a model based on deep learning may include at least one of supervised learning, unsupervised learning, or reinforcement learning.
  • Supervised learning is performed using a series of learning data and a corresponding label (target output value), and the neural network model based on supervised learning is a model in which a function is inferred from training data.
  • Supervised learning receives a series of training data and the corresponding target output values, finds the error by comparing the actual output value with the target output value for the input data, and corrects the model based on the result.
  • Supervised learning can be divided into regression, classification, detection, and semantic segmentation. The function derived through supervised learning can be used to predict new results.
  • the neural network model based on supervised learning optimizes the parameters of the neural network model through learning a lot of training data.
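  • As a concrete illustration of this parameter optimization, the minimal sketch below (assuming PyTorch) trains a small classifier by repeatedly comparing its outputs with target labels and correcting the weights from the resulting error. The data, architecture, and hyperparameters are placeholders, not the patent's training setup.

```python
import torch
import torch.nn as nn

# Hypothetical labeled training data: 128 feature vectors with binary labels.
torch.manual_seed(0)
inputs = torch.randn(128, 16)
targets = torch.randint(0, 2, (128,))

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()             # error between output and target
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)  # compare actual vs. target output
    loss.backward()                         # propagate the error
    optimizer.step()                        # correct the model parameters
```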
  • A model based on deep learning may use information about an input image and an article for learning, and after a trained model is generated, information about images and articles acquired by the apparatus of the present disclosure can be used to update the neural network model.
  • In addition, the neural network model may be updated using analysis results output by the method of the present disclosure, for example, prediction results such as the abnormality or risk of an identified object, information about the object, and whether the identified object is a search-target object, as well as comparison information between the prediction result and the final opening inspection result, and evaluation or reliability information on the prediction result.
  • FIG. 2 is a block diagram showing the configuration of an image analysis apparatus 200 according to an embodiment of the present disclosure.
  • the image analysis device 200 of FIG. 2 is an embodiment of the image analysis device 112 of FIG. 1.
  • the image analysis device 200 may include an image receiving unit 210, an article information matching unit 220, and / or an image analyzing unit 230. As described above, since the input of the article information is optional, the image analysis apparatus 200 may not include the article information matching unit 220. Description of the input of the article information is as described with reference to FIG. 1.
  • the image receiving unit 210 may receive an image of an article including one or more objects.
  • the description of the image received by the image receiving unit 210 is as described with reference to FIG. 1.
  • The article information matching unit 220 may receive, as input, the article information and the image received by the image receiving unit 210, and perform matching between the article information and the image.
  • the description of the article information is as described with reference to FIG. 1.
  • Matched images and article information may be output to a reader to assist the reader in reading.
  • the matched image and article information may be transmitted to the learning unit 120 of FIG. 1 to be used for learning the deep learning model.
  • The matched image and article information may be stored in the database 122 of the learning unit 120 of FIG. 1 and then refined for each reading target and/or reading task, and the deep learning learning unit 124 can perform learning using the data refined for each reading target and/or reading task to which the model is to be applied.
  • the objects to be read may include express cargo, postal cargo, container cargo, traveler transport cargo, and traveler. Also, the object to be read may include a passenger's belongings and a passenger.
  • The reading task may include determining whether an object included in the article is abnormal or dangerous, determining whether an identified object is an object to be searched, determining whether information on the identified object matches the object, and determining whether the object has been reported or not.
  • the model trained in the learning unit 124 may be input to the image analysis unit 230 to update the existing model. At this time, suitable artificial intelligence may be updated according to the object to be read.
  • the learning unit 124 may generate new learning data using the existing learning data and use it for learning. As described above, new learning data can be generated by combining existing images and merging data.
  • The image analysis unit 230 may receive an image (the analysis target image), or an image and article information, analyze the image using a pre-trained deep learning-based model, and output the analysis result to the output device.
  • the image analysis unit 230 may identify an object included in the image, and determine whether there is an abnormality or risk for the identified object.
  • the image analysis unit 230 may improve the accuracy of object identification by performing a context analysis process described below.
  • the image analysis unit 230 may determine that the object is abnormal or dangerous.
  • the risk may be expressed as a numerical value, and it may be determined whether the object is a dangerous object through comparison with a predetermined threshold.
  • the numerical value related to the risk and / or the predetermined threshold may be adaptively determined according to a read target and / or a read task.
  • The image analysis unit 230 may perform more accurate analysis on the objects included in the image by using the image together with the article information. For example, the type, quantity, and/or size information of the items listed in the item list, the security level of the passer, and/or the authorized item information of the passer may additionally be used to identify objects from the image. When there is a discrepancy between the objects identified by analyzing the image and the article information, the discrepancy may be output as a result of the image analysis.
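  • A minimal sketch of such a cross-check is shown below: the objects identified in the image are compared, by type and count, against the declared item list, and any mismatch is reported as part of the analysis result. The data structures and field names are hypothetical.

```python
from collections import Counter


def find_discrepancies(identified_objects: list[str],
                       declared_items: dict[str, int]) -> dict:
    """Compare identified object types/counts against a declared item list.

    identified_objects: object type per detection, e.g. ["knife", "bottle"].
    declared_items: declared type -> declared quantity.
    Returns undeclared and missing items to be output with the analysis result.
    """
    found = Counter(identified_objects)
    undeclared = {t: n - declared_items.get(t, 0) for t, n in found.items()
                  if n > declared_items.get(t, 0)}       # more found than declared
    missing = {t: q - found.get(t, 0) for t, q in declared_items.items()
               if q > found.get(t, 0)}                   # declared but not found
    return {"undeclared": undeclared, "missing": missing}


# Usage with a hypothetical customs declaration.
result = find_discrepancies(["bottle", "bottle", "knife"],
                            {"bottle": 1, "laptop": 1})
# result == {"undeclared": {"bottle": 1, "knife": 1}, "missing": {"laptop": 1}}
```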
  • the image analysis result output by the image analysis unit 230 may include at least one of an object's risk, type, amount, number, size, and location.
  • the location of the object may be displayed on the image to be analyzed and output to the output device.
  • the position of the object may be displayed in coordinates, but the object may be highlighted and displayed at the corresponding position in the output image so that the reader can easily read it.
  • the object may be emphasized by highlighting the edge of the object or by displaying a square box surrounding the object.
  • a predetermined object area may be enhanced so that a reader can more easily identify the object through an image enhancement process described later.
  • Alternatively, an image region corresponding to a predetermined color may be enhanced so that the region can be more clearly identified.
  • the image analysis unit 230 may determine whether an object to be searched (eg, an object for which customs clearance is prohibited or inappropriate) is included in the analysis target image. To this end, the image analysis unit 230 may receive or store information about an object to be searched in advance. In addition, the image analysis unit 230 may identify an object included in the image and determine whether the identified object is a search target object.
  • FIG. 3 is a view for explaining an image reading process.
  • FIG. 3A is a flowchart of a conventional reading process
  • FIG. 3B is a flowchart of a reading process according to an embodiment of the present disclosure.
  • In the reading process according to an embodiment of the present disclosure, the image analysis apparatus 322 may analyze the image using a pre-trained deep learning-based model, and the analysis result is provided as information to the reader (324).
  • the image analysis device 322 may transmit the training data to the AI data center 323, and the AI data center 323 may learn the training data.
  • The artificial intelligence data center 323 may transmit the trained model to the image analysis device 322 as a reading-assist artificial intelligence for each reading target.
  • The reader may select (325) an item requiring an opening inspection based on the analysis result of the image analysis device 322, the image, and/or the article information.
  • The result of performing the opening inspection may be input (326) as an inspection result.
  • The inspection result may be transmitted to the artificial intelligence data center 323 and used as learning data.
  • FIG. 4 is a diagram for explaining an application range of artificial intelligence in an image reading process according to an embodiment of the present disclosure.
  • a sample 420 randomly extracted from all the items 410 may be selected 450 as a management object.
  • the risk analysis 440 for all the products 410 may be performed using the screening assistant artificial intelligence 430 for selecting the management object, and the management object may be selected 450 through this.
  • However, the application of artificial intelligence is not limited to the risk analysis 440 of articles described above.
  • When the management target is selected 450, artificial intelligence may also be used as an inspection-assist artificial intelligence 460 to assist the inspection.
  • By applying the inspection-assist artificial intelligence 460, the inspector's examination can be assisted by identifying objects, determining whether an identified object is abnormal or dangerous, and/or providing the inspector with information about the object to be searched.
  • the reader may perform a precise inspection 470 using information provided by the inspection assistant artificial intelligence.
  • FIG. 5 is a diagram illustrating an embodiment of an image enhancement device that performs image enhancement according to the present disclosure.
  • the image enhancement device of FIG. 5 may be configured separately from the image analysis device 112 of FIG. 1 or may be configured as a part thereof.
  • the image enhancement device 500 may include an image reception unit 510, an object image extraction unit 520, a color distribution analysis unit 530, and / or an image enhancement unit 540.
  • this only shows some components necessary to describe the present embodiment, and the components included in the image enhancement apparatus 500 are not limited to the above-described examples. For example, two or more components may be implemented in one component, or an operation executed in one component may be divided and implemented to be executed in two or more components. In addition, some components may be omitted or additional components may be added.
  • The image enhancement apparatus 500 receives the input image 550, extracts an object included in the input image 550, divides the object image including the object into one or more regions, obtains color distribution information for each of the one or more regions, determines one or more weights for at least some of the one or more regions based on the color distribution information, and generates the first output image 560 for the object image by applying the determined one or more weights to at least some of the one or more regions.
  • Each pixel constituting the image may have a predetermined brightness and color by a combination of a luminance value representing luminance (brightness) and a color value representing color.
  • the color value may be represented by a combination of values of three or more color elements according to various ways of expressing color.
  • the color value may be expressed as an RGB value that is a combination of three color elements (Red (R), Green (G), Blue (B)).
  • Each of R, G, and B has a value that expresses the intensity of that color element. The range of values that each of R, G, and B can take may be determined by the number of bits used to represent it; for example, with 8 bits, each of R, G, and B may have a value from 0 to 255.
  • Acquiring color distribution information may mean acquiring various statistical values that can be obtained by analyzing color components of color values of pixels included in a corresponding region.
  • the statistical value may be information on a color element having an average largest value among color elements of color values of pixels included in a corresponding region. For example, based on the sum of the values of R, G, and B of all pixels included in the corresponding area, it may be determined which color element has the largest sum or average among R, G, and B. Alternatively, for each pixel, the color element having the largest value among R, G, and B is determined as the dominant color of the corresponding pixel, and which color is determined as the dominant color for all pixels included in the corresponding region I can judge.
  • For example, if, for the color values of the majority of pixels included in a predetermined region, R has the largest value among the three color elements R, G, and B, the dominant color of the predetermined region may be judged to be red.
  • In the above, the color distribution information or dominant color is analyzed based on each of R, G, and B.
  • However, the present disclosure is not limited thereto, and the analysis may be based on various colors expressed by a combination of two or more of R, G, and B. For example, if the color to be identified is orange, whether the dominant color of the pixels in the corresponding area is orange may be determined based on the combination of some or all of R, G, and B that represents orange.
  • Hereinafter, an embodiment in which a region whose dominant color is red is the target of image enhancement, and the image is enhanced by applying weights, will be described in detail.
  • one or more weights may be determined for the corresponding region. Weights can be determined for all or part of R, G, B and luminance. For example, when enhancing red, the weight for R may be a value greater than one. Applying a weight may mean multiplying a color element value of a pixel in a corresponding area by a corresponding weight. In this case, the weight for G and / or B may be a value less than one. By doing so, the region where red is dominant can be strengthened to a region that is more red.
  • the enhancement of the image of the present disclosure is not limited to this, and may include both a change in color value or a change in brightness value. Therefore, if necessary, an image may be enhanced by applying a weight to a luminance value.
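  • A minimal numerical sketch of this dominant-color test and weight application is given below (NumPy assumed). The specific weights (1.4 for R, 0.9 for G and B) and the region contents are arbitrary illustration values, not prescribed by the patent.

```python
import numpy as np


def dominant_channel(region: np.ndarray) -> int:
    """Return the index (0=R, 1=G, 2=B) of the color element with the
    largest sum over all pixels of an (H, W, 3) region."""
    return int(np.argmax(region.reshape(-1, 3).sum(axis=0)))


def enhance_red_region(region: np.ndarray,
                       w_r: float = 1.4, w_gb: float = 0.9) -> np.ndarray:
    """If red dominates the region, multiply R by a weight > 1 and
    G/B by a weight < 1, making the region 'more red' (weights illustrative)."""
    out = region.astype(np.float32)
    if dominant_channel(region) == 0:          # red is dominant
        out[..., 0] *= w_r                     # strengthen R
        out[..., 1] *= w_gb                    # weaken G
        out[..., 2] *= w_gb                    # weaken B
    return np.clip(out, 0, 255).astype(np.uint8)


# Usage: a small reddish region of an 8-bit RGB image.
region = np.full((4, 4, 3), (180, 90, 80), dtype=np.uint8)
enhanced = enhance_red_region(region)
```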
  • the image receiving unit 510 may receive an input image 550 including one or more objects.
  • the input image 550 may be an image before being input to the image analysis device 112 and / or an image before being output to the output device 114.
  • the object image extracting unit 520 may extract an object included in the input image received from the image receiving unit 510 and divide the object image including the object into one or more regions. For example, the object image extractor 520 compares the pixel value of the analysis target image with a predetermined threshold to binarize the pixel values and group the binarized pixel values to extract objects included in the input image.
  • extracting an object may mean distinguishing an object from a background, an object may mean a specific object in an image, and the background may mean a portion excluding an object from an image.
  • the background of the image may be expressed in a predetermined color according to a method of photographing or a photographing device. For example, the predetermined color may be white. When a color representing the background of the image is specified, the background and the object may be separated based on the specified background color. For example, an object may be classified by deleting the specified background color area from the input image 550.
  • an object image may be obtained by specifying a bounding box surrounding an object area, and the object image extracting unit 520 may generate location information of the separated object based on the specified rectangle box.
  • the rectangular box may mean an object recognition box.
  • When the input image is an X-Ray image of an article photographed by an X-Ray reading device, the background portion other than the article is unnecessary, so the background portion can be cut out and the analysis can be performed using only the region in which the article exists.
  • In particular, obtaining the region of the article is important in a real environment in which articles continuously pass through the X-Ray reading device on a conveyor belt.
  • the object image extraction unit 600 of FIG. 6 may be an embodiment of the object image extraction unit 520 of FIG. 5.
  • the input image 610 may be the input image 550 described with reference to FIG. 5, for example, an image related to an article including the bag 612 as a single object.
  • The object image extraction unit 600 may first obtain a cropped image 620, in which the surrounding area around the bag 612 has been roughly cut away, by performing a cropping operation on the input image 610 including one bag 612. Then, the object image extractor 600 may obtain the binarized image 630 by comparing each pixel value of the cropped image 620 with a predetermined threshold and binarizing the pixel values. Then, the object image extracting unit 600 may obtain a grouped image 640 by grouping adjacent pixels (clustering, morphology, closing) to select the object portion in the binarized image 630.
  • Next, the object image extractor 600 performs labeling and hole-filling operations on the grouped image 640, determines the largest group of pixels as the region 652 for the object and the rest as the region 654 for the background, and thereby obtains the image 650 from which the object is extracted.
  • the object image extraction unit 600 may determine the location of the object in the input image 610 using information on the extracted object image. For example, the object image extraction unit 600 may specify a rectangular box surrounding the object area, and generate location information of the object based on the specified rectangular box. Referring to FIG. 6, the object image extraction unit 600 may specify a rectangular box 662 surrounding the bag 612 and obtain location information of the bag 612 based on the specified rectangular box. .
  • the location information of the bag 612 may be location information of four vertices forming the rectangular box 662, but is not limited thereto.
  • the location information may be represented by the coordinates (x, y) of one vertex of the rectangular box 662, and the width and height of the rectangular box.
  • the coordinates (x, y) of the one vertex may be the coordinates of the upper left corner of the rectangular box 662.
  • the coordinates (x, y) of the vertex may be specified based on the coordinates (0, 0) of the upper left corner of the input image 610.
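  • A rough OpenCV/NumPy sketch of this thresholding, grouping, labeling, and bounding-box step is shown below. The threshold value, kernel size, and the assumption that the object is darker than a bright background are illustrative choices; the patent does not fix these parameters.

```python
import cv2
import numpy as np


def extract_object_box(gray: np.ndarray, threshold: int = 200):
    """Extract the largest object region from a grayscale X-ray-like image
    with a bright background, and return its bounding box (x, y, w, h)."""
    # Binarize: pixels darker than the (illustrative) threshold become object pixels.
    _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY_INV)

    # Group adjacent pixels (morphological closing fills small gaps and holes).
    kernel = np.ones((5, 5), np.uint8)
    grouped = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Label connected components and keep the largest one as the object.
    num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(grouped)
    if num_labels <= 1:                       # only background found
        return None, None
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    mask = (labels == largest).astype(np.uint8) * 255

    # Bounding box of the object, as (x, y, width, height) with (0, 0)
    # at the upper-left corner of the input image.
    x, y, w, h = (stats[largest, cv2.CC_STAT_LEFT],
                  stats[largest, cv2.CC_STAT_TOP],
                  stats[largest, cv2.CC_STAT_WIDTH],
                  stats[largest, cv2.CC_STAT_HEIGHT])
    return mask, (int(x), int(y), int(w), int(h))
```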
  • the object image extractor 520 may divide the object image into one or more regions based on the size of the object image. Each of the one or more regions may be square.
  • the object image extraction unit 520 may determine the number or size of regions for dividing the object image based on the size of the object image. For example, when the object image is relatively large or has a size larger than a predetermined threshold, the object image may be divided to have more divided areas. Also, the sizes of the regions dividing the object image may not be the same.
  • When the object image is not square, the object image extractor 520 may convert it into a square by up-sampling or down-sampling, and then divide it into one or more square regions. For example, since the object image is obtained based on a rectangular box surrounding the extracted object, the object image may not be square. In this case, the object image extractor 520 may obtain a square object image by up-sampling or down-sampling the object image in the horizontal or vertical direction, and then divide the resulting square object image into one or more regions.
  • the object image 800 may not be square because it is composed of 9 pixels horizontally and 12 pixels vertically.
  • the shape of one or more regions dividing the object image is not limited to a square.
  • the region may have a form of nxm in which n and m are different positive integers. In this case, the aforementioned upsampling or downsampling may not be performed.
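  • The resampling and tiling step could look roughly like the NumPy/OpenCV sketch below, which resizes an object image so that its sides are multiples of the region size and then splits it into square regions. The region size of 3 pixels mirrors the 3x3 regions of FIG. 8 but is otherwise an arbitrary choice.

```python
import cv2
import numpy as np


def split_into_regions(object_image: np.ndarray, region: int = 3) -> list[np.ndarray]:
    """Resize an (H, W, C) object image so H and W are multiples of `region`
    (an up-/down-sampling step), then split it into region x region tiles."""
    h, w = object_image.shape[:2]
    new_h = max(region, round(h / region) * region)
    new_w = max(region, round(w / region) * region)
    resized = cv2.resize(object_image, (new_w, new_h))   # note: cv2 expects (width, height)
    tiles = []
    for y in range(0, new_h, region):
        for x in range(0, new_w, region):
            tiles.append(resized[y:y + region, x:x + region])
    return tiles


# Usage: a 9x12 object image (as in FIG. 8) is split into twelve 3x3 regions.
tiles = split_into_regions(np.zeros((12, 9, 3), dtype=np.uint8), region=3)
```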
  • the color distribution analysis unit 530 acquires color distribution information for each of the regions divided by the object image extraction unit 520, and based on the color distribution information, for at least some of the regions One or more weights can be determined.
  • the color distribution information may include information for each of n (n is an integer greater than 1) color expression ranges.
  • the "color expression range” may be defined for a color to be identified. In the above-described example, the color expression range of red is described as a reference, but the color expression range of green (G) or blue (B) may be defined. Alternatively, a range of color expression for arbitrary colors (yellow, orange, sky blue, etc.) expressed by combining some or all of R, G, and B may be defined.
  • When an object included in the image, for example the area of an object expressed in orange, is to be enhanced, the image enhancement according to the present disclosure may be performed by analyzing the color distribution information and applying a weight to the regions in which pixels falling within the orange color expression range are predominant or dominant.
  • the method of applying the weight is as described above.
  • the color distribution information may include information on some or all of the three color elements. If there are five color elements R, G, B, Y (yellow), and P (purple), the color distribution information may include information on some or all of the five color elements.
  • In the case of an X-Ray image of an article photographed by an X-Ray reading device, an X-Ray image to which different color expression ranges have been applied according to the properties of the objects included in the image (for example, whether an object is an organic substance, an inorganic substance, a metal, or the like) is used.
  • the reader can discriminate not only the shape of the object included in the image, but also the physical properties of the object.
  • the image enhancement of the present disclosure analyzes color distribution information using an X-Ray image to which color is added according to the physical properties of an object as an input image, and strengthens a region of a specific color based on this, thereby detecting an object included in the image. It can improve the accuracy and readability of the reader reading the image.
  • FIG. 7 is a diagram illustrating an image in which colors are expressed based on physical properties of an object according to an embodiment of the present disclosure.
  • a bag image 700 taken by an X-Ray reading device, a medicine container image 710 and a traveler luggage carrier image 720 are shown.
  • For the bag loop 702, the bag zipper 704, the medicine 712, and the bottle 722, it can be confirmed that the color expression range (the applied color) differs depending on the properties of the object.
  • The bag loop 702, the bag zipper 704, the medicine 712, and the bottle 722 are colored relatively clearly so that they can be distinguished from other objects, whereas, in the case of the arbitrary content 724 in the traveler's luggage image 720, it is difficult to determine what the content 724 is and it is not easy to distinguish it from other objects. This is due to the physical properties of the objects.
  • a metal or an inorganic material is expressed in a relatively clear and distinct color so that it can be clearly distinguished from a background, whereas an organic material is expressed in a light color so that the distinction from the background is not clear.
  • Accordingly, the area of the color representing organic matter can be enhanced to a vivid color that can be clearly distinguished from the background, through the method of enhancing the corresponding color.
  • the color distribution for each of the divided regions may be analyzed to apply weights to at least some regions.
  • the one or more weights may include weights for at least some of n color expression ranges or n color elements representing colors. For example, if one region has n color expression ranges or color elements, the number of weights in the region may have 1 to n.
  • the determined weight when one weight is determined for one area, the determined weight may be applied to all color elements or all color expression ranges included in the one area. Alternatively, the determined weight may be applied to at least a portion of all color elements or all color expression ranges included in the one area. For example, in order to enhance the image, the determined weight may be applied only to a predetermined color element among n color elements or a predetermined color expression range among n color expression ranges.
  • a weight may be determined for each of n color elements or n color expression ranges. That is, the number of weights for one region may be n. In this case, weights corresponding to each color element or color expression range included in the region may be applied to the corresponding color element or color expression range.
  • the weight may be given a relatively high weight for a predetermined color element or color expression range that is an object of image enhancement. For example, a weight greater than 1 may be given and multiplied by a value of a corresponding color element or a pixel value belonging to a corresponding color expression range.
  • a weight may be determined for each of m color elements greater than 1 and less than n or a color expression range. That is, the number of weights for one region may be m. In this case, the weighted weight may be applied only to a weighted color element or color expression range among color elements or color expression ranges included in the region. It is as described above that a relatively high weight is given to a predetermined color element or color expression range that is an object of image enhancement.
  • the weight may be relatively high for a predetermined color element or color expression range among n color elements or color expression ranges.
  • For an object that is an organic material, the boundary is often less clearly defined in the image than for objects having other physical properties (metal, inorganic, etc.). This is because the color of an organic object is not vivid enough to be distinguished from other objects or the background; for example, being expressed in light orange, it may not be well distinguished from a white background. Therefore, by applying a relatively high weight to the portions of the divided regions corresponding to the color expression range representing organic matter, the corresponding color can be enhanced, for example changing light orange to dark orange. By strengthening the image in this way, the object to be enhanced can be more clearly distinguished from surrounding objects or the background.
  • the predetermined color element or color expression range to which a relatively high weight is assigned may be one or more.
  • the predetermined color element or color expression range to which a relatively high weight is assigned may be 1 to n.
  • When the predetermined color elements or color expression ranges are plural, the degree of image enhancement required for each may differ, and accordingly different weights may be assigned to each. For example, when objects are displayed clearly in the order metal > inorganic > organic, a relatively high weight may be given only to the color element or color expression range for organic matter, or a higher weight may be given to inorganic and organic matter relative to metal. In that case, a relatively higher weight may be given to organic matter than to inorganic matter.
  • FIG. 8 is a view for explaining a process of generating an output image based on color distribution information of an image according to an embodiment of the present disclosure.
  • the object image 800 may be divided into one or more regions, such as the first region 810 and the second region 820.
  • the process of dividing regions in the object image 800 is as described with respect to the object image extractor 520 of FIG. 5.
  • a process of obtaining color distribution information and determining weights in the first area 810 will be described in detail.
  • The image enhancement device acquires color distribution information including information on five color expression ranges for the first area 810, and based on the obtained color distribution information, one or more weights can be determined for at least a part of the 3x3-sized area.
  • only information on a predetermined color expression range targeted for image enhancement may be obtained and used as color distribution information. For example, when the distribution information for a predetermined color expression range is greater than or equal to a predetermined threshold, the corresponding area is determined as a target for enhancement, and a relatively high weight can be given to the corresponding area.
  • The first color channel image 830, the second color channel image 840, the third color channel image 850, the fourth color channel image 860, and the fifth color channel image 870 may correspond to the color elements R, G, B, Y, and P, respectively.
  • Each of the first to fifth color channel images 830 to 870 is generated by mapping each pixel to a color channel image corresponding to the corresponding color information based on color information of each of the constituent pixels of the first region 810.
  • For example, the first pixel 812 is mapped to the pixel 852 at the corresponding position of the third color channel image 850, the second pixel 814 is mapped to the pixel 832 at the corresponding position of the first color channel image 830, the third pixel 816 is mapped to the pixel 872 at the corresponding position of the fifth color channel image 870, and the fourth pixel 818 is mapped to the corresponding position of the second color channel image 840.
  • Since there are n color expression ranges, up to n color channel images can be generated, but fewer than n color channel images may be obtained.
  • For example, in the first region 810, no pixel has a color corresponding to the fourth color channel image 860, so a total of four color channel images are obtained, excluding the fourth color channel image 860.
  • The weights a1, a2, a3, a4, and a5 can be applied to the first color channel image 830, the second color channel image 840, the third color channel image 850, the fourth color channel image 860, and the fifth color channel image 870, respectively.
  • the weight may be determined in consideration of the color distribution of pixels constituting each area, and for example, the weight may be determined to be proportional to the color distribution of pixels. Alternatively, the weight may be determined to have a relatively high weight for a predetermined color expression range and a relatively low weight for the rest of the color expression range.
  • the image enhancement unit 540 may generate a first output image for the object image by applying one or more weights determined by the color distribution analysis unit 530 to at least some of the one or more regions. .
  • The weighted first region 810-1 may be obtained by applying the weights a1, a2, a3, a4, and a5 and combining the weighted first to fifth color channel images. By repeating the above process for the remaining regions of the object image 800, the first output image may finally be generated.
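  • The per-region decomposition and weighted recombination of FIG. 8 could be sketched as below (NumPy assumed). Here each pixel of a region is assigned to exactly one of n color expression ranges by a caller-supplied classification function; the classifier, the five range labels, and the example weights are hypothetical stand-ins.

```python
from typing import Callable

import numpy as np


def enhance_region(region: np.ndarray,
                   classify: Callable[[np.ndarray], int],
                   weights: np.ndarray) -> np.ndarray:
    """Decompose an (H, W, 3) region into n color-channel images, weight each
    channel image, and recombine them into the weighted region.

    classify(pixel) -> index of the color expression range (0..n-1) the pixel
    belongs to; weights[k] is the weight applied to channel image k.
    """
    h, w, _ = region.shape
    n = len(weights)
    channels = np.zeros((n, h, w, 3), dtype=np.float32)
    for y in range(h):
        for x in range(w):
            k = classify(region[y, x])        # map pixel to its channel image
            channels[k, y, x] = region[y, x]
    weighted = channels * weights[:, None, None, None]   # apply a1..an
    combined = weighted.sum(axis=0)                       # recombine channels
    return np.clip(combined, 0, 255).astype(np.uint8)


# Usage with a toy classifier: the dominant color element decides the range
# (0=R, 1=G, 2=B); ranges 3 and 4 are unused here. Weights are illustrative.
weights = np.array([1.5, 1.0, 0.8, 1.0, 1.0], dtype=np.float32)
region = np.random.default_rng(0).integers(0, 256, (3, 3, 3)).astype(np.uint8)
enhanced = enhance_region(region, lambda p: int(np.argmax(p)), weights)
```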
  • the weight may be determined in consideration of the color distribution of pixels constituting each region, and a relatively high weight may be determined for a predetermined color expression range and a relatively low weight for the remaining color expression ranges.
  • For example, in each divided region, the portion corresponding to the color representing organic matter is not clearly distinguished from the background and its boundary is not clearly expressed in the image, so its weight may be set relatively high, whereas the portion corresponding to the color representing metal is relatively distinct from the background and its boundary is clearly expressed in the image, so its weight may be set relatively low. As described above, applying a weight may mean replacing a pixel in the enhanced region with the new pixel value obtained by multiplying it by the weight.
• For example, when red is the color targeted for enhancement, a relatively high weight can be set for the color expression range corresponding to red. However, the target is not limited thereto, and any color may be determined as a target color.
  • the predetermined threshold and / or weight may be arbitrarily determined, or may be determined based on accumulated image processing information. Alternatively, by performing learning on the threshold and / or weight through an AI-based learning model, the optimal threshold and / or weight may be continuously updated.
  • the image enhancement unit 540 may generate a second output image for the object image by applying edge-based filtering or smoothing filtering on at least some of the one or more regions. Also, the image enhancement unit 540 may generate a third output image for the object image based on the generated first output image and second output image.
• Edge-based filtering or smoothing filtering is a technique for enhancing the contrast of an image and may include, but is not limited to, Wiener filtering, unsharp mask filtering, histogram equalization, linear contrast adjustment, and the like; any technique capable of enhancing the contrast of the image may be included.
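• As a minimal sketch of this step, the following example applies unsharp masking followed by histogram equalization as one possible contrast-enhancement filtering for the second output image, and blends it with the first output image to obtain a third output image. OpenCV is assumed, the inputs are assumed to be 8-bit grayscale images of the same size, and the kernel size, sharpening amount, and blend ratio alpha are illustrative values rather than parameters disclosed herein.

```python
import cv2

def second_output(gray_region, ksize=(5, 5), amount=1.5):
    """Example contrast-enhancement filtering: unsharp masking followed by histogram equalization."""
    blurred = cv2.GaussianBlur(gray_region, ksize, 0)
    sharpened = cv2.addWeighted(gray_region, 1.0 + amount, blurred, -amount, 0)
    return cv2.equalizeHist(sharpened)

def third_output(first_out, second_out, alpha=0.8):
    """Combine the weighted image (first output) with the filtered image (second output).
    Using alpha > 0.5 keeps the influence of the second output relatively small."""
    return cv2.addWeighted(first_out, alpha, second_out, 1.0 - alpha, 0)
```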
  • FIG. 9 is a diagram for explaining a process of obtaining a final output image that combines an image obtained by using color distribution information and an image obtained by applying edge-based filtering or smoothing filtering according to an embodiment of the present disclosure.
• The object image 900, the first region 910, and the weighted first region 910-1 of FIG. 9 may correspond to the object image 800, the first region 810, and the weighted first region 810-1 of FIG. 8, respectively.
• For example, the image enhancement unit 540 may generate a filtered first region 910-2 by applying the filtering to the first region 910, and may combine the weighted first region 910-1 and the filtered first region 910-2 to generate a final first region 910-3.
  • the image enhancement unit 540 may generate a second output image in which the above filtering techniques are applied to the remaining regions, and a third output image combining the first output image and the second output image.
  • the process of generating a weighted area (eg, 910-1), a filtered area (eg, 910-2), and / or a final area 910-3 using the two may be performed in units of areas.
  • the present invention is not limited thereto, and the process may be performed in units of object images.
  • a weighted object image (first output image) may be obtained by performing a process of applying a weight to each of the regions included in the object image.
• Similarly, a filtered object image (second output image) and a final image (third output image) may be obtained in units of object images. For example, the second output image may be combined with the first output image in such a way that its influence on the first output image is relatively small.
• For example, the weight for the color distribution information corresponding to the color representing an organic substance may be determined to be relatively higher. Also, for example, by combining the first output image and the second output image, more accurate object recognition may be possible even when multiple objects in the image overlap.
  • FIG. 10 is a view for explaining a process of obtaining a final output image using a graphical model according to an embodiment of the present disclosure.
• The image enhancement apparatus may determine each of the color expression ranges included in the color distribution information as an individual node, and may generate a graphical model having a hierarchical structure using the relative relationships between the determined individual nodes and the relative relationships among the first output image, the second output image, and the third output image.
  • a first output image 1020 may be obtained by applying a weight to each of the corresponding divided regions or the color expression ranges of the divided regions.
  • the first output image 1020 may be determined as the final output image.
• Alternatively, a second output image 1030 obtained by applying a contrast enhancement technique to the image may be further generated, and a third output image 1040 may be generated based on the first output image 1020 and the second output image 1030.
  • FIG. 11 is a view for explaining an image enhancement method according to an embodiment of the present disclosure.
  • the image enhancement method of FIG. 11 is a method performed by the image enhancement apparatus of FIG. 5, and the description of the image enhancement apparatus of FIG. 5 may be applied to the image enhancement method of FIG. 11.
• In step S1100, an input image may be received.
  • an object included in the input image may be extracted. For example, by comparing the pixel value of the input image with a predetermined threshold, the pixel value is binarized and the binarized pixel value is grouped to extract an object included in the analysis target image.
  • the object image including the object may be divided into one or more regions.
  • the number or size of regions for dividing the object image may be determined based on the size of the object image.
  • the sizes of the regions dividing the object image may not be the same.
• The object image can be divided into one or more regions after up-sampling or down-sampling to convert the object image into a square.
  • color distribution information may be obtained for each of the one or more regions.
  • the color distribution information may include information for each of n (n is an integer greater than 1) color expression range.
  • one or more weights may be determined for at least some of the one or more regions based on the color distribution information.
• The one or more weights may include weights for at least some of the n color expression ranges. For example, if one region has n color expression ranges, the number of weights for the region may be from 1 to n.
  • a first output image for the object image may be generated by applying the determined one or more weights to at least some of the one or more regions.
  • a second output image for the object image may be generated by applying edge-based filtering or smoothing filtering to at least some of the one or more regions. Also, for example, a third output image for the object image may be generated based on the generated first output image and second output image.
  • the input image may be an image including two or more objects.
  • two or more objects and a background can be distinguished from the input image, and location information can be generated and used for each of the two or more objects.
• In this case, not only the pixel group formed in the largest shape but also the other pixel groups are each determined as a region for an object. The process of generating location information for each determined object is the same as described for an image including one object.
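• As a concrete illustration of the binarization and pixel grouping described in the steps above, the following sketch thresholds a grayscale image, groups the binarized pixels with connected-component labeling, and returns one bounding box per grouped object. OpenCV is assumed, and the threshold value, the dark-object-on-bright-background assumption, and the minimum area used to discard noise are illustrative choices, not values disclosed herein.

```python
import cv2

def extract_objects(gray_image, threshold=200, min_area=500):
    """Binarize by comparing pixel values with a threshold, group the binarized pixels,
    and return a bounding box (x, y, w, h) for each grouped object region."""
    # In X-ray images the background is often bright, so pixels darker than the
    # threshold are treated here as object pixels (an illustrative assumption).
    _, binary = cv2.threshold(gray_image, threshold, 255, cv2.THRESH_BINARY_INV)
    num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    boxes = []
    for label in range(1, num_labels):          # label 0 is the background
        x, y, w, h, area = stats[label]
        if area >= min_area:                    # ignore tiny noise groups
            boxes.append((x, y, w, h))
    return binary, boxes
```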
  • At least some of the components of the image enhancement apparatus and steps of the image enhancement method of the present disclosure may be performed using an AI-based or deep learning-based model.
• For example, the size and number of regions generated by dividing an object image, the weights determined based on color distribution information, the various thresholds mentioned in the present disclosure, whether a second output image is generated, and the like may be learned using an artificial intelligence-based or deep learning-based model, and information according to the trained model may be used.
  • the image analysis device 1200 of FIG. 12 may be an embodiment of the image analysis device 112 of FIG. 1. Alternatively, the image analysis device 1200 of FIG. 12 may be included in the image analysis device 112 of FIG. 1, or may be configured separately to perform context analysis.
  • the image analysis apparatus 1200 may include a feature extraction unit 1210, a context generation unit 1220, and / or a feature and context analysis unit 1230.
• The image analysis apparatus 1200 may extract characteristics of an input image (analysis target image), generate context information based on the extracted characteristics, and analyze the analysis target image based on the extracted characteristics and the generated context information. For example, the image analysis apparatus 1200 may classify an image using the extracted feature and the generated context information or locate the object of interest.
  • the input image of the image analysis apparatus 1200 may be the same as the input image of the image analysis apparatus 112 of FIG. 1.
  • the feature extraction unit 1210 may analyze the input image to extract features of the image.
  • the feature may be a local feature for each region of the image.
  • the feature extraction unit 1210 may extract characteristics of an input image using a general convolutional neural network (CNN) technique or a pooling technique.
  • the pooling technique may include at least one of a max (max) pooling technique and an average pooling technique.
  • the pooling technique referred to in the present disclosure is not limited to the Max pooling technique or the average pooling technique, and includes any technique for obtaining a representative value of an image region of a predetermined size.
  • the representative value used in the pooling technique may be at least one of a variance value, a standard deviation value, a mean value, a most frequent value, a minimum value, and a weighted average value, in addition to the maximum value and the average value.
  • the convolutional neural network of the present disclosure can be used to extract “features” such as borders, line colors, and the like from input data (images), and may include a plurality of layers. Each layer may receive input data and process input data of the corresponding layer to generate output data.
  • the convolutional neural network may output a feature map generated by convolution of an input image or an input feature map with filter kernels as output data.
  • the initial layers of the convolutional neural network can be operated to extract low level features such as edges or gradients from the input.
  • the next layers of the neural network can extract progressively more complex features, such as the eyes and nose. The detailed operation of the convolutional neural network will be described later with reference to FIG. 16.
  • the convolutional neural network may include a pooling layer in which a pooling operation is performed in addition to a convolutional layer in which a convolution operation is performed.
  • the pooling technique is a technique used to reduce the spatial size of data in the pooling layer.
  • the pooling technique includes a max pooling technique that selects a maximum value in a corresponding region and an average pooling technique that selects an average value in a corresponding region.
• In general, a max pooling technique is used in the pooling layer.
• The pooling window size and spacing (stride) are generally set to the same value.
• The stride means the interval by which the filter is moved when applying the filter to the input data, and the stride can also be used to adjust the size of the output data.
• The detailed operation of the pooling technique will be described later with reference to FIG. 17.
• As pre-processing for extracting the features of an analysis target image, the feature extraction unit 1210 may apply filtering to the analysis target image.
  • the filtering may be a Fast Fourier Transform (FFT), histogram equalization, motion artifact removal, or noise removal.
  • the filtering of the present disclosure is not limited to the above-listed methods, and may include all types of filtering capable of improving the image quality.
• In addition, as the pre-processing, the image enhancement described with reference to FIGS. 5 to 11 may be performed.
  • the context generation unit 1220 may generate context information of the input image (analysis target image) using the features of the input image extracted from the feature extraction unit 1210.
  • the context information may be a representative value representing all or part of an image to be analyzed.
  • the context information may be global context information of the input image.
  • the context generation unit 1220 may generate context information by applying a convolutional neural network technique or a pooling technique to features extracted from the feature extraction unit 1210.
  • the pooling technique may be, for example, an average pooling technique.
  • the feature and context analysis unit 1230 may analyze an image based on the feature extracted by the feature extraction unit 1210 and the context information generated by the context generation unit 1220.
• The feature and context analysis unit 1230 according to an embodiment may concatenate the local features of each region of the image extracted by the feature extraction unit 1210 with the context reconstructed by the context generation unit 1220, and may use them together to classify the input image or to find the location of the object of interest included in the input image. Since the information at a specific two-dimensional position in the input image includes not only local feature information but also global context information, the feature and context analysis unit 1230 can use this information to more accurately recognize or classify input images whose actual contents are different but whose local feature information is similar.
• As such, the invention according to an embodiment of the present disclosure enables more accurate and efficient learning and image analysis by using global context information as well as the local features used by general convolutional neural network techniques.
  • the neural network to which the invention according to the present disclosure is applied may be referred to as 'deep neural network through context analysis'.
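• A minimal PyTorch sketch of this idea is given below: local CNN features are concatenated with a globally pooled context vector broadcast to every position before analysis. The layer sizes, the number of classes, and the class name are arbitrary assumptions for illustration and do not represent the disclosed network architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextAwareAnalyzer(nn.Module):
    """Concatenate local CNN features with a global context vector before per-position analysis."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(                  # local feature extraction (CNN + pooling)
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(64 * 2, num_classes, kernel_size=1)

    def forward(self, x):
        local_feat = self.features(x)                   # (N, 64, H, W): per-region local features
        context = F.adaptive_avg_pool2d(local_feat, 1)  # (N, 64, 1, 1): global context (average pooling)
        context = context.expand_as(local_feat)         # broadcast the context to every position
        combined = torch.cat([local_feat, context], dim=1)
        return self.classifier(combined)                # per-position class scores using both cues

# Example: scores = ContextAwareAnalyzer()(torch.randn(1, 3, 128, 128))
```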
  • FIG. 13 is a diagram illustrating a process of generating and analyzing context information of an image according to an embodiment of the present disclosure.
  • the feature extraction unit 1310, the context generation unit 1320, and the feature and context analysis unit 1330 of FIG. 13 are the feature extraction unit 1210, the context generation unit 1220, and the feature and context analysis of FIG. 12, respectively. It may be an embodiment of the unit 1230.
  • the feature extractor 1310 may extract a feature from the input image 1312 using the input image 1312 and generate a feature image 1314 that includes the extracted feature information.
  • the extracted feature may be a feature for a local area of the input image.
  • the input image 1312 may include an input image of an image analysis device or a feature map at each layer in a convolutional neural network model.
  • the feature image 1314 may include a feature map and / or feature vector obtained by applying a convolutional neural network technique and / or a pooling technique to the input image 1312.
  • the context generation unit 1320 may generate context information by applying a convolutional neural network technique and / or a pooling technique to the feature image 1314 extracted by the feature extraction unit 1310.
  • the context generating unit 1320 may generate context information of various scales, such as an entire image, a quadrant area, and a 9-section area, by variously adjusting the spacing of the pooling.
• For example, a whole context information image 1322 including context information for the full-size image, a quadrant context information image 1324 including context information for quarter images obtained by dividing the entire image into four parts, and a 9-part context information image 1326 including context information for images obtained by dividing the entire image into nine parts may be obtained.
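• The multi-scale context described above can be sketched with adaptive average pooling at three output sizes, as shown below. The function name and the use of PyTorch are assumptions for illustration; the disclosure only requires that pooling be applied with different spacings to obtain context at the whole-image, quadrant, and 9-part scales.

```python
import torch
import torch.nn.functional as F

def multi_scale_context(feature_map):
    """Generate context information at three scales from a (N, C, H, W) feature map:
    the whole image (1x1), quadrants (2x2), and a 9-part grid (3x3)."""
    whole   = F.adaptive_avg_pool2d(feature_map, output_size=1)   # full-image context
    quarter = F.adaptive_avg_pool2d(feature_map, output_size=2)   # quadrant context
    ninth   = F.adaptive_avg_pool2d(feature_map, output_size=3)   # 9-part context
    return whole, quarter, ninth

# Example: contexts = multi_scale_context(torch.randn(1, 64, 36, 36))
```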
  • the feature and context analysis unit 1330 may more accurately perform analysis on a specific region of an analysis target image using both the feature image 1314 and the context information images 1322, 1324, and 1326.
• For example, from the feature image 1314 including only the local features extracted by the feature extraction unit 1310, it may be impossible to accurately determine whether the identified object is a car or a boat. That is, the feature extraction unit 1310 may recognize the shape of the object based on the local features, but may not accurately identify and classify the object using only the shape of the object.
  • the context generation unit 1320 can more accurately identify and classify objects by generating context information 1322, 1324, and 1326 based on the analysis target image or the feature image 1314.
• For example, the feature extracted for the entire image may be recognized or classified as "natural landscape", the feature extracted for the quarter image as "lake", and the feature extracted for the 9-part image as "water"; the extracted features "natural landscape", "lake", and "water" may then be generated and utilized as context information.
  • the feature and context analysis unit 1330 may identify an object having a shape of the boat or vehicle as a "boat" by utilizing the context information.
• In the embodiment described with reference to FIG. 13, generating and utilizing context information for the entire image, context information for the quarter image, and context information for the 9-part image has been described, but the size of the image from which context information is extracted is not limited thereto.
  • context information for an image having a size other than the above-described image may be generated and utilized.
  • FIG. 14 is a diagram for explaining a process in which an image analysis apparatus according to an embodiment of the present disclosure analyzes an image to identify an object.
  • the image analysis device 1400 may accurately identify and / or classify objects included in the image 1410 by receiving the image 1410 and generating information on image regions of various sizes.
  • the input image 1410 may be, for example, an X-ray image including a bag.
  • the image analysis device 1400 analyzes the input image 1410 as described above, extracts features for the entire image, and features for some areas of the image, and accurately identifies the objects included in the image 1410 using the image analysis. can do.
  • the feature 1422 for the entire image may be, for example, a feature for the shape of the bag.
  • Features for some areas of the image may include, for example, features 1424 for handles, features 1426 for zippers, features 1428 for rings, and the like.
  • the image analysis apparatus 1400 can accurately identify that the object included in the image 1410 is a "bag” by using the generated features 1422, 1424, 1426, and 1428 as context information.
• Conversely, if such features are not identified, the image analysis device 1400 may not identify that the object included in the image 1410 is a "bag", or may provide an analysis result indicating that the object included in the image 1410 cannot be identified as a "bag".
• Alternatively, it may be output that there is an abnormality in the corresponding object. For example, when an irregular space, a space of a certain thickness, or the like that is not related to the normal characteristics of a "bag" is detected, a signal indicating that the corresponding "bag" is abnormal may be output.
• As described above, when contextual information that is not related to the normal contextual information is included, this fact may be output to the reader, and based on this, the reader may perform a close inspection or an opening inspection of the object in the corresponding image.
• FIG. 15 is a view for explaining the operation of the image analysis apparatus according to an embodiment of the present disclosure.
  • the image analysis device may extract characteristics of the image to be analyzed.
  • the image analysis apparatus may extract characteristics of an input image using a general convolutional neural network technique or a pooling technique.
  • the characteristic of the analysis target image may be a local characteristic for each region of the image, and the pooling technique may include at least one of a max pooling technique and an average pooling technique.
• In step S1510, the image analysis device may generate context information based on the feature extracted in step S1500.
  • the image analysis apparatus may generate context information by applying a convolutional neural network technique and / or a pooling technique to features extracted in step S1500.
  • the context information may be a representative value representing all or part of an image to be analyzed.
  • the context information may be global context information of the input image.
  • the pooling technique may be, for example, an average pooling technique.
• In step S1520, the image analysis device may analyze the analysis target image based on the feature extracted in step S1500 and the context information generated in step S1510.
• For example, the image analysis apparatus may classify the input image or find the location of the object of interest included in the input image by combining the local features of each region extracted in step S1500 with the global context reconstructed in step S1510. Accordingly, since the information at a specific 2D position in the input image spans from local information to the global context, more accurate recognition or classification of input images having different actual contents but similar local information is possible. Alternatively, it is possible to detect an object containing contextual information that is not related to the other contextual information.
• FIG. 16 is a diagram for explaining an embodiment of a convolutional neural network generating a multi-channel feature map.
  • the image processing based on the convolutional neural network can be used in various fields.
• For example, it can be used for an image processing device for object recognition of an image, an image processing device for image reconstruction, an image processing device for semantic segmentation, an image processing device for scene recognition, and the like.
  • the input image 1610 may be processed through the convolutional neural network 1600 to output a feature map image.
  • the output feature map image can be utilized in various fields described above.
  • the convolutional neural network 1600 may be processed through a plurality of layers 1620, 1630, and 1640, and each layer may output multi-channel feature map images 1625 and 1635.
  • the plurality of layers 1620, 1630, and 1640 may extract characteristics of an image by applying a filter having a constant size from the upper left to the lower right of the received data.
  • the plurality of layers 1620, 1630, and 1640 multiply the weights of the upper left NxM pixels of the input data and map them to one neuron in the upper left of the feature map.
  • the multiplied weight will also be NxM.
  • the NxM may be, for example, 3x3, but is not limited thereto.
• The plurality of layers 1620, 1630, and 1640 scan the input data from left to right and from top to bottom while moving by k cells at a time, multiplying by the weights and mapping the results to the neurons of the feature map.
• Here, k means the stride by which the filter is moved when performing the convolution, and may be appropriately set to adjust the size of the output data.
• For example, k may be 1.
• The NxM weight is called a filter or filter kernel. That is, the process of applying a filter in the plurality of layers 1620, 1630, and 1640 is a process of performing a convolution operation with the filter kernel, and the result extracted as a consequence is referred to as a "feature map" or "feature map image".
  • the layer on which the convolution operation is performed may be referred to as a convolutional layer.
  • multiple-channel feature map refers to a set of feature maps corresponding to a plurality of channels, and may be, for example, a plurality of image data. It may be an input from an arbitrary layer, or an output according to a result of a feature map operation such as a convolution operation, etc.
• The multi-channel feature maps 1625 and 1635 are generated by the plurality of layers 1620, 1630, and 1640, which are referred to as "feature extraction layers" or "convolutional layers" of the convolutional neural network.
• Each layer sequentially receives the multi-channel feature maps generated in the previous layer and generates the next multi-channel feature maps as outputs.
• Finally, the L-th (L is an integer) layer 1640 receives the multi-channel feature maps generated by the (L-1)-th layer (not shown) and can generate multi-channel feature maps (not shown).
• The feature maps 1625 having K1 channels are the outputs of the feature map operation 1620 in layer 1 for the input image 1610, and become the inputs for the feature map operation 1630 in layer 2.
• Likewise, the feature maps 1635 having K2 channels are the outputs of the feature map operation 1630 in layer 2 for the input feature maps 1625, and become the inputs for the feature map operation in layer 3 (not shown).
  • the multi-channel feature maps 1625 generated in the first layer 1620 include feature maps corresponding to K1 (K1 is an integer) channels.
  • the multi-channel feature maps 1635 generated in the second layer 1630 include feature maps corresponding to K2 (K2 is an integer) channels.
  • K1 and K2 representing the number of channels may correspond to the number of filter kernels used in the first layer 1620 and the second layer 1630, respectively. That is, the number of multi-channel feature maps generated in the M (M is an integer of 1 or more and L-1 or less) layer may be the same as the number of filter kernels used in the M layer.
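• The relationship between the number of filter kernels and the number of output channels can be sketched as follows; the channel counts, image size, and kernel size are arbitrary values chosen for illustration only.

```python
import torch
import torch.nn as nn

K1, K2 = 16, 32                                    # numbers of filter kernels in layers 1 and 2
layer1 = nn.Conv2d(in_channels=3,  out_channels=K1, kernel_size=3, stride=1, padding=1)
layer2 = nn.Conv2d(in_channels=K1, out_channels=K2, kernel_size=3, stride=1, padding=1)

image = torch.randn(1, 3, 64, 64)                  # input image (illustrative size)
feature_maps_1 = layer1(image)                     # K1-channel feature maps, analogous to 1625
feature_maps_2 = layer2(feature_maps_1)            # K2-channel feature maps, analogous to 1635
print(feature_maps_1.shape, feature_maps_2.shape)  # torch.Size([1, 16, 64, 64]) torch.Size([1, 32, 64, 64])
```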
• FIG. 17 is a view for explaining an embodiment of a pooling technique.
• In FIG. 17, it is assumed that the pooling window size is 2x2 and the stride is 2, and max pooling may be applied to the input image 1710 to generate the output image 1790.
• First, a 2x2 window is applied to the upper left of the input image 1710, and a representative value (here, the maximum value 4) of the values in the window area is calculated and input to the corresponding position 1720 of the output image 1790.
  • the window is moved by stride, that is, by 2, and a maximum value 3 of the values in the window 1730 area is input to a corresponding position 1740 of the output image 1790.
  • the process is repeated from the position below the stride from the left of the input image. That is, as illustrated in (c) of FIG. 17, the maximum value 5 of the values in the window 1750 area is input to the corresponding position 1760 of the output image 1790.
  • the window is moved by the stride, and a maximum value 2 of the values in the window 1770 area is input to a corresponding position 1780 of the output image 1790.
  • the above process may be repeatedly performed until a window is located in the lower right area of the input image 1710, thereby generating an output image 1790 that applies pooling to the input image 1710.
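• The walkthrough of FIG. 17 can be reproduced with a short NumPy sketch; the 4x4 input values below are made up purely so that the window maxima come out as 4, 3, 5, and 2, matching the description above.

```python
import numpy as np

def max_pool(image, window=2, stride=2):
    """Slide a window over the image and keep the maximum value at each window position."""
    h, w = image.shape
    out = np.empty((h // stride, w // stride), dtype=image.dtype)
    for i in range(0, h - window + 1, stride):
        for j in range(0, w - window + 1, stride):
            out[i // stride, j // stride] = image[i:i + window, j:j + window].max()
    return out

# Illustrative 4x4 input whose window maxima are 4, 3, 5, and 2.
x = np.array([[1, 4, 2, 3],
              [0, 2, 1, 0],
              [5, 1, 0, 2],
              [3, 2, 1, 1]])
print(max_pool(x))   # [[4 3]
                     #  [5 2]]
```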
  • FIG. 18 is a block diagram showing the configuration of an image synthesizing apparatus according to an embodiment of the present disclosure.
  • the image synthesis device 1800 includes an object image extraction unit 1810, an object location information generation unit 1820, an image synthesis unit 1830, and / or an object detection deep learning model learning unit 1840. It can contain. However, this only shows some components necessary to describe the present embodiment, and the components included in the image synthesizing apparatus 1800 are not limited to the above-described examples. For example, two or more components may be implemented in one component, or an operation executed in one component may be divided and implemented to be executed in two or more components. In addition, some components may be omitted or additional components may be added. Or, among the components of the image analysis device 112 of FIG. 1, the image enhancement device 500 of FIG. 5, the image analysis device 1200 of FIG. 12, and the image synthesis device 1800 of FIG. 18, the same function or similar The component performing the function may be implemented as one component.
• The image synthesizing apparatus 1800 may receive a first image including a first object and a second image including a second object, distinguish an object and a background for each of the first image and the second image, generate location information of the first object and location information of the second object, and generate a third image including the first object and the second object based on the location information of the first object and the location information of the second object. It can also train an object detection deep learning model using the location information of the first object, the location information of the second object, and the third image.
  • the input image 1850 may include an image including a single object.
  • the description of the input image 1850 is the same as the description of the input image described with reference to FIG. 1 and the like.
  • the object image extraction unit 1810 may receive an image 1850 including a single object and distinguish the received image into an object and a background.
  • the description of the object image extraction unit 1810 is the same as the description of the object image extraction unit 520 described with reference to FIGS. 5 and 6.
  • the object location information generation unit 1820 may determine the location of the object extracted from the object image extraction unit 1810. For example, the object location information generating unit 1820 specifies a bounding box surrounding the object area, and generates location information of the object classified by the object image extraction unit 1810 based on the specified square box. can do.
  • the description of the method for generating the location information of the object is the same as the description for the method with reference to FIG. 6.
• Since the location information of an object included in an image can be automatically generated in this way, the hassle of a reader having to manually input the location information of an object for each image for artificial intelligence learning can be avoided.
• The image synthesizing unit 1830 can generate a multi-object image using a plurality of single object images obtained through the object image extraction unit 1810 and the object location information generation unit 1820. For example, for a first image including a first object and a second image including a second object, the location information of the first object and the location information of the second object are obtained through the object image extraction unit 1810 and the object location information generation unit 1820, respectively, and the image synthesis unit 1830 can generate a third image including the first object and the second object based on the obtained location information of the first object and the location information of the second object.
  • a detailed process of generating a multi-object image will be described in more detail with reference to FIG. 19.
  • the image synthesizing unit 1900 of FIG. 19 is an embodiment of the image synthesizing unit 1830 of FIG. 18.
• Using the first single object image 1910 and the second single object image 1920 obtained through the object image extraction unit and the object location information generation unit, the image synthesizing unit 1900 can obtain a synthesized multi-object image 1940 and location information 1950 for the objects (the first single object and the second single object) included in the multi-object image 1940.
  • the image synthesizing unit 1900 may also use an image 1930 for a background separated from an object when synthesizing the first single object image 1910 and the second single object image 1920.
• In this case, the location information of the first single object image 1910 and the location information of the second single object image 1920 may be arbitrarily modified, and image synthesis may be performed based on the modified location information. By doing so, it is possible to generate a myriad of synthetic images and virtual location information.
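• The compositing step can be sketched as follows: single-object crops and their boxes are pasted onto a copy of a background image at randomly shifted locations, and the new bounding boxes are recorded. The pasting-by-overwrite rule, the random jitter range, and the function name are illustrative assumptions; any placement or blending scheme consistent with the description above could be used.

```python
import random
import numpy as np

def synthesize(background, objects_with_boxes, jitter=50):
    """Paste single-object crops onto a copy of the background at arbitrarily shifted
    locations and return the composite image plus the new bounding boxes."""
    canvas = background.copy()
    new_boxes = []
    bg_h, bg_w = canvas.shape[:2]
    for crop, (x, y, w, h) in objects_with_boxes:        # crop has shape (h, w, ...)
        # Arbitrarily modify the original location while staying inside the image bounds.
        nx = int(np.clip(x + random.randint(-jitter, jitter), 0, bg_w - w))
        ny = int(np.clip(y + random.randint(-jitter, jitter), 0, bg_h - h))
        canvas[ny:ny + h, nx:nx + w] = crop              # simple overwrite; blending is also possible
        new_boxes.append((nx, ny, w, h))
    return canvas, new_boxes

# Example: third_image, boxes = synthesize(background_img, [(crop1, box1), (crop2, box2)])
```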
  • the object detection deep learning model learning unit 1840 may train an object detection deep learning model using location information of a first object, location information of a second object, and a third image.
  • the object detection deep learning model learning unit 1840 may train the convolutional neural network model.
  • the location information of the first object, the location information of the second object, and the third image may be used for training the convolutional neural network model.
  • the object detection deep learning model learning unit 2000 of FIG. 20 is an embodiment of the object detection deep learning model learning unit 1840 of FIG. 18.
  • a multi-object image 2010 synthesized by using single object images and location information of objects may be used as data necessary for learning.
  • the object detection deep learning model learning unit 2000 may train the convolutional neural network 2020 by projecting the location information of each single object with respect to the multi-object image 2010.
• In an actual search environment, an X-ray image in which a plurality of objects overlap may be obtained. Since the convolutional neural network is trained using the shape of each object together with the location information of the object, more accurate detection results can be obtained even if overlapping occurs between objects.
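• One way to package a synthesized multi-object image together with the projected location information of each single object as a training sample for a detection-style convolutional neural network is sketched below. The target format (corner-coordinate boxes plus class indices) follows common detection training conventions and is an assumption for illustration, not the disclosed training procedure.

```python
import torch

def to_training_sample(multi_object_image, boxes, class_ids):
    """Pack a synthesized image and the projected per-object location information
    into the (image, target) pair expected by typical detection training loops."""
    image = torch.as_tensor(multi_object_image).permute(2, 0, 1).float() / 255.0  # HWC -> CHW
    target = {
        "boxes": torch.tensor(
            [[x, y, x + w, y + h] for (x, y, w, h) in boxes], dtype=torch.float32
        ),                                                  # (x1, y1, x2, y2) per object
        "labels": torch.tensor(class_ids, dtype=torch.int64),
    }
    return image, target

# Example: sample = to_training_sample(third_image, boxes, class_ids=[1, 2])
```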
• FIG. 21 is a view for explaining a process of analyzing an actual image using an image synthesizing apparatus according to an embodiment of the present disclosure.
  • the image synthesizing apparatus 2100 of FIG. 21 is an embodiment of the image synthesizing apparatus 1800 of FIG. 18.
• The image synthesizing apparatus 2100 includes an object image extraction unit 2104, an object location information generation unit 2106, an image synthesis unit 2108, and an object detection deep learning model learning unit for a plurality of single object images 2102, and the operations of these components are the same as those of the object image extraction unit 1810, the object location information generation unit 1820, the image synthesis unit 1830, and the object detection deep learning model learning unit 1840 included in the image synthesis device 1800 of FIG. 18.
  • an object detection device 2120 may detect each object using the convolutional neural network model trained by the image processing device 2100 for the image 2122 including multiple objects in a real environment.
• In other words, the image synthesizing apparatus 2100 of the present disclosure can newly generate an image containing multiple objects based on the extraction of single object regions from X-ray images.
• Accordingly, the object detection device 2120 can find the areas where the multiple objects included in an article passing through the X-ray searcher exist. By automatically extracting the locations of the objects in the X-ray image in this way, the reader can perform the image inspection task more easily, and the information on the quantity of the extracted objects and the objects included in the article can also be used for tasks such as comparison with the computerized information.
• FIG. 22 is a diagram for explaining a method for synthesizing an image according to an embodiment of the present disclosure.
• In step S2200, a first image including a first object and a second image including a second object may be input, and an object and a background may be distinguished for each of the first image and the second image. For example, by comparing the pixel values of the input images with a predetermined threshold, the pixel values are binarized, and the objects included in the input images can be distinguished by grouping the binarized pixel values.
• In step S2210, location information of the separated first object and second object may be generated. For example, a rectangular box surrounding each object area may be specified, and based on the specified rectangular box, the location information of the objects classified in step S2200 may be generated.
• In step S2220, a third image including the first object and the second object may be generated. For example, the third image including the first object and the second object may be generated based on the location information of the first object and the location information of the second object obtained in step S2210.
  • the object detection deep learning model may be trained using the location information of the first object, the location information of the second object, and the third image.
• For example, a convolutional neural network model may be trained, and the location information of the first object and the location information of the second object generated in step S2210 and the third image generated in step S2220 may be used for training the convolutional neural network model.
  • the present invention is not limited thereto, and the input image may be an image including two or more objects.
  • two or more objects and a background can be distinguished from the input image, and location information can be generated and used for each of the two or more objects.
  • a third image may be generated using two or more single object images and location information of each object. That is, the image processing method and apparatus according to the present disclosure may generate a third image based on two or more images each including one or more objects and location information of each object.
• Hereinafter, a user interface that may be provided to control the article search system according to the present disclosure and control information of the image analysis apparatus will be described with reference to FIGS. 23 to 27.
  • FIG. 23 is a diagram for describing a user interface according to an embodiment of the present disclosure.
• Referring to FIG. 23, an example of a user interface that can be provided to a reader through an output device will be described.
  • the user interface illustrated in FIGS. 23 to 25 is an example, and the detailed configuration of the present disclosure is not limited thereto, and various types of user interfaces capable of providing the same function to the reader are the scope of the present disclosure. Can be included in
  • the user interface 2300 of FIG. 23 may be provided to the reader 140 through the output device 114 of FIG. 1.
• The reader may use the user interface 2300 through the input of the control information 130 of FIG. 1.
  • the control information may mean information used to control the operation of the image analysis device through a user interface.
  • the control information may refer to information related to the operation of the image analysis device input through the user interface.
• The user interface 2300 for inputting control information may include, for example, at least one of a rectangular box setting UI 2302, an operation mode setting UI 2304, a sensitivity setting UI 2306, a sensitivity default restoration UI 2308, an image size default restoration UI 2310, a detection function default setting UI 2312, a detection history inquiry UI 2314, and an image processing function UI 2316.
• Through the rectangular box setting UI 2302, the reader can determine whether a rectangular box is shown on the detection execution screen.
  • the square box may mean an object recognition box.
  • the reader can also set the operation mode of the image analysis device through the operation mode setting UI 2304.
  • the operation mode of the video analysis device may include a busy mode and a non-busy mode.
  • the reader can enable the article retrieval system to effectively collect learning data when multiple objects are input to the conveyor belt of the X-ray reading device during the same time period.
  • the operation of the busy mode and the non-busy mode of the image analysis device will be described later.
  • the reader can determine the intensity of the deep learning based model used for object detection through the sensitivity setting UI 2306.
  • Sensitivity in the present specification may mean a degree of determination as to whether the detected object corresponds to an object to be detected.
  • the reader can set AI sensitivity to object detection in various ways, taking into account the degree of overlap of objects, the size of each object, and the importance of detection.
• Setting the sensitivity for a specific object high may mean that the artificial intelligence more readily determines an arbitrary object to be that specific object. That is, a high sensitivity may mean that the image analysis apparatus determines an arbitrary object as the specific object even in a situation in which it is unclear whether the arbitrary object is the specific object.
• Sensitivity in this disclosure can be expressed as a percentage value.
• For example, when the sensitivity for a specific object is set to 50%, it may mean that the image analysis device is configured to detect a case where the probability that an object is determined as the specific object exceeds 50%. That is, in this specification, sensitivity may mean the strength of the deep learning-based model used for object detection.
• The reader can use the sensitivity setting UI 2306 to set the sensitivity of the AI to be applied differently for each type of object.
• For example, FIG. 23 shows a case in which a sensitivity of 94% is applied for laptops, 99% for mobile phones, 98% for hard drives (HDDs), 100% for USBs, and 94% for tablets.
• Since a USB may be significantly smaller in size than other objects, it may be desirable to set the AI sensitivity for it higher than for other objects.
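• Following the reading above in which the sensitivity value acts as the probability threshold for reporting a detection (the 50% example), per-object-type sensitivities such as those shown in FIG. 23 could be applied as in the sketch below. The preset dictionary keys, the detection tuple format, and the default value are illustrative assumptions rather than disclosed details.

```python
# Illustrative sensitivity preset (percentages), following the example values of FIG. 23.
SENSITIVITY_PRESET = {"laptop": 94, "mobile_phone": 99, "hdd": 98, "usb": 100, "tablet": 94}

def filter_detections(detections, preset=SENSITIVITY_PRESET, default=50):
    """Report a detection only when the model's probability for that object type
    exceeds the sensitivity configured for the type (expressed as a percentage)."""
    kept = []
    for class_name, probability, box in detections:       # probability in [0, 1]
        threshold = preset.get(class_name, default) / 100.0
        if probability > threshold:
            kept.append((class_name, probability, box))
    return kept

# Example: filter_detections([("laptop", 0.97, (10, 20, 200, 150))]) reports the laptop.
```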
  • the types of objects disclosed in FIG. 23 are exemplary and the scope of the present invention is not limited thereto.
  • the reader may restore the sensitivity setting according to the type of the object to the default value using the sensitivity default value restoration UI 2308.
  • the default value for all sensitivities can be set to 50%.
• Alternatively, the reader can configure the sensitivity default values as a preset and input them to the image analysis device. In this case, the reader can restore the preset sensitivity values using the sensitivity default value restoration UI 2308.
  • the user interface 2300 may include a size adjustment UI that can adjust the size of the image output to the output device to an arbitrary size.
  • the reader can adjust the size of the object image output to the output device through the size adjustment UI.
• Through the image size default restoration UI 2310, the reader can restore the size of the object image output to the output device to the default value.
  • the size of the reconstructed basic image can be set in advance by a reader.
  • the basic image size may be determined in units of pixels, or may be determined based on an actual size of an image output through the output device.
  • the reader may designate a predetermined AI sensitivity preset or select a preset preset using the detection function basic setting UI 2312.
  • the preset may be determined based on the size of the object to be detected, or may be determined based on the importance, safety, or security of each object to be detected. Readers can directly create AI presets used in the present disclosure.
• Alternatively, the preset may be determined as an optimal preset for a specific task from a deep learning-based model learned in advance. That is, the image analysis apparatus may be provided, from the learning unit 120 disclosed in FIG. 1, not only with the deep learning-based model used for object detection but also with a sensitivity preset of the artificial intelligence used in the object detection process.
• Hereinafter, the sensitivity preset of the artificial intelligence is simply referred to as a preset.
  • the preset provided by the learning unit 120 may be determined by referring to the field and item information in which the item search system is utilized. For example, in a situation where detection of a customs clearance item such as firearms or drugs is prioritized, a high sensitivity is set to firearms or drugs and a preset set with low sensitivity to general passenger cargo will be provided from the learning unit 120. Can be. As another example, when the product search system is used to search for goods for a group of passers who want to enter a space where photography or video shooting is prohibited, a high sensitivity is set for a camera, a video recording device, or a recorder, and is used for general passer's belongings. Presets set with low sensitivity may be provided from the learning unit 120.
  • the reader can add or delete an object to be detected by using the detection function basic setting UI 2312, or set a preset for the object image size.
  • the addition or deletion of an object to be detected may also be determined through a preset provided from the learning unit 120.
  • the reader can query the history of the previously output object image using the detection history inquiry UI 2314.
  • the detection history inquiry UI 2314 may provide an object image and article information on the object image to a reader.
• The reader can use the detection history inquiry UI 2314 to determine whether an object image detected in the past is suitable for use as training data, and to decide whether or not the learning unit uses it as training data for deep learning-based model training. As the reader excludes incorrect learning data, the image analysis apparatus has the effect of achieving higher deep learning-based detection accuracy.
  • the reader may perform image processing corresponding to the image processing function of the X-ray reading device on the current object image using the image processing function UI 2316.
• FIG. 24 is a diagram for describing a user interface according to some embodiments of the present disclosure.
  • the object detection user interface 2400 illustrated in FIG. 24 may include a detection result UI 2402, a control information UI 2404, and / or an object image screen UI 2406.
  • the object image obtained by the detection function of the image analysis device may be provided to the reader through the object image screen UI 2406.
• As the detection result of the image analysis device, object information included in the object image screen may be shown in the detection result UI 2402.
• The reader may compare the object image shown through the object image screen UI 2406 with the object detection result shown through the detection result UI 2402 to determine whether to perform an opening inspection on the current object.
• FIG. 24 shows an example in which a notebook computer is detected as the object 2408 in the current object image.
  • the reader may determine whether a rectangular box displaying the location of the object is displayed on the object image screen UI 2406 through the rectangular box setting UI 2304 described above.
  • the control information UI 2404 illustrated in FIG. 24 may be an embodiment of the user interface 2300 of FIG. 23.
  • the reader can control the image analysis device to operate according to the application using the control information UI 2404.
  • the operation of each UI included in the control information UI 2404 is the same as described with reference to FIG. 23 and will be omitted.
• FIG. 25 is a diagram illustrating an administrator user interface according to an embodiment of the present disclosure.
• The article search system according to the present disclosure can be controlled not only by a reader at the search site but also by an administrator who manages the article search from a location remote from the search site.
  • the administrator may control the product retrieval system through a separate device disposed at the retrieval site, or may control the product retrieval system through a separate device from a remote site.
• FIG. 25 is an example of an administrator login UI 2500 for explaining the administrator mode operation.
  • the administrator can monitor the search screen of the search site through the administrator mode login.
  • the screen monitored by the administrator may include the object detection user interface 2400 described in FIG. 24.
  • the screen monitored by the administrator may include only the object image screen UI 2406 and the object detection result UI. That is, the manager can monitor the video analysis device using the manager device.
  • the manager can use a manager device to deliver a work order to a reader at the site or to manage the search history.
  • the administrator may manage the database 122 stored in the learning unit illustrated in FIG. 1 or manage learning data through the manager device.
  • the administrator mode may be set through a separate UI provided by the image analysis device, or may be set by a separate article search server controlling the article search system.
• FIG. 26 is another block diagram illustrating a configuration of an image analysis apparatus according to an embodiment of the present disclosure.
  • the image analysis device 2600 may include an input unit 2610, a control unit 2620, and / or a UI generation unit 2630.
  • two or more components may be implemented in one component, or an operation executed in one component may be divided and implemented to be executed in two or more components.
  • some components may be omitted or additional components may be added.
• In addition, among the components of the devices described above, components performing the same or similar functions may be implemented as one component.
• The input unit 2610 receives an image provided from an X-ray reading device and product information and/or control information provided from a reader, and provides them to the control unit 2620.
  • the input unit 2610 may be defined as a receiving unit.
  • the control unit 2620 may manage images, product information, and / or control information, and control operations of the image analysis device.
  • the control unit 2620 may control a specific detection operation of the image analysis apparatus based on the control information and perform object detection on the received image.
  • the controller 2620 may provide an image of the object detection result to the UI generator 2630.
  • the mode setting unit 2621 may determine an operation mode of the image analysis device as one of a busy mode and a non-busy mode.
  • the busy mode may be a mode set in a situation in which a large number of objects to be detected are input to an X-ray reading device.
  • the non-busy mode may be a mode that is set in a situation where a relatively small number of objects to be detected is input to an X-ray reading device.
  • the mode setting unit 2621 may receive necessary control information by the operation mode setting UI 2304 described with reference to FIG. 23.
  • the busy mode and the non-busy mode of the image analysis device 2600 will be described in more detail below.
  • the sensitivity control unit 2622 may set sensitivity of a deep learning based model used for object detection.
  • the sensitivity control unit 2622 may receive necessary control information by the sensitivity setting UI 2306 described with reference to FIG. 23. Also, the sensitivity setting unit 2622 may determine the sensitivity according to the learned preset provided from the learning unit 2650.
  • the information controller 2623 may perform object detection on the input image or provide learning data obtained as a result of object detection to the learning unit 2650.
  • the detailed operation of the information control unit 2623 is the same as that of the image enhancement device 500 of FIG. 5, the image analysis device 1200 of FIG. 12, and the image synthesis device 1800 of FIG. 18, and thus descriptions thereof will be omitted here. .
  • the input unit 2610 may receive an administrator command from the administrator and provide it to the control unit 2620.
  • the control unit 2620 may control the operation of the image analysis device according to the received administrator command.
• The UI generation unit 2630 may generate a user interface that visually provides the user with the object image and item information generated by the control unit 2620.
  • the user can be a reader and / or an administrator.
  • the output device 2640 may display information processed by the image analysis device 2600.
  • the output device 2640 may display execution screen information of an application program driven by the image analysis device 2600, or a user interface according to the execution screen information, and graphical user interface (GUI) information.
• The output device 2640 may include at least one of a liquid crystal display (LCD), a thin film transistor-liquid crystal display (TFT LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional display (3D display), and an electronic ink display (e-ink display).
  • the output device 2640 disclosed by FIG. 26 is illustrated as being configured as a separate device from the image analysis device 2600, but the scope of rights of the present disclosure is not limited thereto.
  • the output unit having the same function as the output device may be configured as part of the image analysis device 2600.
  • the image analysis device may further include an administrator command control unit that transmits object image and product information to the administrator device.
• FIG. 27 is a view for explaining a method of setting an operation mode according to an embodiment of the present disclosure.
• A large number of people pass through ports, airports, research facilities, and the like during the same time period, so cases in which multiple objects overlap in an image occur more frequently.
• In this case, a problem occurs in that the object recognition efficiency of the image analysis device is reduced. Furthermore, this reduction in object recognition also lowers the learning efficiency of the object detection deep learning model.
  • an embodiment of the present disclosure proposes a method of operating an image analysis device in a busy mode or a non-busy mode.
  • the busy mode may mean an operation mode applied to a situation in which many objects are input to an X-ray reading device at the same time.
  • the image analysis device may receive a user input for a current operation mode from a reader or an administrator.
  • the user input for the current operation mode may be made through the operation mode setting UI 2304 described with reference to FIG. 23.
  • the input for the operation mode may be automatically determined by an object detection deep learning model provided by the learning unit as well as input by a user.
  • the image analysis apparatus may detect that an object image has a greater number of objects than a threshold, and set the current operation mode to a busy mode based on this.
  • the operation mode may be determined by product information or current time information.
• For example, when the article search device according to the present disclosure is used in a research facility, it is highly probable that more passers-by will pass through the article search system at the times of arriving at or leaving work, so the image analysis device can determine the current operation mode in consideration of the current time.
  • the image analysis device may determine a current operation mode according to a task to be performed. For example, when an article search system is used for a baggage inspection, etc., in which a search for a large number of cargoes needs to be performed for a short time, the video analysis device may operate by determining the current mode as a busy mode.
• In step S2710, the image analysis device may check whether the current operation mode is set to the busy mode or the non-busy mode.
  • the image analysis device may receive the streaming image from the X-ray reading device as an image input for the current object. Since the streaming image is composed of a plurality of pictures, in this case, the image analysis apparatus may receive a plurality of reference pictures for one object.
  • the streaming image may mean not only a plurality of pictures that are consecutive in time, but also a plurality of pictures having a predetermined interval obtained based on an image transmitted by an X-ray reading device. That is, in the present disclosure, the streaming image may mean a plurality of pictures obtained from real-time images.
  • the image analysis device may receive a single picture from the X-ray reading device as an image input to the current object.
  • When the operation mode of the image analysis device is set to the busy mode, the image analysis device may request the X-ray reading device to sequentially transmit the streaming image without stopping the conveyor belt.
  • When the operation mode of the image analysis device is set to the non-busy mode, the image analysis device may request the X-ray reading device to temporarily stop the conveyor belt for each object and to transmit a single picture of the stationary object. A sketch of this mode-dependent acquisition is given below.
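The sketch below illustrates the acquisition behaviour just described. The XRayReader interface and its methods (stream_pictures, stop_conveyor, capture_single_picture, resume_conveyor), the sampling interval, and the picture limit are hypothetical placeholders; the disclosure does not specify an API for the X-ray reading device.

```python
from typing import Iterable, List, Protocol

import numpy as np


class XRayReader(Protocol):
    """Assumed interface to the X-ray reading device."""
    def stream_pictures(self, interval_s: float) -> Iterable[np.ndarray]: ...
    def stop_conveyor(self) -> None: ...
    def resume_conveyor(self) -> None: ...
    def capture_single_picture(self) -> np.ndarray: ...


def acquire_images(reader: XRayReader, busy: bool,
                   max_pictures: int = 8) -> List[np.ndarray]:
    """Return the picture(s) to analyse for the current object."""
    if busy:
        # Busy mode: keep the conveyor belt moving and take pictures from the
        # real-time stream at a predetermined interval.
        pictures: List[np.ndarray] = []
        for picture in reader.stream_pictures(interval_s=0.2):
            pictures.append(picture)
            if len(pictures) >= max_pictures:
                break
        return pictures
    # Non-busy mode: stop the belt for this object and take one still picture.
    reader.stop_conveyor()
    try:
        return [reader.capture_single_picture()]
    finally:
        reader.resume_conveyor()
```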
  • The image analysis device may perform object detection on the input image or picture using a previously trained deep learning-based model.
  • When operating in the busy mode, the image analysis device may perform object detection by applying the deep learning-based model to the streaming image. Since the streaming image is composed of a plurality of pictures, the image analysis device in this case performs object detection for each of the plurality of pictures included in the streaming image.
  • Because object detection in the busy mode is performed over a plurality of pictures, the image analysis device may synthesize the object detection results for the plurality of pictures and provide the combined detection result to the reader. In the non-busy mode, by contrast, the image analysis device may provide the detection result for the single picture to the reader, as sketched below.
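The sketch below shows per-picture detection followed by synthesis of the results. The Detection and Detector types and the merging rule (keep the highest-scoring detection per class) are illustrative assumptions, not a method defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Sequence, Tuple

import numpy as np


@dataclass
class Detection:
    label: str                          # e.g. "knife", "battery"
    score: float                        # model confidence in [0, 1]
    box: Tuple[int, int, int, int]      # (x1, y1, x2, y2) in picture coordinates


class Detector(Protocol):
    """Assumed interface to the pre-trained deep learning-based model."""
    def detect(self, picture: np.ndarray) -> List[Detection]: ...


def detect_object(detector: Detector,
                  pictures: Sequence[np.ndarray]) -> List[Detection]:
    """Run the model on every picture and merge the results for the reader."""
    best_per_label: Dict[str, Detection] = {}
    for picture in pictures:                  # one picture in non-busy mode,
        for det in detector.detect(picture):  # several in busy mode
            current = best_per_label.get(det.label)
            if current is None or det.score > current.score:
                best_per_label[det.label] = det
    # Synthesized result: one entry per detected class, highest confidence first.
    return sorted(best_per_label.values(), key=lambda d: d.score, reverse=True)
```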
  • The image analysis apparatus may train the object detection deep learning model using the streaming image or the single picture in which the object was detected.
  • When the image analysis device operates in the busy mode, object detection is performed on a plurality of pictures, so the image analysis device may provide the object detection results for the plurality of pictures as training data for the deep learning model.
  • When the image analysis device operates in the non-busy mode, it may provide the object detection result for a single object as training data for the deep learning model. A sketch of how such training samples might be assembled is given below.
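The following sketch assembles training samples from the detection results, reusing the Detection dataclass from the previous sketch. The sample layout and the confidence filter are assumptions; in practice, confirmation by the reader might gate what enters the training set.

```python
from typing import Dict, List, Sequence

import numpy as np


def build_training_samples(pictures: Sequence[np.ndarray],
                           detections_per_picture: Sequence[List["Detection"]],
                           min_score: float = 0.5) -> List[Dict]:
    """Pair each picture with its sufficiently confident detections."""
    samples: List[Dict] = []
    for picture, detections in zip(pictures, detections_per_picture):
        boxes = [d for d in detections if d.score >= min_score]
        if not boxes:
            continue  # skip pictures with nothing usable as a label
        samples.append({
            "image": picture,
            "boxes": [d.box for d in boxes],
            "labels": [d.label for d in boxes],
        })
    # In busy mode this yields several samples per object (one per picture);
    # in non-busy mode it yields a single sample per object.
    return samples
```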
  • The exemplary methods of the present disclosure are expressed as a series of operations for clarity of description, but this is not intended to limit the order in which the steps are performed; if necessary, steps may be performed simultaneously or in a different order.
  • To implement the method according to the present disclosure, additional steps may be included along with the illustrated steps, some of the illustrated steps may be omitted, or additional steps may be included while some of the illustrated steps are excluded.
  • The various embodiments of the present disclosure may be implemented by hardware, firmware, software, or a combination thereof.
  • In the case of implementation by hardware, the embodiments may be implemented by one or more ASICs (Application Specific Integrated Circuits), DSPs (Digital Signal Processors), DSPDs (Digital Signal Processing Devices), PLDs (Programmable Logic Devices), FPGAs (Field Programmable Gate Arrays), general-purpose processors, controllers, microcontrollers, or microprocessors.
  • The scope of the present disclosure includes software or machine-executable instructions (e.g., operating systems, applications, firmware, programs, etc.) that cause operations according to the methods of the various embodiments to be executed on a device or computer, and a non-transitory computer-readable medium in which such software or instructions are stored and are executable on a device or computer.
  • The present invention can be utilized in the field of analyzing images or videos.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present invention relates to an image analysis method comprising the steps of: receiving an analysis target image relating to an object that includes at least one entity; obtaining control information for detecting the at least one entity; analyzing the analysis target image using the control information and a deep learning-based model; and outputting an image of the analyzed result, wherein the control information includes operation mode information, and when the operation mode information indicates a busy mode, the analysis target image may be a streaming image relating to the object, and when the operation mode information indicates a non-busy mode, the analysis target image may be a single image relating to the object.
PCT/KR2019/014265 2018-10-31 2019-10-28 Appareil et procédé d'analyse d'image WO2020091337A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2018-0131479 2018-10-31
KR1020180131479A KR101990123B1 (ko) 2018-10-31 2018-10-31 영상 분석 장치 및 방법

Publications (1)

Publication Number Publication Date
WO2020091337A1 true WO2020091337A1 (fr) 2020-05-07

Family

ID=67102950

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/014265 WO2020091337A1 (fr) 2018-10-31 2019-10-28 Appareil et procédé d'analyse d'image

Country Status (2)

Country Link
KR (1) KR101990123B1 (fr)
WO (1) WO2020091337A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102497939B1 (ko) * 2020-12-07 2023-02-08 경희대학교 산학협력단 영상 처리를 이용한 물체 인식 시스템 그 동작 방법
KR102462226B1 (ko) 2021-03-04 2022-11-03 인하대학교 산학협력단 신속 적응형 객체 감지를 위한 롤백 딥 러닝 방법
CN113341474A (zh) * 2021-05-31 2021-09-03 上海英曼尼安全装备有限公司 一种辅助判别危险品方法和装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002257751A (ja) * 2001-03-01 2002-09-11 Kawasaki Heavy Ind Ltd 手荷物検査方法および手荷物検査システム
JP2006518039A (ja) * 2003-02-13 2006-08-03 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ オブジェクト検査方法及び装置
WO2010138574A1 (fr) * 2009-05-26 2010-12-02 Rapiscan Security Products, Inc. Systèmes d'inspection tomographique aux rayons x pour l'identification d'articles cibles spécifiques
KR20160034385A (ko) * 2013-07-23 2016-03-29 라피스캔 시스템스, 인코포레이티드 대상물검색의 처리속도개선방법
JP2018112550A (ja) * 2017-01-12 2018-07-19 清華大学Tsinghua University 検査機器および銃器検出方法

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108205643B (zh) * 2016-12-16 2020-05-15 同方威视技术股份有限公司 图像匹配方法和装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002257751A (ja) * 2001-03-01 2002-09-11 Kawasaki Heavy Ind Ltd 手荷物検査方法および手荷物検査システム
JP2006518039A (ja) * 2003-02-13 2006-08-03 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ オブジェクト検査方法及び装置
WO2010138574A1 (fr) * 2009-05-26 2010-12-02 Rapiscan Security Products, Inc. Systèmes d'inspection tomographique aux rayons x pour l'identification d'articles cibles spécifiques
KR20160034385A (ko) * 2013-07-23 2016-03-29 라피스캔 시스템스, 인코포레이티드 대상물검색의 처리속도개선방법
JP2018112550A (ja) * 2017-01-12 2018-07-19 清華大学Tsinghua University 検査機器および銃器検出方法

Also Published As

Publication number Publication date
KR101990123B1 (ko) 2019-06-18

Similar Documents

Publication Publication Date Title
WO2020138803A1 (fr) Dispositif et procédé d'analyse d'image
WO2019132587A1 (fr) Dispositif et procédé d'analyse d'images
WO2020091337A1 (fr) Appareil et procédé d'analyse d'image
WO2020116923A1 (fr) Appareil et procédé d'analyse d'image
WO2020159232A1 (fr) Procédé, appareil, dispositif électronique et support d'informations lisible par ordinateur permettant de rechercher une image
US10674083B2 (en) Automatic mobile photo capture using video analysis
US7403656B2 (en) Method and apparatus for recognition of character string in scene image
WO2015102361A1 (fr) Appareil et procédé d'acquisition d'image pour une reconnaissance de l'iris à l'aide d'une distance de trait facial
WO2019151735A1 (fr) Procédé de gestion d'inspection visuelle et système d'inspection visuelle
WO2020027519A1 (fr) Dispositif de traitement d'image et son procédé de fonctionnement
WO2020116988A1 (fr) Dispositif d'analyse d'images, procédé d'analyse d'images, et support d'enregistrement
WO2022019675A1 (fr) Dispositif et procédé d'analyse de symboles compris dans un plan d'étage d'un site
WO2022139111A1 (fr) Procédé et système de reconnaissance d'objet marin sur la base de données hyperspectrales
WO2022114731A1 (fr) Système de détection de comportement anormal basé sur un apprentissage profond et procédé de détection pour détecter et reconnaître un comportement anormal
WO2022139110A1 (fr) Procédé et dispositif de traitement de données hyperspectrales pour identifier un objet marin
WO2019132592A1 (fr) Dispositif et procédé de traitement d'image
WO2021006482A1 (fr) Appareil et procédé de génération d'image
WO2020085653A1 (fr) Procédé et système de suivi multi-piéton utilisant un fern aléatoire enseignant-élève
WO2020091268A1 (fr) Appareil électronique et procédé de commande associé
WO2020091253A1 (fr) Dispositif électronique et procédé de commande d'un dispositif électronique
WO2022050558A1 (fr) Appareil électronique et son procédé de commande
WO2020222555A1 (fr) Dispositif et procédé d'analyse d'image
WO2012086965A2 (fr) Procédé de rapport, dispositif électronique et support d'enregistrement le réalisant
WO2019088592A1 (fr) Dispositif électronique et procédé de commande de celui-ci
WO2020036468A1 (fr) Procédé d'application d'effet bokeh sur une image et support d'enregistrement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19877852

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 30.08.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19877852

Country of ref document: EP

Kind code of ref document: A1