TWI768323B - Image processing apparatus and image processing method thereof - Google Patents


Info

Publication number
TWI768323B
TWI768323B (application TW109112722A)
Authority
TW
Taiwan
Prior art keywords
image
input
texture
weight
network model
Prior art date
Application number
TW109112722A
Other languages
Chinese (zh)
Other versions
TW202044196A (en)
Inventor
李天
金東炫
朴鎔燮
朴在演
安一埈
李炫承
安泰慶
文永秀
李泰美
Original Assignee
南韓商三星電子股份有限公司
Priority date
Filing date
Publication date
Application filed by 南韓商三星電子股份有限公司
Publication of TW202044196A
Application granted granted Critical
Publication of TWI768323B

Classifications

    • G06N 3/045 — Combinations of networks (G PHYSICS; G06 Computing; G06N Computing arrangements based on specific computational models; G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/08 — Learning methods (G06N 3/02 Neural networks)
    • G06T 7/13 — Edge detection (G06T Image data processing or generation, in general; G06T 7/00 Image analysis; G06T 7/10 Segmentation; Edge detection)
    • G06T 2207/20084 — Artificial neural networks [ANN] (G06T 2207/00 Indexing scheme for image analysis or image enhancement; G06T 2207/20 Special algorithmic details)


Abstract

An image processing apparatus applies an image to a first learning network model to optimize the edges of the image and acquire a first image, applies the image to a second learning network model to optimize the texture of the image and acquire a second image, and applies a first weight to the first image and a second weight to the second image, based on information on the edge areas and the texture areas of the image, to acquire an output image.

Description

Image processing apparatus and image processing method thereof

The present disclosure relates to an image processing apparatus and an image processing method thereof, and more particularly, to an image processing apparatus and an image processing method that enhance the characteristics of an image by using a learning network model.

Spurred by the development of electronic technology, various types of electronic apparatuses have been developed and released. In particular, image processing apparatuses have in recent years come into use in a variety of places (e.g., homes, offices, and public spaces) and continue to evolve.

Recently, high-resolution display panels such as 4K ultra-high-definition (UHD) televisions (TVs) have been introduced and widely released. However, the availability of high-resolution content for reproduction on such high-resolution display panels is limited, and various techniques for generating high-resolution content from low-resolution content are therefore being developed. In particular, there is a growing need for efficient handling of the large number of operations required to produce high-resolution content within limited processing resources.

In addition, artificial intelligence (AI) systems that replicate human-level intelligence have recently been applied in various fields. Unlike a conventional rule-based smart system, an AI system is a system in which a machine learns, judges, and performs processing autonomously. Because an AI system operates iteratively, its recognition rate improves, and it becomes able, for example, to understand user preferences more accurately. Conventional rule-based smart systems are therefore gradually being replaced by deep-learning-based AI systems.

Artificial intelligence technology consists of machine learning (e.g., deep learning) and element technologies that utilize machine learning.

Machine learning refers to algorithmic techniques that autonomously classify and learn the characteristics of input data. Element technology refers to technology that simulates functions of the human brain (e.g., cognition and judgment) by using machine learning algorithms such as deep learning, and spans technical fields including language understanding, visual understanding, inference/prediction, knowledge representation, and operation control.

Attempts have been made to enhance the characteristics of images by applying AI technology in conventional image processing apparatuses. However, given the performance of conventional image processing apparatuses, the amount of processing available for the operations required to generate a high-resolution image is limited, and the processing takes a long time. A technology is therefore needed that enables an image processing apparatus to generate and provide high-resolution images by performing only a small number of operations.

The present disclosure addresses the above need and provides an image processing apparatus, and an image processing method thereof, that acquire a high-resolution image with improved image characteristics by using a plurality of learning network models.

According to an embodiment of the present disclosure, an image processing apparatus for achieving the above object includes: a memory storing computer-readable instructions; and a processor configured to execute the computer-readable instructions to apply an input image as a first input to a first learning network model and acquire from the first learning network model a first image including an enhanced edge optimized based on the edge of the input image, and to apply the input image as a second input to a second learning network model and acquire from the second learning network model a second image including an enhanced texture optimized based on the texture of the input image. The processor identifies an edge area and a texture area included in the image, applies a first weight to the first image and a second weight to the second image based on information on the edge area and the texture area, and acquires an output image optimized from the input image based on the first weight applied to the first image and the second weight applied to the second image.
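The core flow described above — two specialized models whose outputs are blended per region — can be sketched as follows. The function and map names are illustrative, and the normalized-weight formula is an assumption; the patent does not prescribe this exact blend.

```python
import numpy as np

def blend_outputs(first_image, second_image, edge_map, texture_map):
    """Blend the edge-enhanced and texture-enhanced model outputs.

    first_image / second_image: H x W outputs of the two learning
    network models; edge_map / texture_map: per-pixel region
    information in [0, 1]. All names are illustrative, not from the
    patent text.
    """
    total = edge_map + texture_map + 1e-8     # avoid division by zero
    first_weight = edge_map / total           # larger in edge areas
    second_weight = texture_map / total       # larger in texture areas
    return first_weight * first_image + second_weight * second_image
```

In a pure edge region the blend reduces to the first (edge-optimized) image, and in a pure texture region to the second, with a smooth mix in between.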

In addition, a first type of the first learning network model is different from a second type of the second learning network model.

The first learning network model may be one of a deep learning model that optimizes the edge of the input image by using a plurality of layers, or a machine learning model trained to optimize the edge of the input image by using a plurality of pre-learned filters.

Likewise, the second learning network model may be one of a deep learning model that optimizes the texture of the input image by using a plurality of layers, or a machine learning model trained to optimize the texture of the input image by using a plurality of pre-learned filters.

Meanwhile, the processor may acquire the first weight corresponding to the edge area and the second weight corresponding to the texture area based on ratio information between the edge area and the texture area.
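One plausible reading of "ratio information" is the relative share of pixels detected as edge versus texture. A minimal sketch under that assumption (the function and the equal-weight fallback are invented for illustration):

```python
def weights_from_ratio(edge_pixels, texture_pixels):
    """Derive scalar blending weights from the edge/texture area ratio.

    Hypothetical scheme: each weight is the normalized share of its
    region, so an image dominated by edges favors the edge-enhanced
    first image, and vice versa.
    """
    total = edge_pixels + texture_pixels
    if total == 0:                    # no detected regions: equal weights
        return 0.5, 0.5
    first_weight = edge_pixels / total
    second_weight = texture_pixels / total
    return first_weight, second_weight
```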

In addition, the processor may downscale the input image to acquire a downscaled image having a resolution smaller than the resolution of the input image. The first learning network model may then acquire the first image with the enhanced edge by upscaling the downscaled image, and the second learning network model may acquire the second image with the enhanced texture by upscaling the downscaled image.

Further, the processor may acquire, based on the downscaled image, area detection information in which the edge area and the texture area have been identified, and may provide the area detection information and the image to the first learning network model and the second learning network model, respectively.

In addition, the first learning network model may acquire the first image by upscaling the edge area, and the second learning network model may acquire the second image by upscaling the texture area.

In addition, the first image and the second image may be a first residual image and a second residual image, respectively. The processor may apply the first weight to the first residual image based on the edge area and the second weight to the second residual image based on the texture area, and then mix the first residual image, the second residual image, and the input image to acquire the output image.
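The residual-mixing step above can be illustrated as a per-pixel weighted sum. Simple addition of the weighted residuals to the input is assumed here, since the text does not fix the exact mixing operator:

```python
import numpy as np

def mix_residuals(input_image, first_residual, second_residual,
                  first_weight, second_weight):
    """Mix the weighted edge and texture residuals back into the input.

    Assumes mixing is per-pixel addition of the two weighted residual
    images onto the input image (an illustrative choice).
    """
    return (input_image
            + first_weight * first_residual
            + second_weight * second_residual)
```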

Meanwhile, the second learning network model may be a model that stores a plurality of filters corresponding to each of a plurality of image patterns, classifies each of the image blocks included in the image into one of the plurality of image patterns, applies to each image block at least one filter, among the plurality of filters, corresponding to the classified image pattern, and provides the second image.
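A toy version of such a pattern-indexed filter bank might look like the following. The two-pattern taxonomy, the variance-based classifier, and both kernels are invented for illustration; the patent's actual patterns and filters are learned.

```python
import numpy as np

# A toy filter bank: one small kernel per image pattern (assumed shapes).
FILTER_DB = {
    0: np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float),      # identity
    1: np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float),  # sharpen
}

def classify_block(block, threshold=10.0):
    """Classify an image block into a pattern index.

    The real model's pattern taxonomy is not given in the text; here a
    block is called 'flat' (0) or 'detailed' (1) by its variance.
    """
    return 1 if block.var() > threshold else 0

def filter_block(block):
    """Apply the filter matching the block's pattern (valid area only)."""
    kernel = FILTER_DB[classify_block(block)]
    h, w = block.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(block[i:i + 3, j:j + 3] * kernel)
    return out
```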

Here, the processor may accumulate index information of the image pattern into which each of the image blocks is classified, identify the image as one of a natural image or a graphic image based on the index information, and adjust the first weight and the second weight based on the result of identifying the input image as one of the natural image or the graphic image.

Here, the processor may increase at least one of the first weight or the second weight when the input image is identified as the natural image, and decrease at least one of the first weight or the second weight when the input image is identified as the graphic image.
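The index-accumulation and weight-adjustment logic could be sketched as below. The diversity heuristic for telling natural from graphic images, and the threshold and step values, are assumptions for illustration, not the patent's actual criterion:

```python
from collections import Counter

def adjust_weights(pattern_indices, first_weight, second_weight,
                   diversity_threshold=0.5, step=0.1):
    """Adjust blending weights from accumulated pattern index information.

    Heuristic stand-in for the natural/graphic decision: natural images
    tend to spread their blocks over many patterns, while graphic images
    concentrate on a few. Threshold and step values are invented.
    """
    counts = Counter(pattern_indices)
    diversity = len(counts) / max(len(pattern_indices), 1)
    if diversity > diversity_threshold:      # treated as a natural image
        return first_weight + step, second_weight + step
    return first_weight - step, second_weight - step  # graphic image
```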

Meanwhile, an image processing method of an image processing apparatus according to an embodiment of the present disclosure includes the steps of: applying an input image as a first input to a first learning network model; acquiring, from the first learning network model, a first image including an enhanced edge optimized based on the edge of the input image; applying the input image as a second input to a second learning network model; acquiring, from the second learning network model, a second image including an enhanced texture optimized based on the texture of the input image; identifying an edge area of the edge included in the input image; identifying a texture area included in the input image; applying a first weight to the first image based on the edge area; applying a second weight to the second image based on the texture area; and acquiring an output image optimized from the input image based on the first weight applied to the first image and the second weight applied to the second image.

Here, the first learning network model and the second learning network model may be learning network models of types different from each other.

In addition, the first learning network model may be one of a deep learning model that optimizes the edge of the input image by using a plurality of layers, or a machine learning model trained to optimize the edge of the input image by using a plurality of pre-learned filters.

Likewise, the second learning network model may be one of a deep learning model that optimizes the texture of the input image by using a plurality of layers, or a machine learning model trained to optimize the texture of the input image by using a plurality of pre-learned filters.

In addition, the image processing method may include the step of acquiring the first weight and the second weight based on ratio information between the edge area in the input image and the texture area in the input image.

In addition, the image processing method may include the step of downscaling the input image to acquire a downscaled image having a resolution smaller than the resolution of the input image. The first learning network model may acquire the first image by upscaling the downscaled image, and the second learning network model may acquire the second image by upscaling the downscaled image.

Here, the image processing method may include the step of acquiring first area detection information identifying the edge area of the input image and second area detection information identifying the texture area of the input image, and providing the area detection information and the image to the first learning network model and the second learning network model, respectively.

Meanwhile, the first learning network model may acquire the first image by upscaling the edge area, and the second learning network model may acquire the second image by upscaling the texture area.

In addition, the first image and the second image may be a first residual image and a second residual image, respectively.

According to the various embodiments of the present disclosure described above, a high-resolution image is generated by applying learning network models of different types to an image, and the amount of operations required to generate the high-resolution image is reduced; a high-resolution image can therefore be generated within the limited resources of the image processing apparatus and provided to a user.

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32: index information

10: image/output image

10': input image/image

20, 30: output image

30': final output image

100: image processing apparatus/sound output apparatus

110: memory

120, 1200: processor

130: inputter

140: display

150: outputter

160: user interface

810: Laplacian filter

820: gradient vector

830: filter/search

840: application

850: index matrix

860: filter database (DB)

1210: learning part

1220: recognition part

S310, S320, S330, S340, S350, S410, S420, S430, S440, S450, S610, S620, S630, S640, S650, S710, S720, S730, S740, S1310, S1320, S1330: operations

FIG. 1 is a diagram illustrating an implementation example of an image processing apparatus according to an embodiment of the present disclosure.

FIG. 2 is a block diagram illustrating a configuration of an image processing apparatus according to an embodiment of the present disclosure.

FIG. 3 is a diagram illustrating a first learning network model and a second learning network model according to an embodiment of the present disclosure.

FIG. 4 is a diagram illustrating downscaling according to an embodiment of the present disclosure.

FIG. 5 is a diagram illustrating a deep learning model and a machine learning model according to an embodiment of the present disclosure.

FIG. 6 is a diagram illustrating a first learning network model and a second learning network model according to another embodiment of the present disclosure.

FIG. 7 is a diagram illustrating a first learning network model and a second learning network model according to another embodiment of the present disclosure.

FIG. 8 is a diagram schematically illustrating an operation of a second learning network model according to an embodiment of the present disclosure.

FIG. 9 is a diagram illustrating index information according to an embodiment of the present disclosure.

FIG. 10 is a diagram illustrating a method of acquiring a final output image according to an embodiment of the present disclosure.

FIG. 11 is a block diagram illustrating a detailed configuration of the image processing apparatus shown in FIG. 2.

FIG. 12 is a block diagram illustrating a configuration of an image processing apparatus for learning and using a learning network model according to an embodiment of the present disclosure.

FIG. 13 is a flowchart illustrating an image processing method according to an embodiment of the present disclosure.

Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.

As the terms used in the embodiments of the present disclosure, general terms currently in wide use were selected as far as possible in consideration of the functions described herein. However, such terms may vary according to the intent of those skilled in the relevant art or with the emergence of new technologies. In certain cases, a term may be arbitrarily designated, and in such cases its meaning is described in detail in the relevant part of this disclosure. Accordingly, the terms used in this disclosure should be defined based on their meanings and the overall content of this disclosure, and not simply on their names.

In this specification, expressions such as "have," "may have," "include," and "may include" indicate the existence of the corresponding characteristic (e.g., an element such as a numerical value, function, operation, or component), and do not exclude the existence of additional characteristics.

In addition, the expression "at least one of A and/or B" should be interpreted to mean any one of "A" or "B," or "A and B."

The expressions "first," "second," and the like used in this specification may describe various elements regardless of any order and/or degree of importance. Such expressions are used only to distinguish one element from another, and are not intended to limit the elements.

Meanwhile, a description in this disclosure that one element (e.g., a first element) is "(operatively or communicatively) coupled with/to" or "connected to" another element (e.g., a second element) should be interpreted to mean that the one element is coupled to the other element either directly or through yet another element (e.g., a third element).

Unless defined clearly otherwise in context, a singular expression includes a plural expression. In addition, in this disclosure, terms such as "include" and "have" should be construed as designating the existence of the characteristics, numbers, steps, operations, elements, components, or combinations thereof described in the specification, and not as excluding in advance the possible existence or addition of one or more other characteristics, numbers, steps, operations, elements, components, or combinations thereof.

In addition, in this disclosure, a "module" or "unit" performs at least one function or operation and may be implemented as hardware, software, or a combination of hardware and software. Furthermore, a plurality of "modules" or "units" may be integrated into at least one module and implemented as at least one processor, except for a "module" or "unit" that needs to be implemented as specific hardware.

In addition, in this specification, the term "user" may refer to a person who operates an electronic apparatus, or to an apparatus (e.g., an artificial intelligence electronic apparatus).

Hereinafter, embodiments of the present disclosure will be described in greater detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating an implementation example of an image processing apparatus according to an embodiment of the present disclosure.

The image processing apparatus 100 may be implemented as a television (TV) as shown in FIG. 1. However, the image processing apparatus 100 is not limited thereto and may be implemented, without limitation, as any apparatus equipped with an image processing function and/or a display function, such as a smartphone, a tablet personal computer (PC), a laptop PC, a head-mounted display (HMD), a near-eye display (NED), a large-format display (LFD), digital signage, a digital information display (DID), a video wall, a projector display, a camera, a camcorder, or a printer.

The image processing apparatus 100 may receive images of various resolutions or various compressed images. For example, the image processing apparatus 100 may receive an image 10 formatted as any of a standard-definition (SD) image, a high-definition (HD) image, a full-HD image, or an ultra-HD image. In addition, the image processing apparatus 100 may receive the image 10 in an encoded or compressed format such as Moving Picture Experts Group (MPEG) (e.g., MP2, MP4, MP7), Advanced Video Coding (AVC), H.264, or High Efficiency Video Coding (HEVC).

Although the image processing apparatus 100 according to an embodiment of the present disclosure is implemented as a UHD TV, there are many cases in which, owing to the limited availability of UHD content, only images such as standard-definition (SD), high-definition (HD), and full-HD images (hereinafter, low-resolution images) are available. In such cases, a method of upscaling a low-resolution input image into a UHD image (hereinafter, a high-resolution image) and providing the resulting image can be used. As an example, the low-resolution image may be applied as an input to a learning network model so that it is upscaled, and the resulting high-resolution image is then acquired as the output for display on the image processing apparatus 100.

However, upscaling a low-resolution image into a high-resolution image generally requires a large number of complex processing operations to transform the image data, and thus a high-performance, high-complexity image processing apparatus 100 is needed to perform the transformation. As an example, to upscale a 60p SD-class image with a resolution of 820×480 into a high-resolution image, the image processing apparatus 100 must process 820×480×60 pixels per second, which calls for a high-performance processing unit such as a central processing unit (CPU), a graphics processing unit (GPU), or a combination thereof. As another example, to upscale a 60p 4K UHD-class image with a resolution of 3840×2160 into an image with a resolution of 8K, the image processing apparatus 100 must process 3840×2160×60 pixels per second, requiring a processing unit that can handle at least 24 times as many operations as in the SD-class upscaling case.
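The throughput figures quoted above can be checked directly. Note that the stated 820×480 frame yields a ratio of about 21×, while the "24 times" figure matches the standard 720×480 SD frame, so the two numbers in the text are not mutually consistent:

```python
# Per-second pixel throughput implied by the figures in the text.
sd_pixels_per_sec = 820 * 480 * 60        # SD-class 60p input, as stated
uhd_pixels_per_sec = 3840 * 2160 * 60     # 4K UHD-class 60p input

print(sd_pixels_per_sec)                  # 23616000 pixels/s
print(uhd_pixels_per_sec)                 # 497664000 pixels/s

# Ratio using the stated 820x480 frame (~21x) vs. a standard 720x480 frame (24x).
print(round(uhd_pixels_per_sec / sd_pixels_per_sec, 1))   # 21.1
print(uhd_pixels_per_sec / (720 * 480 * 60))              # 24.0
```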

Accordingly, various embodiments are described below that provide an image processing apparatus 100 that reduces the amount of operations required to upscale a lower-resolution image into a higher-resolution image, thereby making the most of the limited resources of the image processing apparatus 100.

In addition, various embodiments are described in which the image processing apparatus 100 acquires an output image while reinforcing or enhancing at least one image characteristic among the various characteristics of the input image.

FIG. 2 is a block diagram illustrating a configuration of an image processing apparatus according to an embodiment of the present disclosure.

As shown in FIG. 2, the image processing apparatus 100 includes a memory 110 and a processor 120.

The memory 110 is electrically connected to the processor 120 and may store data necessary for carrying out the various embodiments of the present disclosure. For example, the memory 110 may be implemented as an internal memory included in the processor 120, such as a read-only memory (ROM) (e.g., an electrically erasable programmable read-only memory (EEPROM)) or a random-access memory (RAM), or as a memory separate from the processor 120.

記憶體110可根據儲存的資料的使用而以以下形式來實施：嵌置於影像處理裝置100中的記憶體的形式、或者被實施成可附接於影像處理裝置100上或自影像處理裝置100拆離的記憶體的形式。舉例而言，在資料用於操作影像處理裝置100的情形中，所述資料可儲存於嵌置在影像處理裝置100中的記憶體中，且在資料用於影像處理裝置100的擴展功能的情形中，所述資料可儲存於可附接於影像處理裝置100上或自影像處理裝置100拆離的記憶體中。在記憶體110被實施成嵌置於影像處理裝置100中的記憶體的情形中，記憶體110可為以下記憶體中的至少一者：揮發性記憶體(例如，動態RAM(dynamic RAM,DRAM)、靜態RAM(static RAM,SRAM)、同步動態RAM(synchronous dynamic RAM,SDRAM)等)或非揮發性記憶體(例如，一次可程式化ROM(one time programmable ROM,OTPROM)、可程式化ROM(programmable ROM,PROM)、可抹除及可程式化ROM(erasable and programmable ROM,EPROM)、電性可抹除及可程式化ROM(electrically erasable and programmable ROM,EEPROM)、遮罩ROM(mask ROM)、快閃ROM(flash ROM)、快閃記憶體(例如，反及快閃(NAND flash)或反或快閃(NOR flash)等)、硬驅動機(hard drive)或固態驅動機(solid state drive,SSD))。 The memory 110 may be implemented, depending on the purpose of the stored data, in the form of a memory embedded in the image processing apparatus 100, or in the form of a memory attachable to and detachable from the image processing apparatus 100. For example, in the case where the data is used for operating the image processing apparatus 100, the data may be stored in the memory embedded in the image processing apparatus 100, and in the case where the data is used for an extended function of the image processing apparatus 100, the data may be stored in the memory attachable to and detachable from the image processing apparatus 100. In the case where the memory 110 is implemented as a memory embedded in the image processing apparatus 100, the memory 110 may be at least one of a volatile memory (e.g., dynamic RAM (DRAM), static RAM (SRAM), synchronous dynamic RAM (SDRAM), etc.) or a non-volatile memory (e.g., one time programmable ROM (OTPROM), programmable ROM (PROM), erasable and programmable ROM (EPROM), electrically erasable and programmable ROM (EEPROM), mask ROM, flash ROM, flash memory (e.g., NAND flash or NOR flash), a hard drive, or a solid state drive (SSD)).

同時，在記憶體110被實施成可附接於影像處理裝置100上或自影像處理裝置100拆離的記憶體的情形中，記憶體110可為記憶卡(例如，緊湊式快閃(compact flash,CF)、安全數位(secure digital,SD)、微型安全數位(micro secure digital,Micro-SD)、迷你安全數位(mini secure digital,Mini-SD)、極限數位(extreme digital,xD)、多媒體卡(multi-media card,MMC)等)、可連接至通用串列匯流排(universal serial bus,USB)埠的外部記憶體(例如，USB記憶體)等。 Meanwhile, in the case where the memory 110 is implemented as a memory attachable to or detachable from the image processing apparatus 100, the memory 110 may be a memory card (e.g., a compact flash (CF), secure digital (SD), micro secure digital (Micro-SD), mini secure digital (Mini-SD), extreme digital (xD), or multi-media card (MMC)), an external memory connectable to a universal serial bus (USB) port (e.g., a USB memory), or the like.

根據本揭露的實施例,記憶體110可儲存用於使指令由處理器120執行的至少一個程式。此處,指令可為用於處理器120以藉由將影像10應用至學習網路來獲取輸出影像的指令。 According to embodiments of the present disclosure, the memory 110 may store at least one program for causing instructions to be executed by the processor 120 . Here, the instructions may be instructions for the processor 120 to obtain an output image by applying the image 10 to the learning network.

根據本揭露的另一實施例,記憶體110可儲存根據本揭露各種實施例的學習網路模型。 According to another embodiment of the present disclosure, the memory 110 may store the learning network model according to various embodiments of the present disclosure.

根據本揭露實施例的學習網路模型是基於人工智慧演算法、基於多個影像進行訓練的判斷模型,且學習網路可為基於神經網路的模型。經訓練的判斷模型可被設計成在電腦上模擬人類智慧及決策,且可包括多個具有權重的網路節點,所述網路節點模擬人類神經網路的神經元。所述多個網路節點中的每一者可形成連接關係,以模擬經由突觸傳送及接收訊號的神經元的突觸活動。 另外,經訓練的判斷模型可包括例如機器學習模型、神經網路模型或自神經網路模型發展的深度學習模型。在深度學習模型中,多個網路節點可位於彼此不同的深度(或層)中且根據捲積連接關係傳送及接收資料。 The learning network model according to the embodiment of the present disclosure is a judgment model based on an artificial intelligence algorithm and trained based on a plurality of images, and the learning network may be a neural network-based model. The trained judgment model can be designed to simulate human intelligence and decision-making on a computer, and can include a number of weighted network nodes that simulate the neurons of a human neural network. Each of the plurality of network nodes may form connections to simulate synaptic activity of neurons that transmit and receive signals through synapses. Additionally, the trained judgment model may include, for example, a machine learning model, a neural network model, or a deep learning model developed from a neural network model. In a deep learning model, multiple network nodes may be located in different depths (or layers) from each other and transmit and receive data according to convolutional connections.

作為實例，學習網路模型可為基於影像進行訓練的捲積神經網路(convolution neural network,CNN)模型。CNN是具有為語音處理、影像處理等而設計的特定連接結構的多層式神經網路。同時，學習網路模型並非僅限於CNN。舉例而言，學習網路模型可被實施成以下模型中的至少一個深度神經網路(deep neural network,DNN)模型：遞歸神經網路(recurrent neural network,RNN)模型、長短期記憶體網路(long short term memory network,LSTM)模型、閘控遞歸單元(gated recurrent unit,GRU)模型或生成對抗網路(generative adversarial network,GAN)模型。 As an example, the learning network model may be a convolutional neural network (CNN) model trained based on images. A CNN is a multi-layer neural network having a special connection structure designed for speech processing, image processing, and the like. Meanwhile, the learning network model is not limited to a CNN. For example, the learning network model may be implemented as a deep neural network (DNN) model of at least one of a recurrent neural network (RNN) model, a long short-term memory network (LSTM) model, a gated recurrent unit (GRU) model, or a generative adversarial network (GAN) model.

舉例而言，學習網路模型可基於超解析度GAN(Super-resolution GAN,SRGAN)將低解析度的影像恢復或轉換至高解析度的影像。同時，根據本揭露實施例的記憶體110可儲存相同種類或不同種類的多個學習網路模型。學習網路模型的數目及類型不受限制。然而，根據本揭露的另一實施例，根據本揭露各種實施例的至少一個學習網路模型可儲存於外部裝置或外部伺服器中的至少一者中。 For example, the learning network model may restore or convert a low-resolution image into a high-resolution image based on a super-resolution GAN (SRGAN). Meanwhile, the memory 110 according to an embodiment of the present disclosure may store a plurality of learning network models of the same type or different types. The number and types of learning network models are not limited. However, according to another embodiment of the present disclosure, at least one learning network model according to the various embodiments of the present disclosure may be stored in at least one of an external device or an external server.

處理器120與記憶體110電性連接且控制影像處理裝置100的總體操作。 The processor 120 is electrically connected to the memory 110 and controls the overall operation of the image processing apparatus 100 .

根據本揭露的實施例，處理器120可被實施成對數位訊號進行處理的數位訊號處理器(digital signal processor,DSP)、微處理器、人工智慧(AI)處理器及時序控制器(timing controller,T-CON)。然而，處理器120並非僅限於此，且處理器120可包括中央處理單元(central processing unit,CPU)、微控制器單元(micro controller unit,MCU)、微處理單元(micro processing unit,MPU)、控制器、應用處理器(application processor,AP)、通訊處理器(communication processor,CP)及高階RISC機器(Advanced RISC Machine,ARM)處理器中的一或多者，或者可由用語定義。另外，處理器120可被實施成其中儲存有處理演算法的系統晶片(system on chip,SoC)或大型積體(large scale integration,LSI)，或以現場可程式化閘陣列(field programmable gate array,FPGA)的形式來實施。 According to an embodiment of the present disclosure, the processor 120 may be implemented as a digital signal processor (DSP) that processes digital signals, a microprocessor, an artificial intelligence (AI) processor, or a timing controller (T-CON). However, the processor 120 is not limited thereto, and the processor 120 may include one or more of a central processing unit (CPU), a micro controller unit (MCU), a micro processing unit (MPU), a controller, an application processor (AP), a communication processor (CP), or an Advanced RISC Machine (ARM) processor, or may be defined by the corresponding term. In addition, the processor 120 may be implemented as a system on chip (SoC) or large scale integration (LSI) in which a processing algorithm is stored, or in the form of a field programmable gate array (FPGA).

處理器120可將影像10作為輸入應用至學習網路模型並獲取具有改善的、增強的、最佳化的或加強的影像特性的影像。此處，影像10的特性可意指根據影像10中所包括的多個畫素的邊緣方向、邊緣強度、紋理、灰值、亮度、反差或伽瑪值(gamma value)中的至少一者。舉例而言，處理器120可將影像應用至學習網路模型並獲取其中邊緣及紋理已得到增強的影像。此處，影像的邊緣可意指其中在空間上相鄰的畫素的值急劇改變的區域。舉例而言，邊緣可為其中影像的亮度自低值急劇改變至高值或自高值急劇改變至低值的區域。影像的紋理可為影像中被視為相同特性的區域的獨特圖案或形狀。同時，影像的紋理亦可由精細邊緣組成，且因此處理器120可獲取其中等於或大於第一臨限強度(或臨限厚度)的邊緣分量及小於第二臨限強度(或臨限厚度)的邊緣分量已得到改善的影像。此處，第一臨限強度可為用於對根據本揭露實施例的邊緣分量進行劃分的值，且第二臨限強度可為用於對根據本揭露實施例的紋理分量進行劃分的值，且第一臨限強度及第二臨限強度可為預定值或基於影像的特性設定的值。然而，在下文中，為便於闡釋，上述特性將被稱為邊緣及紋理。 The processor 120 may apply the image 10 as an input to the learning network model and acquire an image having improved, enhanced, optimized, or strengthened image characteristics. Here, the characteristics of the image 10 may mean at least one of edge direction, edge strength, texture, gray value, brightness, contrast, or gamma value according to the plurality of pixels included in the image 10. For example, the processor 120 may apply the image to the learning network model and acquire an image in which the edges and texture have been enhanced. Here, an edge of an image may mean an area in which the values of spatially adjacent pixels change sharply. For example, an edge may be an area in which the brightness of the image changes sharply from a low value to a high value or from a high value to a low value. The texture of an image may be a unique pattern or shape of an area regarded as having the same characteristics within the image. Meanwhile, the texture of an image may also consist of fine edges, and the processor 120 may thus acquire an image in which the edge components equal to or greater than a first threshold strength (or threshold thickness) and the edge components smaller than a second threshold strength (or threshold thickness) have been improved. Here, the first threshold strength may be a value for dividing out the edge components according to an embodiment of the present disclosure, the second threshold strength may be a value for dividing out the texture components according to an embodiment of the present disclosure, and the first threshold strength and the second threshold strength may be predetermined values or values set based on the characteristics of the image. However, hereinafter, for convenience of explanation, the above characteristics will be referred to as edge and texture.
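As a rough illustration of the two threshold strengths described above, the sketch below separates strong edge components from fine texture-like components by thresholding the gradient magnitude; the gradient-based measure and the threshold values `t1` and `t2` are assumptions for illustration only, not the patent's method:

```python
import numpy as np

def split_edge_texture(img, t1=0.5, t2=0.15):
    """Split an image's gradient content into a strong-edge mask and a fine-texture mask.

    t1 (first threshold strength) and t2 (second threshold strength) are
    hypothetical values; the text leaves them predetermined or image-dependent.
    """
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)                 # gradient magnitude per pixel
    edge_mask = mag >= t1                  # components at/above the first threshold strength
    texture_mask = (mag > 0) & (mag < t2)  # fine components below the second threshold strength
    return edge_mask, texture_mask
```

For a sharp vertical step in brightness, the edge mask fires along the step while the texture mask stays empty; gentle small-amplitude variations would instead populate the texture mask.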

同時,根據本揭露實施例的影像處理裝置100可包括多個學習網路模型。所述多個學習網路模型中的每一者可加強影像10的不同特性。將參照圖3對此進行詳細闡釋。 Meanwhile, the image processing apparatus 100 according to the embodiment of the present disclosure may include a plurality of learning network models. Each of the plurality of learned network models may enhance different properties of the image 10 . This will be explained in detail with reference to FIG. 3 .

圖3是示出根據本揭露實施例的第一學習網路模型及第二學習網路模型的圖。 FIG. 3 is a diagram illustrating a first learning network model and a second learning network model according to an embodiment of the present disclosure.

參照圖3，在操作S310處，根據本揭露實施例的處理器120可將影像10作為輸入應用至第一學習網路模型並獲取其中影像10的邊緣已得到改善的第一影像作為輸出。在操作S320處，亦可將影像10作為輸入供應至第二學習網路模型並獲取其中影像10的紋理已得到改善的第二影像作為輸出。 Referring to FIG. 3, at operation S310, the processor 120 according to an embodiment of the present disclosure may apply the image 10 as an input to the first learning network model and acquire, as an output, a first image in which the edges of the image 10 have been improved. At operation S320, the processor 120 may also supply the image 10 as an input to the second learning network model and acquire, as an output, a second image in which the texture of the image 10 has been improved.

同時，根據本揭露實施例的影像處理裝置100可並列使用基於彼此不同的人工智慧演算法的第一學習網路模型與第二學習網路模型。作為另外一種選擇，影像10可由第一學習網路模型及第二學習網路模型連續處理。此處，第一學習網路模型可為藉由使用除第二學習網路模型的資源之外的更大資源進行訓練的模型。此處，資源可為訓練及/或處理學習網路模型所必要的各種資料且可包括例如是否實行即時學習、學習資料的量、學習網路模型中所包括的捲積層的數目、參數的數目、學習網路模型中所使用的記憶體的容量、學習網路使用GPU的程度等。 Meanwhile, the image processing apparatus 100 according to an embodiment of the present disclosure may use, in parallel, a first learning network model and a second learning network model based on artificial intelligence algorithms different from each other. Alternatively, the image 10 may be processed sequentially by the first learning network model and the second learning network model. Here, the first learning network model may be a model trained by using larger resources than those of the second learning network model. Here, the resources may be various data necessary for training and/or operating a learning network model, and may include, for example, whether real-time learning is performed, the amount of learning data, the number of convolutional layers included in the learning network model, the number of parameters, the capacity of the memory used by the learning network model, the degree to which the learning network uses the GPU, and the like.

舉例而言，影像處理裝置100中提供的GPU可包括紋理單元、特定功能單元(special function unit,SFU)、算術邏輯裝置等。此處，紋理單元是用於向影像10添加材料或紋理的資源，且特定功能單元是用於對例如平方根、倒數及代數函數的複雜運算進行處理的資源。同時，整數算術邏輯單元(integer arithmetic logic unit,ALU)是對浮點、整數運算、比較及資料移動進行處理的資源。幾何單元是對物件的位置或視點、光源的方向等進行計算的資源。光柵單元(raster unit)是將三維資料投射於二維螢幕上的資源。在此種情形中，深度學習模型可較機器學習模型使用GPU中所包括的各種資源來進行學習及操作。同時，影像處理裝置100的資源並非僅限於GPU的資源，且所述資源可為影像處理裝置100中所包括的各種組件的資源，例如記憶體110的儲存區域、電力等。 For example, the GPU provided in the image processing apparatus 100 may include a texture unit, a special function unit (SFU), an arithmetic logic device, and the like. Here, the texture unit is a resource for adding materials or textures to the image 10, and the special function unit is a resource for processing complex operations such as square roots, reciprocals, and algebraic functions. Meanwhile, the integer arithmetic logic unit (ALU) is a resource for processing floating-point and integer operations, comparisons, and data movement. The geometry unit is a resource for calculating the position or viewpoint of an object, the direction of a light source, and the like. The raster unit is a resource for projecting three-dimensional data onto a two-dimensional screen. In this case, the deep learning model may use more of the various resources included in the GPU than the machine learning model for learning and operation. Meanwhile, the resources of the image processing apparatus 100 are not limited to the resources of the GPU, and the resources may be resources of the various components included in the image processing apparatus 100, such as the storage area of the memory 110, power, and the like.

根據本揭露實施例的第一學習網路模型及第二學習網路模型可為不同類型的學習網路模型。 The first learning network model and the second learning network model according to the embodiments of the present disclosure may be different types of learning network models.

作為實例，第一學習網路模型可為基於深度學習的模型或機器學習模型中的一者，基於深度學習的模型學習基於多個影像來改善影像10的邊緣，機器學習模型被訓練成藉由使用多個預學習濾波器來改善影像的邊緣。第二學習網路模型可為深度學習模型或基於機器學習的模型，深度學習模型學習藉由使用多個層來改善影像的紋理，基於機器學習的模型被訓練成藉由使用基於多個影像的預學習資料庫(database,DB)及多個預學習濾波器來改善影像的紋理。此處，預學習DB可為與多個影像圖案中的每一者對應的多個濾波器，且第二學習網路模型可辨識與影像10中所包括的影像區塊對應的影像圖案且藉由使用多個濾波器中與被辨識的圖案對應的濾波器來對影像10的紋理進行最佳化。根據本揭露的實施例，第一學習網路模型可為深度學習模型，且第二學習網路模型可為機器學習模型。 As an example, the first learning network model may be one of a deep learning-based model that learns to improve the edges of the image 10 based on a plurality of images, or a machine learning model trained to improve the edges of an image by using a plurality of pre-learned filters. The second learning network model may be a deep learning model that learns to improve the texture of an image by using a plurality of layers, or a machine learning-based model trained to improve the texture of an image by using a pre-learned database (DB) based on a plurality of images and a plurality of pre-learned filters. Here, the pre-learned DB may be a plurality of filters corresponding to each of a plurality of image patterns, and the second learning network model may identify the image pattern corresponding to an image block included in the image 10 and optimize the texture of the image 10 by using, among the plurality of filters, the filter corresponding to the identified pattern. According to an embodiment of the present disclosure, the first learning network model may be a deep learning model, and the second learning network model may be a machine learning model.
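A minimal sketch of the pattern-matched filter lookup described above. Everything here is a hypothetical stand-in: the number of patterns, the orientation-based pattern classifier, and the placeholder filters; a real pre-learned DB would hold filters trained separately for each pattern:

```python
import numpy as np

# Hypothetical pre-learned DB: one small enhancement filter per quantized pattern index.
NUM_PATTERNS = 8
pattern_db = {i: np.full((3, 3), 1.0 / 9) for i in range(NUM_PATTERNS)}  # placeholder filters

def pattern_index(block: np.ndarray) -> int:
    """Classify a block by its dominant gradient orientation, quantized into NUM_PATTERNS bins
    (an assumed classifier; the patent does not specify how patterns are identified)."""
    gy, gx = np.gradient(block.astype(np.float64))
    angle = np.arctan2(gy.sum(), gx.sum())  # dominant orientation in [-pi, pi]
    return int((angle + np.pi) / (2 * np.pi) * NUM_PATTERNS) % NUM_PATTERNS

def filter_for_block(block: np.ndarray) -> np.ndarray:
    """Look up the pre-learned filter matching the block's pattern."""
    return pattern_db[pattern_index(block)]
```

The per-block filter returned here would then be convolved with the block to optimize its texture.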

機器學習模型包括基於各種資訊及資料輸入方法(例如監督式學習(supervised learning)、無監督式學習(unsupervised learning)及半監督式學習(semi-supervised learning))而預先進行學習的多個預學習濾波器，且在所述多個濾波器中辨識將被應用至影像10的濾波器。 The machine learning model includes a plurality of pre-learned filters learned in advance based on various information and data input methods (e.g., supervised learning, unsupervised learning, and semi-supervised learning), and identifies, among the plurality of filters, the filter to be applied to the image 10.

深度學習模型是基於大量資料實行學習的模型且包括位於輸入層與輸出層之間的多個隱藏層。因此，深度學習模型可能需要較機器學習模型的資源多的影像處理裝置100的附加資源來實行學習及操作。 The deep learning model is a model that performs learning based on a large amount of data and includes a plurality of hidden layers between the input layer and the output layer. Therefore, the deep learning model may require more additional resources of the image processing apparatus 100 than the machine learning model to perform learning and operation.

作為另一實例，第一學習網路模型及第二學習網路模型可為基於相同的人工智慧演算法的模型，但具有不同的大小或配置。舉例而言，第二學習網路模型可為具有較第一學習網路模型的大小小的大小的低複雜度模型。此處，學習網路模型的大小及複雜度可與構成模型的捲積層的數目及參數的數目成比例關係。另外，根據本揭露的實施例，第二學習網路模型可為深度學習模型，且第一學習網路模型可為使用較第二學習網路模型的捲積層少的捲積層的深度學習模型。 As another example, the first learning network model and the second learning network model may be models based on the same artificial intelligence algorithm but having different sizes or configurations. For example, the second learning network model may be a low-complexity model having a size smaller than that of the first learning network model. Here, the size and complexity of a learning network model may be proportional to the number of convolutional layers and the number of parameters constituting the model. In addition, according to an embodiment of the present disclosure, the second learning network model may be a deep learning model, and the first learning network model may be a deep learning model using fewer convolutional layers than the second learning network model.

作為又一實例,第一學習網路模型及第二學習網路模型中的每一者可為機器學習模型。舉例而言,第二學習網路模型可為具有較第一學習網路模型的大小小的大小的低複雜度機器學習模型。 As yet another example, each of the first learning network model and the second learning network model may be a machine learning model. For example, the second learning network model may be a low-complexity machine learning model having a size smaller than that of the first learning network model.

同時，已基於假設第一學習網路模型是藉由使用較第二學習網路模型的資源多的資源進行訓練的模型來闡釋了本揭露的各種實施例，但此僅為實例且本揭露並非僅限於此。舉例而言，第一學習網路模型與第二學習網路模型可為具有相同或相似複雜度的模型，且第二學習網路模型可為藉由使用較第一學習網路模型的資源多的資源進行訓練的模型。 Meanwhile, the various embodiments of the present disclosure have been explained based on the assumption that the first learning network model is a model trained by using more resources than the second learning network model, but this is only an example, and the present disclosure is not limited thereto. For example, the first learning network model and the second learning network model may be models having the same or similar complexity, and the second learning network model may be a model trained by using more resources than the first learning network model.

在操作S330處，根據本揭露實施例的處理器120可辨識影像10中所包括的邊緣區域及紋理區域。接著，在操作S340處，處理器120可基於關於邊緣區域及紋理區域的資訊而將第一權重應用至第一影像且將第二權重應用至第二影像。作為實例，處理器120可基於關於影像10中所包括的邊緣區域與紋理區域的比例的資訊來獲取與邊緣區域對應的第一權重及與紋理區域對應的第二權重。舉例而言，若根據比例存在較紋理區域多的邊緣區域，則處理器120可對其中邊緣區域已得到改善的第一影像應用較其中紋理已得到改善的第二影像大的權重。作為另一實例，若根據比例存在較邊緣區域多的紋理區域，則處理器120可對其中紋理已得到改善的第二影像應用較其中邊緣已得到改善的第一影像大的權重。接著，處理器120可基於已被應用第一權重的第一影像及已被應用第二權重的第二影像來獲取輸出影像10。 At operation S330, the processor 120 according to an embodiment of the present disclosure may identify the edge areas and texture areas included in the image 10. Then, at operation S340, the processor 120 may apply a first weight to the first image and a second weight to the second image based on information about the edge areas and the texture areas. As an example, the processor 120 may acquire the first weight corresponding to the edge areas and the second weight corresponding to the texture areas based on information about the proportion of the edge areas and the texture areas included in the image 10. For example, if there are more edge areas than texture areas according to the proportion, the processor 120 may apply a larger weight to the first image in which the edges have been improved than to the second image in which the texture has been improved. As another example, if there are more texture areas than edge areas according to the proportion, the processor 120 may apply a larger weight to the second image in which the texture has been improved than to the first image in which the edges have been improved. Then, the processor 120 may acquire the output image based on the first image to which the first weight has been applied and the second image to which the second weight has been applied.
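The proportional weighting described above can be sketched as follows; the linear mapping from area proportions to weights is an assumption for illustration, since the text only requires that the dominant region type receive the larger weight:

```python
def edge_texture_weights(edge_pixels: int, texture_pixels: int):
    """Derive the first weight (for the edge-improved image) and the second weight
    (for the texture-improved image) from the edge/texture area proportions."""
    total = edge_pixels + texture_pixels
    if total == 0:
        return 0.5, 0.5          # no detected regions: weight both images equally
    a = edge_pixels / total      # first weight, larger when edge areas dominate
    b = texture_pixels / total   # second weight, larger when texture areas dominate
    return a, b
```

An image dominated by edges, say 300 edge pixels versus 100 texture pixels, yields weights (0.75, 0.25), so the edge-improved first image contributes more to the output.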

作為又一實例，自第一學習網路模型及第二學習網路模型獲取的第一影像及第二影像可為殘留影像。此處，殘留影像可為除原始影像之外僅包括殘留資訊的影像。作為實例，第一學習網路模型可辨識影像10中的邊緣區域且對被辨識的邊緣區域進行最佳化並獲取第一影像。第二學習網路模型可辨識影像10中的紋理區域且對被識別的紋理區域進行最佳化並獲取第二影像。 As yet another example, the first image and the second image acquired from the first learning network model and the second learning network model may be residual images. Here, a residual image may be an image that includes only residual information excluding the original image. As an example, the first learning network model may identify the edge areas in the image 10, optimize the identified edge areas, and acquire the first image. The second learning network model may identify the texture areas in the image 10, optimize the identified texture areas, and acquire the second image.

接著，處理器120可將影像10與第一影像及第二影像進行混合並獲取輸出影像20。此處，混合可為將第一影像及第二影像中的每一者的對應的畫素值添加至影像10中所包括的每一畫素的值的處理。在此種情形中，由於第一影像及第二影像，輸出影像20可為具有已得到增強的邊緣及紋理的影像。 Then, the processor 120 may blend the image 10 with the first image and the second image and acquire the output image 20. Here, the blending may be a process of adding the corresponding pixel value of each of the first image and the second image to the value of each pixel included in the image 10. In this case, the output image 20 may be an image whose edges and texture have been enhanced by the first image and the second image.

根據本揭露實施例的處理器120可將第一權重及第二權重分別應用至第一影像及第二影像，且接著將所述影像與影像10進行混合，且因此獲取輸出影像20。 The processor 120 according to an embodiment of the present disclosure may apply the first weight and the second weight to the first image and the second image, respectively, and then blend those images with the image 10, thereby acquiring the output image 20.

作為另一實例，處理器120可將影像10劃分成多個區域。接著，處理器120可辨識所述多個區域中的每一者的邊緣區域與紋理區域的比例。對於所述多個區域中邊緣區域的比例高的第一區域，處理器120可將第一權重設定成較第二權重大的值。另外，對於所述多個區域中紋理區域的比例高的第二區域，處理器120可將第二權重設定成較第一權重大的值。 As another example, the processor 120 may divide the image 10 into a plurality of areas. Then, the processor 120 may identify the ratio of edge areas to texture areas in each of the plurality of areas. For a first area, among the plurality of areas, in which the proportion of edge areas is high, the processor 120 may set the first weight to a value larger than the second weight. In addition, for a second area, among the plurality of areas, in which the proportion of texture areas is high, the processor 120 may set the second weight to a value larger than the first weight.

接著，在操作S340處，處理器120可將已被應用權重的第一影像及第二影像與影像10進行混合並獲取輸出影像。影像10及與影像10對應的輸出影像20可被表達為以下方程式1。 Next, at operation S340, the processor 120 may blend the first image and the second image, to which the weights have been applied, with the image 10 and acquire the output image. The relationship between the image 10 and the output image 20 corresponding to the image 10 may be expressed as Equation 1 below.

[方程式1] [Equation 1] Y_res = Y_img + a*Network_Model1(Y_img) + b*Network_Model2(Y_img)

此處，Y_img意指影像10，Network_Model1(Y_img)意指第一影像，Network_Model2(Y_img)意指第二影像，「a」意指與第一影像對應的第一權重，且「b」意指與第二影像對應的第二權重。 Here, Y_img denotes the image 10, Network_Model1(Y_img) denotes the first image, Network_Model2(Y_img) denotes the second image, "a" denotes the first weight corresponding to the first image, and "b" denotes the second weight corresponding to the second image.
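Equation 1 can be sketched directly in code; the residual arrays below are hypothetical stand-ins for the outputs of the two network models:

```python
import numpy as np

def blend(y_img, first_residual, second_residual, a, b):
    """Equation 1: Y_res = Y_img + a*Network_Model1(Y_img) + b*Network_Model2(Y_img),
    where the residual images hold only the per-pixel enhancement deltas."""
    return y_img + a * first_residual + b * second_residual

y = np.array([[10.0, 20.0], [30.0, 40.0]])
edge_res = np.array([[1.0, 0.0], [0.0, 1.0]])  # stand-in edge residual (first image)
tex_res = np.array([[0.0, 2.0], [2.0, 0.0]])   # stand-in texture residual (second image)
out = blend(y, edge_res, tex_res, a=0.6, b=0.4)
```

With a=0.6 and b=0.4, each output pixel is the original pixel plus the weighted residual contributions.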

同時，作為又一實例，處理器120可將影像10作為輸入應用至第三學習網路模型並獲取用於應用至第一影像的第一權重及應用至第二影像的第二權重。舉例而言，第三學習網路模型可被訓練成辨識影像10中所包括的邊緣區域及紋理區域且基於被辨識的邊緣區域與紋理區域的比例、影像10的特性等輸出對邊緣區域進行加強的第一權重及對紋理區域進行加強的第二權重。 Meanwhile, as yet another example, the processor 120 may apply the image 10 as an input to a third learning network model and acquire the first weight to be applied to the first image and the second weight to be applied to the second image. For example, the third learning network model may be trained to identify the edge areas and texture areas included in the image 10 and to output, based on the ratio of the identified edge areas to the texture areas, the characteristics of the image 10, and the like, a first weight that strengthens the edge areas and a second weight that strengthens the texture areas.

圖4是示出根據本揭露實施例的按比例縮小的圖。 FIG. 4 is a diagram illustrating downscaling according to an embodiment of the present disclosure.

參照圖4，在操作S410處，處理器120可辨識輸入影像10’中的邊緣區域及紋理區域並獲取與邊緣區域對應的第一權重及與紋理區域對應的第二權重。作為實例，處理器120可將導引濾波器(guided filter)應用至輸入影像10’且辨識邊緣區域及紋理區域。導引濾波器可為用於將影像10劃分成基礎層及細節層的濾波器。處理器120可基於基礎層辨識邊緣區域且基於細節層辨識紋理區域。 Referring to FIG. 4, at operation S410, the processor 120 may identify the edge areas and texture areas in the input image 10' and acquire a first weight corresponding to the edge areas and a second weight corresponding to the texture areas. As an example, the processor 120 may apply a guided filter to the input image 10' and identify the edge areas and the texture areas. The guided filter may be a filter for dividing the image 10 into a base layer and a detail layer. The processor 120 may identify the edge areas based on the base layer and identify the texture areas based on the detail layer.
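A minimal sketch of the base/detail decomposition described above. A plain box filter stands in for the guided filter here (a true guided filter is edge-preserving and uses a guidance image), so this only illustrates how a base layer and a detail layer are obtained:

```python
import numpy as np

def base_detail_split(img, radius=1):
    """Split an image into a base layer (local average) and a detail layer (img - base).

    The box filter is a simplifying stand-in for the guided filter, chosen only
    to show the decomposition; it is not edge-preserving.
    """
    k = 2 * radius + 1
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    h, w = img.shape
    base = np.zeros((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            base[i, j] = padded[i:i + k, j:j + k].mean()  # local average around (i, j)
    detail = img - base                                    # what the smoothing removed
    return base, detail
```

Edge areas would then be identified from large variations in the base layer, and texture areas from the residual energy in the detail layer.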

接著,在操作S420處,處理器120可按比例縮小輸入影像10’並獲取解析度較輸入影像10’的解析度小的影像10。作為實例,處理器120可對輸入影像10’應用子採樣並將輸入影像10’的解析度按比例縮小至目標解析度。此處,目標解析度可為較輸入影像10’的解析度低的低解析度。舉例而言,目標解析度可為與輸入影像10’對應的原始影像的解析度。此處,原始影像的解析度可藉由解析度估測程式來估測,或者基於與輸入影像10’一同接收的附加資訊來辨識,但原始影像的解析度及辨識並非僅限於此。同時,處理器120可應用除子採樣之外的各種已知的按比例縮小方法,且因此獲取與輸入影像10’對應的影像10。 Next, at operation S420, the processor 120 may scale down the input image 10' and acquire the image 10 with a resolution smaller than that of the input image 10'. As an example, processor 120 may apply subsampling to input image 10' and scale down the resolution of input image 10' to a target resolution. Here, the target resolution may be a low resolution lower than the resolution of the input image 10'. For example, the target resolution may be the resolution of the original image corresponding to the input image 10'. Here, the resolution of the original image may be estimated by a resolution estimation program, or identified based on additional information received with the input image 10', but the resolution and identification of the original image are not limited thereto. Meanwhile, the processor 120 may apply various known downscaling methods other than subsampling, and thus acquire the image 10 corresponding to the input image 10'.
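Subsampling as described above can be sketched in one line; production downscalers typically low-pass filter first to limit aliasing, which this sketch omits:

```python
import numpy as np

def subsample(img: np.ndarray, factor: int = 2) -> np.ndarray:
    """Downscale by keeping every `factor`-th pixel along each axis (plain subsampling)."""
    return img[::factor, ::factor]

# A 4K-sized array (2160x3840) subsampled by 2 becomes a 2K-sized array (1080x1920).
img_4k = np.zeros((2160, 3840))
img_2k = subsample(img_4k, 2)
```

The target factor would be chosen so that the result matches the estimated resolution of the original image.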

作為實例，若輸入影像10’是解析度為4K的UHD影像，則為將輸入影像10’作為輸入應用至第一學習網路模型及第二學習網路模型並獲取輸出影像20，需要較將解析度為820×480的SD影像應用至第一學習網路模型及第二學習網路模型的情形大至少5.33倍(3840/820)的列緩衝器記憶體(line buffer memory)。另外，存在以下問題：隨著第一學習網路模型獲取第一影像所需的操作量的增大，儲存第一學習網路模型中所包括的多個隱藏層中的每一者的中間操作結果的記憶體110的空間以及所需的CPU/GPU的效能以指數方式增大。 As an example, if the input image 10' is a UHD image with a resolution of 4K, then in order to apply the input image 10' as an input to the first learning network model and the second learning network model and acquire the output image 20, a line buffer memory at least 5.33 times (3840/820) larger is required than in the case of applying an SD image with a resolution of 820×480 to the first learning network model and the second learning network model. In addition, there is a problem that, as the amount of operations required for the first learning network model to acquire the first image increases, the space of the memory 110 for storing the intermediate operation results of each of the plurality of hidden layers included in the first learning network model, as well as the required CPU/GPU performance, increases exponentially.

因此，根據本揭露實施例的處理器120可將按比例縮小的輸入影像10應用至第一學習網路模型及第二學習網路模型，以減小第一學習網路模型及第二學習網路模型中所需的操作量、記憶體110的儲存空間等。 Therefore, the processor 120 according to an embodiment of the present disclosure may apply the scaled-down input image 10 to the first learning network model and the second learning network model, so as to reduce the amount of operations required in the first learning network model and the second learning network model, the storage space of the memory 110, and the like.

在操作S430處，當輸入按比例縮小的影像10時，根據本揭露實施例的第一學習網路模型可實行對與輸入影像10中所包括的邊緣對應的高頻分量進行增強的按比例放大並獲取高解析度的第一影像。同時，在操作S440處，第二學習網路模型可實行對與影像10中所包括的紋理對應的高頻分量進行增強的按比例放大並獲取高解析度的第二影像。此處，第一影像及第二影像的解析度可與輸入影像10’的解析度相同。舉例而言，若輸入影像10是4K解析度的影像且按比例縮小的影像10是2K解析度的影像，則第一學習網路模型及第二學習網路模型可對影像10實行按比例放大並獲取4K解析度的影像作為影像10的輸出。 At operation S430, when the scaled-down image 10 is input, the first learning network model according to an embodiment of the present disclosure may perform upscaling that enhances the high-frequency components corresponding to the edges included in the input image 10, and acquire a high-resolution first image. Meanwhile, at operation S440, the second learning network model may perform upscaling that enhances the high-frequency components corresponding to the texture included in the image 10, and acquire a high-resolution second image. Here, the resolutions of the first image and the second image may be the same as the resolution of the input image 10'. For example, if the input image 10' is a 4K-resolution image and the scaled-down image 10 is a 2K-resolution image, the first learning network model and the second learning network model may upscale the image 10 and acquire 4K-resolution images as the outputs.

在操作S450處，根據本揭露實施例的處理器120可將按比例放大的第一影像及第二影像與輸入影像10’進行混合並獲取其中輸入影像10’中的邊緣及紋理已得到增強的高解析度的輸出影像20。根據圖4中所示的實施例，獲取輸入影像10’及與輸入影像10’對應的輸出影像20的過程可被表達為以下方程式2。 At operation S450, the processor 120 according to an embodiment of the present disclosure may blend the upscaled first image and second image with the input image 10' and acquire a high-resolution output image 20 in which the edges and texture of the input image 10' have been enhanced. According to the embodiment illustrated in FIG. 4, the process of acquiring, from the input image 10', the output image 20 corresponding to the input image 10' may be expressed as Equation 2 below.

[方程式2] [Equation 2] Y_res = Y_org + a*Network_Model1(DownScaling(Y_org)) + b*Network_Model2(DownScaling(Y_org))

此處，Y_org意指輸入影像10’，DownScaling(Y_org)意指影像10，Network_Model1(DownScaling(Y_org))意指第一影像，Network_Model2(DownScaling(Y_org))意指第二影像，「a」意指與第一影像對應的第一權重，且「b」意指與第二影像對應的第二權重。 Here, Y_org denotes the input image 10', DownScaling(Y_org) denotes the image 10, Network_Model1(DownScaling(Y_org)) denotes the first image, Network_Model2(DownScaling(Y_org)) denotes the second image, "a" denotes the first weight corresponding to the first image, and "b" denotes the second weight corresponding to the second image.
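Putting Equation 2 together as a sketch. The two model stubs below are hypothetical placeholders that upscale by pixel repetition and return small residuals; real learning network models would produce learned edge- and texture-enhancing residuals:

```python
import numpy as np

def network_model1(y):
    """Hypothetical edge-enhancing model stub: upscale 2x and return a small residual."""
    return np.repeat(np.repeat(y, 2, axis=0), 2, axis=1) * 0.1

def network_model2(y):
    """Hypothetical texture-enhancing model stub: upscale 2x and return a small residual."""
    return np.repeat(np.repeat(y, 2, axis=0), 2, axis=1) * 0.05

def equation2(y_org, a=0.6, b=0.4):
    """Equation 2: Y_res = Y_org + a*Model1(DownScaling(Y_org)) + b*Model2(DownScaling(Y_org))."""
    y_small = y_org[::2, ::2]  # DownScaling(Y_org) by subsampling
    return y_org + a * network_model1(y_small) + b * network_model2(y_small)

y_org = np.ones((4, 4))
y_res = equation2(y_org)
```

The models operate on the smaller image, so their operation counts and line-buffer needs shrink, while the blended output keeps the input image's resolution.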

圖5是示出根據本揭露實施例的深度學習模型及機器學習模型的圖。 FIG. 5 is a diagram illustrating a deep learning model and a machine learning model according to an embodiment of the present disclosure.

參照圖5，如上所述，第一學習網路模型可為學習藉由使用多個層來加強影像10的邊緣的深度學習模型，且第二學習網路模型可為被訓練成藉由使用多個預學習濾波器來加強影像10的紋理的機器學習模型。 Referring to FIG. 5, as described above, the first learning network model may be a deep learning model that learns to strengthen the edges of the image 10 by using a plurality of layers, and the second learning network model may be a machine learning model trained to strengthen the texture of the image 10 by using a plurality of pre-learned filters.

根據本揭露的實施例，可以其中重複兩個捲積層及一個集用層(pooling layer)的配置將深度學習模型建模成共包括十個或更多個層的深度結構。另外，深度學習模型可藉由使用各種類型的激活函數(例如恆等函數(Identity Function)、邏輯S形函數(Logistic Sigmoid Function)、雙曲正切(Hyperbolic Tangent,tanh)函數、線性整流(rectified linear unit,ReLU)函數、漏失ReLU函數(Leaky ReLU Function)等)來實行操作。另外，在實行捲積的過程中，深度學習模型可藉由實行填補、跨步等來不同地調整大小。此處，填補意指在接收到的輸入值周圍填充入與預定大小一般大的特定值(例如，畫素值)。跨步意指當實行捲積時加權矩陣的移位間隔。舉例而言，若跨步=3，則學習網路模型可在一次性將權重矩陣移位與三個空間一般多時對輸入值實行捲積。 According to an embodiment of the present disclosure, the deep learning model may be modeled as a deep structure including ten or more layers in total, in a configuration in which two convolutional layers and one pooling layer are repeated. In addition, the deep learning model may perform operations by using various types of activation functions (e.g., the identity function, the logistic sigmoid function, the hyperbolic tangent (tanh) function, the rectified linear unit (ReLU) function, the leaky ReLU function, etc.). In addition, in the process of performing convolution, the deep learning model may variously adjust the size by performing padding, striding, and the like. Here, padding means filling a specific value (e.g., a pixel value) around the received input values, as much as a predetermined size. Stride means the shift interval of the weight matrix when performing convolution. For example, if stride = 3, the learning network model may perform convolution on the input values while shifting the weight matrix by three positions at a time.
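The effect of padding and stride on a convolution's spatial size follows the standard output-size formula; a small sketch (the 32-pixel input size is an arbitrary example, not taken from the text):

```python
def conv_output_size(n: int, k: int, padding: int = 0, stride: int = 1) -> int:
    """Spatial output size of a convolution over an n-pixel axis with a k-pixel kernel:
    floor((n + 2*padding - k) / stride) + 1."""
    return (n + 2 * padding - k) // stride + 1

# Padding of 1 with a 3-pixel kernel and stride 1 preserves the input size.
same = conv_output_size(32, 3, padding=1, stride=1)
# Stride 3 shifts the weight matrix three positions at a time, shrinking the output.
strided = conv_output_size(32, 3, padding=1, stride=3)
```

This is why a model can keep, shrink, or otherwise adjust the feature-map size by choosing padding and stride per layer.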

According to an embodiment of the present disclosure, the deep learning model may learn to optimize one characteristic of the image 10 to which users are highly sensitive, and the machine learning model may optimize at least one of the remaining characteristics by using a plurality of pre-learned filters. For example, a case may be assumed in which there is a close relationship between the transparency of the edge regions (e.g., edge direction, edge intensity) and the transparency of the image 10 perceived by the user. The image processing apparatus 100 may enhance the edges of the image 10 by using the deep learning model and, as an example of the remaining characteristics, enhance the texture by using the machine learning model. Since the deep learning model learns based on a larger amount of data than the machine learning model and performs iterative operations, it can be assumed that the processing result of the deep learning model is superior to that of the machine learning model. However, the present disclosure is not necessarily limited thereto, and both the first learning network model and the second learning network model may be implemented as deep learning-based models or as machine learning-based models. As another example, the first learning network model may be implemented as a machine learning-based model and the second learning network model may be implemented as a deep learning-based model.

In addition, although the various embodiments of the present disclosure are explained based on the assumption that the first learning network model enhances edges and the second learning network model enhances texture, the specific operations of the learning network models are not limited thereto. For example, a case may be assumed in which the degree of processing of noise in the image 10 has the closest relationship with the transparency of the image 10 perceived by the user. In this case, the image processing apparatus 100 may perform image processing on the noise of the image 10 by using the deep learning model and, as an example of the remaining image characteristics, enhance the texture by using the machine learning model. As another example, if the degree of processing of the brightness of the image 10 has the closest relationship with the transparency of the image 10 perceived by the user, the image processing apparatus 100 may perform image processing on the brightness of the image 10 by using the deep learning model and, as an example of the remaining image characteristics, filter noise by using the machine learning model.

FIG. 6 is a diagram illustrating a first learning network model and a second learning network model according to an embodiment of the present disclosure.

The processor 120 according to an embodiment of the present disclosure may scale down the input image 10' and acquire an image 10 of relatively lower resolution at operation S610, and acquire, at operation S620, region detection information in which the edge regions and texture regions have been identified based on the scaled-down image 10. According to the embodiment illustrated in FIG. 5, the processor 120 identifies the edge regions and texture regions included in the input image 10' at the resolution of the original image. Referring to FIG. 6, the processor 120 may identify the edge regions and texture regions included in the image 10 in which the resolution of the input image 10' has been scaled down to a target resolution.

Then, the processor 120 according to an embodiment of the present disclosure may provide the region detection information and the image 10 to each of the first learning network model and the second learning network model.

At operation S630, the first learning network model according to an embodiment of the present disclosure may perform upscaling that enhances only the edge regions of the image 10 based on the region detection information. At operation S640, the second learning network model may perform upscaling that enhances only the texture regions of the image 10 based on the region detection information.

As another example, the processor 120 may provide, to a learning network model, an image including only some of the pixel information included in the image 10 based on the region detection information. Since the processor 120 provides only some of the information included in the image 10, rather than the entire image 10, to the learning network model, the amount of operations performed by the learning network model can be reduced. For example, based on the region detection information, the processor 120 may provide an image including only the pixel information corresponding to the edge regions to the first learning network model, and provide an image including only the pixel information corresponding to the texture regions to the second learning network model.

Then, the first learning network model may upscale the edge regions and acquire the first image, and the second learning network model may upscale the texture regions and acquire the second image.

Next, at operation S650, the processor 120 may add the first image and the second image to the input image 10' and acquire the output image 20.
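A minimal sketch of the combination at operation S650, under the assumption that the first and second images are residual-style enhancement maps of the same size as the input image and that pixels are 8-bit values. All names and values are illustrative, not the disclosure's implementation.

```python
# Sketch of operation S650: add the edge-enhancement map (first image)
# and texture-enhancement map (second image) onto the input image, and
# clip to the valid 8-bit pixel range. Images are flat pixel lists here
# for brevity.

def combine(base, edge_residual, texture_residual):
    """Add the edge and texture enhancement maps onto the base image."""
    out = []
    for b, e, t in zip(base, edge_residual, texture_residual):
        out.append(max(0, min(255, b + e + t)))
    return out

base    = [100, 120, 140]   # input image 10'
edges   = [10, 0, -5]       # from the first learning network model
texture = [0, 200, 3]       # from the second learning network model
result = combine(base, edges, texture)
```

The clipping step is a design choice of this sketch: without it, large enhancement values would overflow the displayable range.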

FIG. 7 is a diagram illustrating a first learning network model and a second learning network model according to another embodiment of the present disclosure.

Referring to FIG. 7, at operation S710, the processor 120 according to an embodiment of the present disclosure may apply the input image 10' as an input to the first learning network model and acquire the first image. As an example, since the first learning network model performs, at operation S710, upscaling that enhances the high-frequency components corresponding to the edges included in the input image 10', the processor 120 may acquire the first image of high resolution. Here, the first image may be a residual image. A residual image may be an image that includes only residual information, as distinct from the original image. The residual information may indicate the difference between each pixel or pixel group of the original image and the high-resolution image.

In addition, at operation S720, the processor 120 according to an embodiment of the present disclosure may apply the input image 10' as an input to the second learning network model and acquire the second image. As an example, since the second learning network model performs upscaling that enhances the high-frequency components corresponding to the texture included in the input image 10', the processor 120 may acquire the second image of high resolution. Here, the second image may be a residual image. The residual information may indicate the difference between each pixel or pixel group of the original image and the high-resolution image.

According to an embodiment of the present disclosure, the first learning network model and the second learning network model each perform upscaling that enhances at least one of the characteristics of the input image 10', and thus the first image and the second image have a higher resolution than the input image 10'. For example, if the resolution of the input image 10' is 2K, the resolution of the first image and the second image may be 4K, and if the resolution of the input image 10' is 4K, the resolution of the first image and the second image may be 8K.

At operation S730, the processor 120 according to an embodiment of the present disclosure may upscale the input image 10' and acquire a third image. According to an embodiment of the present disclosure, the image processing apparatus 100 may include a separate processor that upscales the input image 10', and the processor 120 may upscale the input image 10' and acquire the third image of high resolution. For example, the processor 120 may perform upscaling on the input image 10' by using bilinear interpolation, bicubic interpolation, cubic spline interpolation, Lanczos interpolation, edge directed interpolation (EDI), or the like. Meanwhile, this is merely an example, and the processor 120 may upscale the input image 10' based on various upscaling (or super-resolution) methods.
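As one concrete member of the interpolation family named above, the following is a compact bilinear upscaler. This is a pure-Python illustrative sketch; an actual implementation would use an optimized library resize, and the 2x2 source image is invented for demonstration.

```python
# Bilinear upscaling sketch: each output pixel is mapped back into
# source coordinates and interpolated from its four nearest neighbours.

def bilinear_upscale(img, new_h, new_w):
    h, w = len(img), len(img[0])
    out = [[0.0] * new_w for _ in range(new_h)]
    for y in range(new_h):
        for x in range(new_w):
            # Map the output pixel back into source coordinates.
            sy = y * (h - 1) / (new_h - 1) if new_h > 1 else 0
            sx = x * (w - 1) / (new_w - 1) if new_w > 1 else 0
            y0, x0 = int(sy), int(sx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            fy, fx = sy - y0, sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bot * fy
    return out

small = [[0, 100],
         [100, 200]]
big = bilinear_upscale(small, 3, 3)   # centre pixel interpolates to 100.0
```

Bicubic and Lanczos interpolation follow the same mapping idea but weight a larger neighbourhood, which preserves more high-frequency detail at higher cost.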

作為另一實例,處理器120可將輸入影像10’作為輸入應用至第三學習網路模型並獲取與輸入影像10’對應的高解析度的第三影像。此處,第三學習網路模型可為基於深度學習的模型或基於機器學習的模型。根據本揭露的實施例,若輸入影像10’的解析度為4K,則第三影像的解析度可為8K。另外,根據本揭露的實施例,第一影像至第三影像的解析度可相同。 As another example, the processor 120 may apply the input image 10' as an input to a third learning network model and obtain a high-resolution third image corresponding to the input image 10'. Here, the third learning network model may be a deep learning based model or a machine learning based model. According to the embodiment of the present disclosure, if the resolution of the input image 10' is 4K, the resolution of the third image may be 8K. In addition, according to the embodiment of the present disclosure, the resolutions of the first image to the third image may be the same.

Then, at operation S740, the processor 120 may mix the first to third images and acquire the output image 20.

The processor 120 according to an embodiment of the present disclosure may mix the first residual image, the second residual image, and the third residual image and acquire the output image, wherein the first residual image upscales the input image 10' by enhancing the edges in the input image 10', the second residual image upscales the input image 10' by enhancing the texture in the input image 10', and the third residual image upscales the input image 10'. Here, the processor 120 may identify the edge regions in the input image 10', apply the identified edge regions to the first learning network model, enhance the edge regions, and thereby acquire the upscaled first residual image. In addition, the processor 120 may identify the texture regions in the input image 10', apply the identified texture regions to the second learning network model, enhance the texture regions, and thereby acquire the upscaled second residual image. Meanwhile, this is merely an example, and the configuration and operation are not limited thereto. For example, the processor 120 may apply the input image 10' to the first learning network model and the second learning network model. Then, the first learning network model may identify the edge regions based on the edge characteristic among the various image characteristics of the input image 10', enhance the identified edge regions, and thereby acquire the upscaled first residual image of high resolution. The second learning network model may identify the texture regions based on the texture characteristic among the various image characteristics of the input image 10', enhance the identified texture regions, and thereby acquire the upscaled second residual image of high resolution.

In addition, the processor 120 according to an embodiment of the present disclosure may upscale the input image 10' and acquire the third image of high resolution. Here, the third image may be an image acquired by upscaling the original image, rather than a residual image.

According to an embodiment of the present disclosure, the processor 120 may mix the first to third images and acquire the output image 20 having a resolution greater than that of the input image 10'. Here, the output image 20 may be an upscaled image in which the edge regions and texture regions have been enhanced, rather than an image in which only the resolution has been upscaled. Meanwhile, this is merely an example, and the processor 120 may acquire a plurality of residual images in which various image characteristics of the input image 10' have been enhanced, mix the third image that upscales the input image 10' with the plurality of residual images, and acquire the output image 20 based on the result of mixing the images.

FIG. 8 is a diagram schematically illustrating an operation of a second learning network model according to an embodiment of the present disclosure.

The processor 120 according to an embodiment of the present disclosure may apply the image 10 as an input to the second learning network model and acquire the second image in which the texture has been enhanced.

The second learning network model according to an embodiment of the present disclosure may store a plurality of filters corresponding to each of a plurality of image patterns. Here, the plurality of image patterns may be classified according to the characteristics of image blocks. For example, a first image pattern may be an image pattern having a large number of lines in a horizontal direction, and a second image pattern may be an image pattern having a large number of lines in a rotated direction. The plurality of filters may be filters pre-learned by an artificial intelligence algorithm.

In addition, the second learning network model according to an embodiment of the present disclosure may read image blocks of a predetermined size from the image 10. Here, an image block may be a group of a plurality of pixels including a subject pixel included in the image 10 and a plurality of surrounding pixels. As an example, the second learning network model may read a first image block of 3×3 pixel size at the upper-left end of the image 10 and perform image processing on the first image block. Then, the second learning network model may scan to the right from the upper-left end of the image 10 by as much as a unit pixel, read a second image block of 3×3 pixel size, and perform image processing on the second image block. By scanning the pixel blocks in this manner, the second learning network model may perform image processing on the image 10. Meanwhile, the second learning network model may autonomously read the first to n-th image blocks from the image 10, or the processor 120 may sequentially apply the first to n-th image blocks as inputs to the second learning network model and perform image processing on the image 10.
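The block-scanning pass described above can be sketched as follows. The dimensions and the generator-based design are illustrative choices, not the disclosure's implementation.

```python
# Sketch of the scanning step: read 3x3 image blocks from the image,
# shifting one pixel at a time from the upper-left end.

def iter_blocks(img, size=3):
    """Yield every size x size block of img in scan order."""
    h, w = len(img), len(img[0])
    for top in range(h - size + 1):
        for left in range(w - size + 1):
            yield [row[left:left + size] for row in img[top:top + size]]

img = [[p + 10 * r for p in range(5)] for r in range(4)]  # toy 4x5 image
blocks = list(iter_blocks(img))
# A 4x5 image yields (4-3+1) * (5-3+1) = 6 blocks of size 3x3.
```

Each yielded block corresponds to one subject pixel plus its surrounding pixels, matching the "first image block to n-th image block" sequence described above.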

To detect the high-frequency components in an image block, the second learning network model may apply a filter of a predetermined size to the image block. As an example, the second learning network model may apply a 3×3 Laplacian filter 810 corresponding to the size of the image block to the image block, and thereby remove the low-frequency components in the image 10 and detect the high-frequency components. As another example, the second learning network model may acquire the high-frequency components of the image 10 by applying various types of filters (e.g., Sobel, Prewitt, Roberts, Canny, etc.) to the image block.
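The Laplacian step can be sketched directly: a flat block produces a zero response, while a block containing a sharp transition produces a non-zero high-frequency response. The kernel below is the common 4-neighbour Laplacian, used here as an assumed illustrative choice.

```python
# Sketch: correlate a 3x3 Laplacian kernel with a 3x3 image block to
# isolate its high-frequency component (one response value per block).
LAPLACIAN = [[0,  1, 0],
             [1, -4, 1],
             [0,  1, 0]]

def apply_kernel(block, kernel):
    """Sum of elementwise products of a 3x3 kernel and a 3x3 block."""
    return sum(block[i][j] * kernel[i][j] for i in range(3) for j in range(3))

flat = [[5, 5, 5]] * 3                    # constant block: no high frequency
edge = [[0, 0, 0], [0, 0, 0], [9, 9, 9]]  # sharp transition at the bottom row
flat_response = apply_kernel(flat, LAPLACIAN)   # 0
edge_response = apply_kernel(edge, LAPLACIAN)   # 9
```

Sobel, Prewitt, Roberts, or Canny-style kernels would be applied the same way, differing only in the kernel values and in how responses are combined.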

Then, the second learning network model may calculate a gradient vector 820 based on the high-frequency components acquired from the image block. Specifically, the second learning network model may calculate a horizontal gradient and a vertical gradient, and calculate the gradient vector based on the horizontal gradient and the vertical gradient. Here, the gradient vector may express the amount of change of each pixel relative to the pixels located in a predetermined direction. In addition, the second learning network model may classify the image block as one of the plurality of image patterns based on the directionality of the gradient vector.
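A minimal sketch of the gradient step, assuming central differences at the centre of a 3x3 block and an 8-bin quantization of the gradient orientation. Both assumptions are illustrative; the disclosure does not specify this exact scheme.

```python
# Sketch: horizontal and vertical gradients combined into a gradient
# vector, whose orientation can index a directional pattern class.
import math

def gradient_vector(block):
    gx = (block[1][2] - block[1][0]) / 2.0   # horizontal central difference
    gy = (block[2][1] - block[0][1]) / 2.0   # vertical central difference
    return gx, gy

def orientation_bin(gx, gy, bins=8):
    """Quantize the gradient direction (sign-insensitive) into a bin."""
    angle = math.atan2(gy, gx) % math.pi
    return int(angle / (math.pi / bins)) % bins

ramp = [[0, 10, 20]] * 3                 # intensity grows left to right
gx, gy = gradient_vector(ramp)           # (10.0, 0.0)
pattern = orientation_bin(gx, gy)        # bin 0: purely horizontal gradient
```

In this sketch the bin index plays the role of the pattern classification; the disclosure's index information would additionally distinguish flat blocks with no dominant direction.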

Next, the second learning network model may search for the filter to be applied to the high-frequency components detected from the image 10 (perform a filter search) 830 by using an index matrix 850. Specifically, the second learning network model may identify, based on the index matrix, index information indicating the pattern of the image block, and search 830 for the filter corresponding to the index information. For example, if the index information corresponding to the image block is identified as index information 32 among the index information 1 to 32 indicating the patterns of image blocks, the second learning network model may acquire the filter mapped to index information 32 from among the plurality of filters. Meanwhile, the specific index values above are merely examples, and the index information may decrease or increase according to the number of filters. In addition, the index information may be expressed in various ways other than integers.

Thereafter, the second learning network model may acquire at least one filter among the plurality of filters included in a filter database (DB) 860 based on the search result, apply 840 the at least one filter to the image block, and thereby acquire the second image. As an example, the second learning network model may identify, based on the search result, the filter corresponding to the pattern of the image block among the plurality of filters, apply the identified filter to the image block, and thereby acquire the second image in which the texture regions have been upscaled.

Here, the filters included in the filter database 860 may be acquired according to the result of learning the relationship between low-resolution image blocks and high-resolution image blocks by an artificial intelligence algorithm. For example, the second learning network model may learn, by an artificial intelligence algorithm, the relationship between a low-resolution first image block and a high-resolution second image block in which the texture regions of the first image block have been upscaled, identify the filter to be applied to the first image block, and store the identified filter in the filter database 860. However, this is merely an example, and the present disclosure is not limited thereto. For example, the second learning network model may identify a filter that enhances at least one of the various characteristics of an image block according to the result of learning using an artificial intelligence algorithm, and store the identified filter in the filter database 860.
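The index-to-filter lookup against the filter database 860 can be sketched as a simple mapping. The filter values, index keys, and fallback behavior below are invented placeholders, not the patent's actual learned filters.

```python
# Hypothetical sketch: index information obtained from the index matrix
# selects one pre-learned filter from a filter database.

FILTER_DB = {
    1:  [[0, 0, 0], [0, 1.2, 0], [0, 0, 0]],  # e.g. entry learned for pattern 1
    32: [[0, 0, 0], [0, 1.0, 0], [0, 0, 0]],  # flat region: pass-through entry
}

def lookup_filter(index_info, db=FILTER_DB):
    """Return the filter mapped to the index information; fall back to
    the pass-through entry when no learned filter matches."""
    return db.get(index_info, db[32])

chosen = lookup_filter(1)      # filter learned for pattern 1
fallback = lookup_filter(99)   # unknown pattern -> pass-through
```

In a full pipeline the chosen filter would then be applied to the image block (step 840); the pass-through fallback is a design choice of this sketch.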

FIG. 9 is a diagram illustrating index information according to an embodiment of the present disclosure.

The processor 120 according to an embodiment of the present disclosure may accumulate the index information of the image pattern corresponding to each classified image block and acquire an accumulation result. Referring to FIG. 9, the processor 120 may acquire, from among the index information indicating image patterns, the index information corresponding to the image pattern of each image block. Then, the processor 120 may accumulate the index information of each of the plurality of image blocks included in the image 10, and thereby acquire an accumulation result as illustrated in FIG. 9.

The processor 120 may analyze the accumulation result and identify the image 10 as one of a natural image or a graphic image. For example, if, based on the accumulation result, the number of image blocks that do not include a pattern (or do not show directionality) among the image blocks included in the image 10 is equal to or greater than a threshold value, the processor 120 may identify the image 10 as a graphic image. As another example, if the number of image blocks that do not include a pattern among the image blocks included in the image 10 is less than the threshold value, the processor 120 may identify the image 10 as a natural image based on the accumulation result. As still another example, if the number of image blocks having a pattern in a vertical direction or a pattern in a horizontal direction is equal to or greater than a threshold value, the processor 120 may identify the image 10 as a natural image based on the accumulation result. Meanwhile, the identification and classification of images described here are merely exemplary, and the threshold values may be specified according to the purpose of the manufacturer, the settings of the user, and the like.
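The accumulation-and-threshold rule above can be sketched as follows. The flat-block index value 32 follows the later equations in this section; the 0.5 threshold is an invented placeholder standing in for the manufacturer- or user-specified value.

```python
# Sketch: count blocks per index information, then compare the share of
# "no pattern" (flat) blocks against a threshold to pick the image type.
from collections import Counter

FLAT_INDEX = 32   # index information of blocks without a detected pattern

def classify(block_indices, flat_threshold=0.5):
    hist = Counter(block_indices)
    flat_ratio = hist[FLAT_INDEX] / len(block_indices)
    return "graphic" if flat_ratio >= flat_threshold else "natural"

mostly_flat = [32, 32, 32, 5]        # graphic-like accumulation result
mostly_patterned = [13, 14, 5, 32]   # natural-like accumulation result
```

The `Counter` here plays the role of the histogram in FIG. 9: one count per index information value.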

As another example, the processor 120 may calculate the number and proportion of specific index information based on the accumulation result, and identify the type of the image 10 as a natural image or a graphic image based on the accumulation result. For example, the processor 120 may calculate at least three features based on the accumulation result.

If specific index information among the index information is information indicating an image block whose pattern is not identified (or which does not show directionality), the processor 120 may calculate the proportion of that index information from the accumulation result. Hereinafter, an image block whose pattern is not identified is generally referred to as an image block including a flat area. The proportion of the image blocks including flat areas in the entire set of image blocks may be calculated based on Equation 3 below.

[Equation 3] P1 = Histogram[32] / Σ(i=1 to 32) Histogram[i]

Here, Histogram[i] means the number of image blocks having the index information 'i' identified based on the accumulation result. In addition, based on the assumption that the index information indicating an image block including a flat area is 32, Histogram[32] means the number of image blocks having index information 32, and P1 means the proportion of the image blocks including flat areas in the entire set of image blocks.

If an image block includes a pattern, the processor 120 may identify, based on the index information, whether the pattern is located in the central area inside the image block. As an example, compared to the image blocks whose index information is 1 to 12 and 17 to 31, the patterns of the image blocks whose index information is 13 to 16 may be located in the central area inside the block. Hereinafter, an image block whose image pattern is located in the central area inside the image block is generally referred to as a centrally distributed image block. Then, the processor 120 may calculate the proportion of the centrally distributed image blocks based on the accumulation result and Equation 4 below.

[Equation 4] P2 = Σ(i=13 to 16) Histogram[i] / Σ(i=1 to 31) Histogram[i]

Here, the processor 120 may calculate the number of image blocks having index information 1 to 31, Σ(i=1 to 31) Histogram[i], to identify the number of image blocks including patterns, excluding the image blocks including flat areas. In addition, the processor 120 may calculate the number of centrally distributed image blocks, Σ(i=13 to 16) Histogram[i]. Meanwhile, the image blocks having index information 13 to 16 are merely an example of the case in which the pattern is located in the central area inside the image block, and the present disclosure is not necessarily limited thereto. As another example, P2 may be calculated based on the number of index information 11 to 17.
Figure 109112722-A0305-02-0038-3
, to identify the number of image blocks including patterns other than image blocks including flat areas. In addition, the processor 120 may calculate the number of image blocks distributed in the center
Figure 109112722-A0305-02-0038-5
. Meanwhile, the image blocks with the index information 13 to 15 are only examples of the case where the pattern is located in the central area inside the image block and the present disclosure is not necessarily limited thereto. As another example, P2 may be calculated based on the number of index information 11-17.

Then, the processor 120 may acquire the average index information of the image 10 based on the index information of each of the plurality of image blocks included in the image 10. According to an embodiment of the present disclosure, the processor 120 may calculate the average index information based on Equation 5 below.

[Equation 5] P3 = Σ(i) (i × Histogram[i]) / Σ(i) Histogram[i]

Here, 'i' means the index information, Histogram[i] means the number of image blocks corresponding to index information i, and P3 means the average index information.

The processor 120 according to an embodiment of the present disclosure may calculate a value 'Y' based on Equation 6 below.

[Equation 6] Y = W1*P1 + W2*P2 + W3*P3 + Bias

Here, P1 means the proportion of the image blocks including flat areas, P2 means the proportion of the centrally distributed image blocks, and P3 means the average index information. In addition, W1, W2, W3, and Bias mean parameters pre-learned by using an artificial intelligence algorithm model.
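Equations 3 to 6 can be transcribed directly as follows. The equation forms follow the reconstructions above; the learned parameters W1, W2, W3, and Bias are invented placeholders, and the toy histogram is for demonstration only.

```python
# Sketch of the three features and the decision value:
# P1 (flat-block proportion), P2 (centrally distributed proportion),
# P3 (average index information), and Y.

def features(hist):
    """hist[i] = number of image blocks with index information i (1..32)."""
    total = sum(hist.get(i, 0) for i in range(1, 33))
    patterned = sum(hist.get(i, 0) for i in range(1, 32))
    p1 = hist.get(32, 0) / total                                 # Equation 3
    p2 = sum(hist.get(i, 0) for i in range(13, 17)) / patterned  # Equation 4
    p3 = sum(i * hist.get(i, 0) for i in range(1, 33)) / total   # Equation 5
    return p1, p2, p3

def classify_by_y(hist, w=(0.8, -0.4, 0.02), bias=-0.5):
    """Placeholder weights; the real W1..W3 and Bias are pre-learned."""
    p1, p2, p3 = features(hist)
    y = w[0] * p1 + w[1] * p2 + w[2] * p3 + bias                 # Equation 6
    return "graphic" if y > 0 else "natural"

hist = {13: 2, 14: 1, 32: 7}   # toy accumulation result
p1, p2, p3 = features(hist)    # p1 = 0.7, p2 = 1.0, p3 = 26.4
```

Note that `features` assumes at least one patterned block exists; a production implementation would guard the Equation 4 denominator against zero.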

If the value Y exceeds 0, the processor 120 according to an embodiment of the present disclosure may identify the image 10 as a graphic image, and if the value Y is equal to or smaller than 0, the processor 120 may identify the image 10 as a natural image.

Then, the processor 120 may adjust the first weight and the second weight respectively corresponding to the first image and the second image based on the identification result. As an example, if the image 10 is identified as a natural image, the processor 120 may increase at least one of the first weight corresponding to the first image or the second weight corresponding to the second image. In addition, the processor 120 may increase at least one of the parameters 'a' or 'b' in Equation 1 and Equation 2. Meanwhile, if the image 10 is a natural image, the transparency of the high-resolution image acquired by adding the first image in which the edges have been enhanced or the second image in which the texture has been enhanced to the image 10 or the input image 10' is improved, and therefore the processor 120 may increase at least one of the first weight or the second weight.

As another example, if the image 10 is identified as a graphic image, the processor 120 may decrease at least one of the first weight corresponding to the first image or the second weight corresponding to the second image. In addition, the processor 120 may decrease at least one of the parameters 'a' or 'b' in Equation 1 and Equation 2. Meanwhile, if the image 10 is a graphic image, adding the first image in which the edges have been enhanced or the second image in which the texture has been enhanced to the image 10 or the input image 10' may produce an image in which distortion occurs, and therefore the processor 120 may decrease at least one of the first weight or the second weight, and thereby minimize the occurrence of distortion.

Here, a graphic image may be an image obtained by manipulating an image of the real world, or an image newly created by using a computer, an imaging apparatus, or the like. For example, graphic images may include illustration images generated by using known software, computer graphics (CG) images, animation images, and the like. A natural image may be any image other than a graphic image. For example, natural images may include images of the real world captured by a photographing apparatus, landscape images, portrait images, and the like.
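The content-type-dependent weight adjustment described above can be sketched as follows; the adjustment step size and the [0, 1] clamping range are illustrative assumptions, not values given in the disclosure:

```python
def adjust_weights(w_edge, w_texture, image_type, step=0.1):
    """Raise the enhancement weights for natural images and lower them
    for graphic images, as described above. The step size and the
    clamping range are assumptions made for illustration."""
    if image_type == "natural":
        # Natural content tolerates stronger edge/texture enhancement.
        w_edge, w_texture = w_edge + step, w_texture + step
    elif image_type == "graphic":
        # Graphic/CG content distorts easily, so back the weights off.
        w_edge, w_texture = w_edge - step, w_texture - step
    clamp = lambda w: max(0.0, min(1.0, w))
    return clamp(w_edge), clamp(w_texture)

# Both weights rise for natural content and fall for graphic content.
w_e, w_t = adjust_weights(0.5, 0.5, "natural")
```

In a full pipeline the same adjustment would also apply to the parameters "a" and "b" of Equations 1 and 2 mentioned above.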

FIG. 10 is a diagram illustrating a method of obtaining a final output image according to an embodiment of the disclosure.

According to an embodiment of the disclosure, in a case where the final output image 30' (i.e., the displayed image) has a resolution greater than that of the output image 30, the processor 120 may, at operation S350, upscale the output image 30 to obtain the final output image 30'. For example, if the output image 30 is a 4K UHD image and the final output image is an 8K image, the processor 120 may upscale the output image 30 to an 8K UHD image to obtain the final output image 30'. Meanwhile, according to another embodiment of the disclosure, a separate processor that performs the upscaling of the output image 30 may be provided in the image processing apparatus 100. For example, the image processing apparatus 100 may include a first processor and a second processor, obtain the output image 30 in which the edges and the texture have been enhanced by using the first processor, and obtain, by using the second processor, the high-resolution final output image 30' in which the resolution of the output image 30 is increased.
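As a minimal stand-in for this final 2x step, a nearest-neighbour upscaler can be written as below; an actual device would use a dedicated hardware scaler or a learned upscaler rather than plain pixel repetition:

```python
import numpy as np

def upscale_2x(image: np.ndarray) -> np.ndarray:
    # Repeat every pixel twice along height and width (nearest neighbour).
    return image.repeat(2, axis=0).repeat(2, axis=1)

# A 4K UHD frame (3840x2160) becomes an 8K UHD frame (7680x4320).
frame_4k = np.zeros((2160, 3840, 3), dtype=np.uint8)
frame_8k = upscale_2x(frame_4k)
```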

Meanwhile, each of the first learning network model and the second learning network model according to the various embodiments of the disclosure may be an on-device machine learning model in which the image processing apparatus 100 performs learning by itself without relying on an external apparatus. However, this is merely an example, and some learning network models may be implemented so as to operate on the device while other learning network models operate on an external server.

FIG. 11 is a block diagram illustrating a detailed configuration of the image processing apparatus shown in FIG. 2.

According to FIG. 11, the image processing apparatus 100' includes the memory 110, the processor 120, an inputter 130, a display 140, an outputter 150, and a user interface 160. In describing the components shown in FIG. 11, redundant explanation of components similar to those shown in FIG. 2 will be omitted.

According to an embodiment of the disclosure, the memory 110 may be implemented as a single memory that stores data generated by the various operations according to the disclosure.

The memory 110 may also be implemented to include a first memory to a third memory.

The first memory may store at least a part of an image input through the inputter 130. Specifically, the first memory may store at least some areas of an input image frame. Here, the at least some areas may be the areas necessary for performing the image processing according to an embodiment of the disclosure. Meanwhile, according to an embodiment of the disclosure, the first memory may be implemented as an N-line memory. For example, the N-line memory may be a memory having a capacity corresponding to 17 horizontal lines, but the memory is not limited thereto. For example, in a case where a full HD image of 1080p (resolution 1920x1080) is input, only an image area of 17 lines of the full HD image is stored in the first memory. The reason the first memory is implemented as an N-line memory and stores only some areas of the input image frame for image processing is that the memory capacity of the first memory is limited due to hardware constraints. Meanwhile, the second memory may be a memory area allocated to the learning network models within the entire area of the memory 110.
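The N-line constraint can be mimicked in software with a fixed-length row buffer; the sketch below is purely illustrative, since the specification describes a hardware line memory rather than a software structure:

```python
from collections import deque

class LineMemory:
    """Keeps only the most recent n_lines rows of the incoming frame,
    analogous to the 17-line memory described above."""
    def __init__(self, n_lines=17):
        self.rows = deque(maxlen=n_lines)

    def push(self, row):
        self.rows.append(row)  # the oldest row is evicted automatically

    def window(self):
        return list(self.rows)

mem = LineMemory(17)
for row_index in range(1080):  # feed a full HD frame, row by row
    mem.push(row_index)
# Only the last 17 of the 1080 rows are still resident.
```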

The third memory is a memory in which the first image, the second image, and the output image are stored, and according to the various embodiments of the disclosure, the third memory may be implemented as memories of various sizes. According to an embodiment of the disclosure, the processor 120 applies the image 10, obtained by downscaling the input image 10', to the first learning network model and the second learning network model, and thus the size of the third memory, which stores the first image and the second image acquired from the first and second learning network models, may be implemented to be the same as or similar to the size of the first memory.

The inputter 130 may be a communication interface (e.g., a wired Ethernet interface or a wireless communication interface) that receives various types of content (e.g., an image signal from an image source). For example, the inputter 130 may receive an image signal in a streaming or download manner from an external apparatus (e.g., a source apparatus), an external storage medium (e.g., a USB device), an external server (e.g., web-based storage), or the like, through one or more networks such as the Internet, by communication methods such as AP-based Wi-Fi (wireless local area network (LAN)), Bluetooth, Zigbee, wired/wireless local area network (LAN), wide area network (WAN), Ethernet, Long Term Evolution (LTE), 5th-generation (5G) mobile communication, Institute of Electrical and Electronics Engineers (IEEE) 1394, High Definition Multimedia Interface (HDMI), Mobile High-Definition Link (MHL), Universal Serial Bus (USB), Display Port (DP), Thunderbolt, Video Graphics Array (VGA) port, red-green-blue (RGB) port, D-subminiature (D-SUB), Digital Visual Interface (DVI), and the like. In particular, a 5G communication system performs communication by using ultra-high frequency (millimeter wave (mmWave)) bands (e.g., millimeter wave frequency bands such as the 26, 28, 38, and 60 GHz bands), and the image processing apparatus 100 may transmit or receive 4K and 8K UHD images in a streaming environment.

Here, the image signal may be a digital signal, but the image signal is not limited thereto.

The display 140 may be implemented in various forms such as, for example, a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, a light-emitting diode (LED) display, a micro LED display, a quantum dot light-emitting diode (QLED) display, liquid crystal on silicon (LCoS), digital light processing (DLP), and a quantum dot (QD) display panel. In particular, the processor 120 according to an embodiment of the disclosure may control the display 140 to display the output image 30 or the final output image 30'. Here, the final output image 30' may include a 4K or 8K real-time UHD image, a streaming image, and the like.

The outputter 150 outputs a sound signal.

For example, the outputter 150 may convert a digital sound signal processed by the processor 120 into an analog sound signal, amplify the signal, and output it. For example, the outputter 150 may include at least one speaker unit, a digital-to-analog (D/A) converter, an audio amplifier, and the like, and may output at least one channel. According to an embodiment of the disclosure, the outputter 150 may be implemented to output various multi-channel sound signals. In this case, the processor 120 may control the outputter 150 to apply enhancement processing to the input sound signal, corresponding to the enhancement processing of the input image, and to output the processed signal. For example, the processor 120 may convert an input two-channel sound signal into a virtual multi-channel (e.g., 5.1-channel) sound signal, recognize the position at which the image processing apparatus 100 is placed within a room or building environment and process the signal into a stereo sound signal optimized for that space, or provide a sound signal optimized according to the type of the input image (e.g., the genre of the content). Meanwhile, the user interface 160 may be implemented as a device such as a button, a touch pad, a mouse, or a keyboard, or implemented as a touch screen or a remote control transceiver capable of receiving user input to perform both the display function described above and a manipulation input function. The remote control transceiver may receive a remote control signal from an external remote control apparatus, or transmit a remote control signal to it, through at least one of infrared communication, Bluetooth communication, or Wi-Fi communication. Meanwhile, although not shown in FIG. 9, according to an embodiment of the disclosure, pre-filtering that removes noise of the input image may be applied before the image processing. For example, conspicuous noise may be removed by applying a smoothing filter, such as a Gaussian filter, or a guided filter that filters the input image by comparing it with a predetermined guidance image.

FIG. 12 is a block diagram illustrating a configuration of an image processing apparatus for learning and using a learning network model according to an embodiment of the disclosure.

Referring to FIG. 12, a processor 1200 may include at least one of a learning part 1210 or a recognition part 1220. The processor 1200 in FIG. 12 may correspond to the processor 120 of the image processing apparatus 100 in FIG. 2 or to a processor of a data learning server.

The processor 1200 of the image processing apparatus 100 for learning and using the first learning network model and the second learning network model may include at least one of the learning part 1210 or the recognition part 1220.

The learning part 1210 according to an embodiment of the disclosure may acquire an image in which an image characteristic of the image 10 has been enhanced, and acquire an output image based on the image 10 and the image in which the image characteristic has been enhanced. Then, the learning part 1210 may generate or train a recognition model having a criterion for minimizing distortion of the image 10 and acquiring a high-resolution upscaled image corresponding to the image 10. In addition, the learning part 1210 may generate a recognition model having a determination criterion by using collected learning data.

As an example, the learning part 1210 may generate, train, or update a learning network model such that at least one of the edge areas or the texture areas of the output image 30 is enhanced relative to the edge areas or the texture areas of the input image 10'.

The recognition part 1220 may use predetermined data (e.g., an input image) as input data of the trained recognition model, and thereby estimate an object to be recognized or a situation included in the predetermined data.

At least a part of the learning part 1210 and at least a part of the recognition part 1220 may be implemented as a software module, or manufactured in the form of at least one hardware chip and mounted on an image processing apparatus. For example, at least one of the learning part 1210 or the recognition part 1220 may be manufactured in the form of a hardware chip dedicated to artificial intelligence (AI), or manufactured as a part of a conventional general-purpose processor (e.g., a CPU or an application processor) or a graphics-dedicated processor (e.g., a GPU), and mounted on the various types of image processing apparatuses or object recognition apparatuses described above. Here, a hardware chip dedicated to artificial intelligence is a dedicated processor specialized for probability operations; it has higher parallel processing performance than a conventional general-purpose processor, and can therefore quickly process operations in the artificial intelligence field such as machine learning. In a case where the learning part 1210 and the recognition part 1220 are implemented as one or more software modules (or program modules including instructions), the software modules may be stored in a non-transitory computer-readable medium. In this case, the software modules may be provided by an operating system (OS) or by a specific application. Alternatively, a part of the software modules may be provided by the operating system (OS), and the other part may be provided by a specific application.

In this case, the learning part 1210 and the recognition part 1220 may be mounted on one image processing apparatus, or may be respectively mounted on separate image processing apparatuses. For example, one of the learning part 1210 and the recognition part 1220 may be included in the image processing apparatus 100, and the other may be included in an external server. In addition, the learning part 1210 and the recognition part 1220 may be connected in a wired or wireless manner, or may be separate software modules of a larger software module or application. Model information constructed by the learning part 1210 may be provided to the recognition part 1220, and data input to the recognition part 1220 may be provided to the learning part 1210 as additional learning data.

FIG. 13 is a flowchart illustrating an image processing method according to an embodiment of the disclosure.

According to the image processing method shown in FIG. 13, first, at operation S1310, an image is applied to the first learning network model, and a first image in which the edges of the image have been enhanced is acquired.

Then, at operation S1320, the image is applied to the second learning network model, and a second image in which the texture of the image has been enhanced is acquired.

Next, at operation S1330, an edge area and a texture area included in the image are identified, a first weight is applied to the first image and a second weight is applied to the second image based on information on the edge area and the texture area, and an output image is acquired.
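Operations S1310 to S1330 reduce to the sketch below, with the two network calls replaced by placeholder functions, since the method itself does not fix a concrete network architecture:

```python
import numpy as np

def enhance_edges(img):    # stand-in for the first learning network model
    return img             # a real model would return an edge-enhanced image

def enhance_texture(img):  # stand-in for the second learning network model
    return img             # a real model would return a texture-enhanced image

def process(img, w_edge, w_texture):
    first = enhance_edges(img)      # operation S1310
    second = enhance_texture(img)   # operation S1320
    # Operation S1330: weight and combine the two enhanced images.
    return np.clip(w_edge * first + w_texture * second, 0.0, 1.0)
```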

Here, the first learning network model and the second learning network model may be learning network models of different types.

The first learning network model according to an embodiment of the disclosure may be one of a deep learning model that learns to enhance the edges of an image by using a plurality of layers, or a machine learning model trained to enhance the edges of an image by using a plurality of pre-learned filters.

In addition, the second learning network model according to an embodiment of the disclosure may be one of a deep learning model that learns to optimize the texture of an image by using a plurality of layers, or a machine learning model trained to optimize the texture of an image by using a plurality of pre-learned filters.

In addition, the operation S1330 of acquiring the output image may include a step of acquiring the first weight corresponding to the edge area and the second weight corresponding to the texture area based on ratio information of the edge area and the texture area.
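One possible reading of this ratio-based weighting is that the larger the share of the image the edge (or texture) region covers, the larger its weight; the normalisation below is an assumption for illustration, not a rule taken from the disclosure:

```python
import numpy as np

def weights_from_ratio(edge_mask: np.ndarray, texture_mask: np.ndarray):
    """Derive the two weights from the fraction of pixels each region
    occupies. Binary masks mark edge and texture pixels respectively."""
    edge_ratio = edge_mask.mean()
    texture_ratio = texture_mask.mean()
    total = edge_ratio + texture_ratio
    if total == 0:
        return 0.5, 0.5  # no detected regions: fall back to equal weights
    return edge_ratio / total, texture_ratio / total

edge = np.array([[1, 1], [0, 0]])     # 50% of pixels are edge pixels
texture = np.array([[0, 0], [1, 0]])  # 25% of pixels are texture pixels
w1, w2 = weights_from_ratio(edge, texture)
```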

In addition, the image processing method according to an embodiment of the disclosure may include a step of downscaling an input image and acquiring an image having a resolution smaller than that of the input image. Meanwhile, the first learning network model may acquire the first image by performing upscaling that enhances the edges of the image, and the second learning network model may acquire the second image by performing upscaling that enhances the texture of the image.

In addition, the image processing method according to an embodiment of the disclosure may include steps of acquiring area detection information in which the edge area and the texture area are identified based on the downscaled image, and providing the area detection information and the image to the first learning network model and the second learning network model, respectively.

Here, the step of providing the area detection information and the image to the first learning network model and the second learning network model, respectively, may include a step of, based on the area detection information, providing an image including only pixel information corresponding to the edge area to the first learning network model, and providing an image including only pixel information corresponding to the texture area to the second learning network model. Meanwhile, the first learning network model may acquire the first image by upscaling the edge area, and the second learning network model may acquire the second image by upscaling the texture area.
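The routing of region-specific pixel information can be sketched as follows; the masks would come from an edge/texture detector that is not shown here, and zeroing out non-region pixels is an assumed implementation:

```python
import numpy as np

def split_by_region(img, edge_mask, texture_mask):
    """Give each model only the pixels of 'its' region, the rest zeroed."""
    edge_input = np.where(edge_mask, img, 0)        # fed to the first model
    texture_input = np.where(texture_mask, img, 0)  # fed to the second model
    return edge_input, texture_input

img = np.arange(9).reshape(3, 3)
edge_mask = img % 2 == 0       # toy masks: even pixels "edge",
texture_mask = ~edge_mask      # odd pixels "texture"
e, t = split_by_region(img, edge_mask, texture_mask)
```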

In addition, the first image and the second image according to an embodiment of the disclosure may be a first residual image and a second residual image, respectively. In the operation S1330 of acquiring the output image, the first weight may be applied to the first residual image and the second weight may be applied to the second residual image, and the weighted residual images may then be mixed with the image to acquire the output image.

Further, the second learning network model may be a model that stores a plurality of filters corresponding to each of a plurality of image patterns, classifies each of the image blocks included in the image into one of the plurality of image patterns, and applies, to the image block, at least one filter corresponding to the classified image pattern among the plurality of filters to provide the second image.
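A toy version of this pattern-indexed filter bank is shown below; the two 3x3 kernels, the variance-based classification criterion, and the centre-only filtering are all invented for illustration and are not taken from the disclosure:

```python
import numpy as np

# Hypothetical filter bank: one kernel per image pattern index.
FILTER_BANK = {
    0: np.full((3, 3), 1 / 9.0),  # flat pattern: smoothing kernel
    1: np.array([[0, -1, 0],
                 [-1, 5, -1],
                 [0, -1, 0]], dtype=float),  # detailed pattern: sharpening kernel
}

def classify_block(block):
    # Assumed criterion: high variance means a "detailed" pattern (index 1).
    return 1 if block.var() > 0.01 else 0

def filter_block(block):
    kernel = FILTER_BANK[classify_block(block)]
    # Apply the kernel at the block centre only (a full convolution is omitted).
    return float((block * kernel).sum())
```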

In addition, the operation S1330 of acquiring the output image according to an embodiment of the disclosure may include steps of accumulating index information of the image patterns corresponding to each of the classified image blocks, identifying the type of the image (e.g., one of a natural image or a graphic image) based on the accumulation result, and adjusting the weights based on the identification result.

Here, the step of adjusting the weights may include steps of, based on the image being identified as a natural image, increasing at least one of the first weight corresponding to the first image or the second weight corresponding to the second image, and, based on the image being identified as a graphic image, decreasing at least one of the first weight or the second weight.
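One possible reading of the index-accumulation step is sketched below: count how often each pattern index occurs over the frame's blocks and call the image "graphic" when a single pattern dominates (graphic content tends to repeat flat, uniform patterns). The dominance criterion and the 0.5 threshold are assumptions, not values given in the disclosure:

```python
from collections import Counter

def identify_type(pattern_indices):
    """Classify the frame from the accumulated per-block pattern indices."""
    counts = Counter(pattern_indices)
    dominant_share = counts.most_common(1)[0][1] / len(pattern_indices)
    # Assumed rule: one pattern covering most blocks suggests graphic content.
    return "graphic" if dominant_share > 0.5 else "natural"
```

The resulting label would then drive the weight adjustment described above (increase for natural images, decrease for graphic images).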

The output image may be a 4K ultra-high-definition (UHD) image, and the image processing method according to an embodiment of the disclosure may include a step of upscaling the output image to an 8K UHD image.

Meanwhile, the various embodiments of the disclosure may be applied not only to image processing apparatuses but also to all electronic apparatuses capable of performing image processing, for example image receiving apparatuses such as a set-top box.

In addition, the various embodiments described above may be implemented in a recording medium readable by a computer or a computer-like apparatus, by using software, hardware, or a combination thereof. In some cases, the embodiments described in this specification may be implemented as the processor 120 itself. According to a software implementation, embodiments such as the procedures and functions described in this specification may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described in this specification.

Meanwhile, computer instructions for performing the processing operations of the image processing apparatus 100 according to the various embodiments of the disclosure described above may be stored in a non-transitory computer-readable medium. When the computer instructions stored in such a non-transitory computer-readable medium are executed by a processor of a specific apparatus, the processing operations of the image processing apparatus 100 according to the various embodiments described above are performed by the specific apparatus.

A non-transitory computer-readable medium refers to a medium that stores data semi-permanently, rather than for a short time as a register, a cache, or a memory does, and that is readable by a machine. Specific examples of non-transitory computer-readable media include a compact disc (CD), a digital versatile disc (DVD), a hard disk, a Blu-ray disc, a USB device, a memory card, and a ROM.

While embodiments of the disclosure have been shown and described, the disclosure is not limited to the specific embodiments described above, and it is apparent that various modifications may be made by those skilled in the art to which the disclosure belongs without departing from the gist of the disclosure as claimed in the appended claims. In addition, such modifications are not to be interpreted independently of the technical idea or prospect of the disclosure.

100: image processing apparatus / sound output apparatus

110: memory

120: processor

Claims (15)

1. An image processing apparatus, comprising: a memory storing computer-readable instructions; and a processor configured to execute the computer-readable instructions to: provide an input image as a first input of a first neural model; obtain, from the first neural model, a first image including enhanced edges optimized based on edges of the input image; provide the input image as a second input of a second neural model; obtain, from the second neural model, a second image including an enhanced texture optimized based on a texture of the input image; identify an edge area of the edges included in the input image to obtain a first weight corresponding to the edge area; identify a texture area of the texture included in the input image to obtain a second weight corresponding to the texture area; apply the first weight to the first image including the enhanced edges; apply the second weight to the second image including the enhanced texture; and obtain an output image based on the first image to which the first weight is applied and the second image to which the second weight is applied.

2. The image processing apparatus of claim 1, wherein a first type of the first neural model is different from a second type of the second neural model.
3. The image processing apparatus of claim 1, wherein the first neural model is one of a deep learning model that optimizes the edges of the input image by using a plurality of layers, or a machine learning model trained to optimize the edges of the input image by using a plurality of pre-learned filters.

4. The image processing apparatus of claim 1, wherein the second neural model is one of a deep learning model that optimizes the texture of the input image by using a plurality of layers, or a machine learning model trained to optimize the texture of the input image by using a plurality of pre-learned filters.

5. The image processing apparatus of claim 1, wherein the processor executing the computer-readable instructions is further configured to obtain the first weight and the second weight based on ratio information of the edge area in the input image and the texture area in the input image.
6. The image processing apparatus of claim 1, wherein the processor executing the computer-readable instructions is further configured to: downscale the input image to obtain a downscaled image having a resolution smaller than a resolution of the input image; provide the downscaled image as the first input of the first neural model; obtain the first image having the enhanced edges from the first neural model, which upscales the downscaled image; provide the downscaled image as the second input of the second neural model; and obtain the second image having the enhanced texture from the second neural model, which upscales the downscaled image.

7. The image processing apparatus of claim 1, wherein the processor executing the computer-readable instructions is further configured to: obtain first area detection information identifying the edge area of the input image and second area detection information identifying the texture area of the input image; provide the input image and the first area detection information as the first input of the first neural model; and provide the input image and the second area detection information as the second input of the second neural model.

8. The image processing apparatus of claim 7, wherein the first neural model obtains the first image by upscaling the edge area, and the second neural model obtains the second image by upscaling the texture area.
The image processing device according to claim 1, wherein the second neural model is a model that stores a plurality of filters corresponding to a plurality of image patterns, classifies each of the image blocks included in the input image into an image pattern among the plurality of image patterns, and provides, to the image block, at least one filter corresponding to the image pattern among the plurality of filters. The image processing device according to claim 1, wherein the first image is a first residual image and the second image is a second residual image, and wherein the processor executing the computer-readable instructions is further configured to: provide the first weight to the first residual image based on the edge region, provide the second weight to the second residual image based on the texture region, and after providing the first weight and the second weight, mix the first residual image, the second residual image, and the input image to obtain the output image.
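The filter-bank behavior in the pattern-classification claim above can be sketched as follows. Both the two-pattern classifier and the two toy kernels are stand-ins invented for illustration; the patent's model stores learned filters, one set per image pattern.

```python
# Hypothetical filter bank: one small kernel per pattern index.
PATTERN_FILTERS = {
    "flat":     [[1.0]],           # pass-through kernel for low-variation blocks
    "vertical": [[-1.0], [1.0]],   # vertical-detail kernel
}

def classify_block(block):
    """Map a 2x2 block ((a, b), (c, d)) to a pattern index by comparing
    row-to-row variation against column-to-column variation."""
    (a, b), (c, d) = block
    vertical_change = abs(a - c) + abs(b - d)
    horizontal_change = abs(a - b) + abs(c - d)
    return "vertical" if vertical_change > horizontal_change else "flat"

def select_filter(block):
    """Return the stored filter(s) for the block's classified pattern."""
    return PATTERN_FILTERS[classify_block(block)]
```

A real implementation would slide this over every block of the input, which is also what makes the per-block index statistics of claim 11 available.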
The image processing device according to claim 10, wherein the processor executing the computer-readable instructions is further configured to: accumulate index information of the image pattern corresponding to each of the image blocks of the input image, identify the input image as one of a natural image or a graphic image based on the index information, and adjust the first weight and the second weight based on the result of identifying the input image as one of the natural image or the graphic image. The image processing device according to claim 11, wherein the processor executing the computer-readable instructions is further configured to: increase at least one of the first weight or the second weight based on the input image being identified as the natural image, and decrease at least one of the first weight or the second weight based on the input image being identified as the graphic image.
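Claims 11 and 12 above accumulate pattern-index statistics and then nudge both weights by image type. A toy sketch, assuming that a single dominant pattern index marks synthetic "graphic" content and that a fixed step adjusts the weights; the 0.6 threshold and 0.1 step are invented here, not taken from the patent:

```python
from collections import Counter

def classify_image(pattern_indices, graphic_threshold=0.6):
    """Identify the image as 'graphic' when one accumulated pattern index
    dominates (flat, synthetic content tends to repeat few patterns),
    otherwise as 'natural'. Threshold is an illustrative assumption."""
    counts = Counter(pattern_indices)
    top_share = counts.most_common(1)[0][1] / len(pattern_indices)
    return "graphic" if top_share >= graphic_threshold else "natural"

def adjust_weights(w_edge, w_texture, image_type, step=0.1):
    """Increase both weights for natural images, decrease both for
    graphic images, per the claim's direction of adjustment."""
    if image_type == "natural":
        return w_edge + step, w_texture + step
    return w_edge - step, w_texture - step
```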
An image processing method of an image processing device, the method comprising: providing an input image as a first input to a first neural model; obtaining, from the first neural model, a first image including an enhanced edge optimized based on an edge of the input image; providing the input image as a second input to a second neural model; obtaining, from the second neural model, a second image including an enhanced texture optimized based on a texture of the input image; identifying an edge region including the edge in the input image to obtain a first weight corresponding to the edge region; identifying a texture region included in the input image to obtain a second weight corresponding to the texture region; providing the first weight to the first image including the enhanced edge; providing the second weight to the second image including the enhanced texture; and obtaining an output image based on the first image provided with the first weight and the second image provided with the second weight. The image processing method according to claim 13, wherein the first neural model is one of a deep learning model that optimizes the edge of the input image by using a plurality of layers, or a machine learning model trained to optimize the edge of the input image by using a plurality of pre-learned filters.
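The method claim above culminates in combining the two weighted branch outputs with the input. Under the residual-image reading of claim 10, the mixing step might be sketched as follows; `blend` is a hypothetical helper, and nested lists stand in for image tensors:

```python
def blend(input_img, edge_residual, texture_residual, w_edge, w_texture):
    """Mix the weighted edge residual and weighted texture residual back
    into the input image to form the output image.

    All three images must share the same dimensions; the additive
    formulation is an assumption consistent with residual images.
    """
    return [
        [x + w_edge * e + w_texture * t
         for x, e, t in zip(row_x, row_e, row_t)]
        for row_x, row_e, row_t in zip(input_img, edge_residual, texture_residual)
    ]

# One pixel: input 10, edge residual 2 at weight 0.5, texture residual 4
# at weight 0.25 -> 10 + 1 + 1.
out = blend([[10.0]], [[2.0]], [[4.0]], 0.5, 0.25)
```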
The image processing method according to claim 13, wherein the second neural model is one of a deep learning model that optimizes the texture of the input image by using a plurality of layers, or a machine learning model trained to optimize the texture of the input image by using a plurality of pre-learned filters.
TW109112722A 2019-05-22 2020-04-16 Image processing apparatus and image processing method thereof TWI768323B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2019-0060240 2019-05-22
KR20190060240 2019-05-22
KR1020190080346A KR102410907B1 (en) 2019-05-22 2019-07-03 Image processing apparatus and image processing method thereof
KR10-2019-0080346 2019-07-03

Publications (2)

Publication Number Publication Date
TW202044196A TW202044196A (en) 2020-12-01
TWI768323B true TWI768323B (en) 2022-06-21

Family

ID=73792011

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109112722A TWI768323B (en) 2019-05-22 2020-04-16 Image processing apparatus and image processing method thereof

Country Status (2)

Country Link
KR (1) KR102410907B1 (en)
TW (1) TWI768323B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634160A (en) * 2020-12-25 2021-04-09 北京小米松果电子有限公司 Photographing method and device, terminal and storage medium
US11720769B2 (en) * 2021-06-03 2023-08-08 Global Graphics Software Limited Methods and systems for enhancing raster image processing using artificial intelligence
TWI821715B (en) * 2021-07-20 2023-11-11 和碩聯合科技股份有限公司 Training method of generator network model and electronic device for execution thereof
KR102651559B1 (en) * 2021-12-08 2024-03-26 주식회사 딥엑스 Neural processing unit and artificial neural network system for image fusion
KR20230060338A (en) * 2021-10-27 2023-05-04 삼성전자주식회사 Image processing apparatus and method for processing image thereby
TWI806243B (en) 2021-11-17 2023-06-21 瑞昱半導體股份有限公司 Super resolution image generating device
CN114092353B (en) * 2021-11-19 2024-06-04 长春理工大学 Infrared image enhancement method based on weighted guide filtering
TWI805485B (en) * 2021-12-20 2023-06-11 財團法人工業技術研究院 Image recognition method and electronic apparatus thereof
CN114897773B (en) * 2022-03-31 2024-01-05 上海途巽通讯科技有限公司 Method and system for detecting distorted wood based on image processing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750695A (en) * 2012-06-04 2012-10-24 清华大学 Machine learning-based stereoscopic image quality objective assessment method
US20180075581A1 (en) * 2016-09-15 2018-03-15 Twitter, Inc. Super resolution using a generative adversarial network
CN108596841A (en) * 2018-04-08 2018-09-28 西安交通大学 A kind of method of Parallel Implementation image super-resolution and deblurring
TW201837854A (en) * 2017-04-10 2018-10-16 南韓商三星電子股份有限公司 System and method for deep learning image super resolution
CN108734645A (en) * 2017-04-24 2018-11-02 英特尔公司 neural network optimization mechanism
CN109934247A (en) * 2017-12-18 2019-06-25 三星电子株式会社 Electronic device and its control method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Kwan-Young Kim et al., "SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network", arXiv:1812.07174, 18 Dec 2018. *
Peng Ren et al., "Clustering-oriented Multiple Convolutional Neural Networks for Single Image Super-resolution", Cognitive Computation, 24 October 2017. *
Yang Zhao et al., "High Resolution Local Structure-Constrained Image Upsampling", IEEE Transactions on Image Processing, Vol. 24, No. 11, November 2015, pp. 4394-4407. *

Also Published As

Publication number Publication date
KR102410907B1 (en) 2022-06-21
KR20200135102A (en) 2020-12-02
TW202044196A (en) 2020-12-01

Similar Documents

Publication Publication Date Title
TWI768323B (en) Image processing apparatus and image processing method thereof
US11836890B2 (en) Image processing apparatus and image processing method thereof
US11315222B2 (en) Image processing apparatus and image processing method thereof
KR20200079697A (en) Image processing apparatus and image processing method thereof
CN113034358B (en) Super-resolution image processing method and related device
US20240265492A1 (en) Electronic device, control method thereof, and system
WO2018176186A1 (en) Semantic image segmentation using gated dense pyramid blocks
JP6978542B2 (en) Electronic device and its control method
KR102676093B1 (en) Electronic apparatus and control method thereof
KR102661879B1 (en) Image processing apparatus and image processing method thereof
CN114830168B (en) Image reconstruction method, electronic device, and computer-readable storage medium
KR102246110B1 (en) Display apparatus and image processing method thereof
CN114444650A (en) Method for improving accuracy of quantized multi-level object detection network
US11436442B2 (en) Electronic apparatus and control method thereof
KR20210108027A (en) Electronic apparatus and control method thereof
US11778240B2 (en) Banding artifact detection in images and videos
WO2024110799A1 (en) Electronic device and control method therefor
US20240221119A1 (en) Image-Filtering Interface
WO2021127963A1 (en) Image content classification
KR20240077363A (en) Electronic apparatus and control method thereof
KR20240115657A (en) Electronic apparatus and control method thereof
KR20230164980A (en) Electronic apparatus and image processing method thereof