WO2021039192A1 - Image processing apparatus, image processing method, and program - Google Patents

Image processing apparatus, image processing method, and program

Info

Publication number
WO2021039192A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
unit
information
area
removal
Prior art date
Application number
PCT/JP2020/027931
Other languages
English (en)
Japanese (ja)
Inventor
高橋 修一
Original Assignee
ソニー株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニー株式会社
Publication of WO2021039192A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/77 Retouching; Inpainting; Scratch removal
    • G06T 5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/387 Composing, repositioning or otherwise geometrically modifying originals
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Definitions

  • the present technology relates to an image processing apparatus, an image processing method and a program, and more particularly to an image processing apparatus for fitting an interpolated image into an additional information removal area of an input image.
  • In broadcast images, not only the image information itself but also additional information such as logos, telops, and graphics is superimposed and displayed to provide the viewer with explanations and supplementary information.
  • Examples of such additional information include information identifying a person appearing in the image, the location of a landscape, the logo mark of a program or its producer, and the like. In addition, information that requires breaking news, such as news flashes and weather information, may be superimposed and displayed regardless of the image content.
  • Techniques for removing such additional information superimposed and displayed on an image have been provided conventionally. For example, Patent Document 1 discloses a processing device that detects a telop by detecting the pixels constituting its characters, removes those pixels, and interpolates them either with information from other pixels in the same frame (spatial interpolation) or with information from the same pixels in other frames (temporal interpolation).
  • Patent Document 2 discloses a method of erasing a telop in which, based on the fineness or roughness of the texture in the vicinity of the telop, the pixels in the telop region are interpolated either by propagating pixel values from outside the region into the region or by repeatedly copying several nearby pixels. Further, Patent Document 3 discloses a video/audio signal recording device that erases and interpolates a telop display inserted in a video signal by using an interpolation signal containing the original video signal information in which the telop is not inserted.
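  • For reference, the two conventional approaches just described can be illustrated with a short sketch. This is an assumption for illustration only, not code taken from the cited documents: spatial interpolation fills the telop pixels from surrounding pixels in the same frame, while temporal interpolation copies the same pixels from another frame in which the telop is absent.
```python
# Illustrative sketch of the two conventional interpolation approaches
# (assumed implementations, not taken from the cited documents).
import cv2
import numpy as np

def spatial_interpolation(frame: np.ndarray, telop_mask: np.ndarray) -> np.ndarray:
    """Fill the masked telop pixels from surrounding pixels in the same frame."""
    # OpenCV's Telea inpainting propagates neighbouring pixel values into the hole.
    return cv2.inpaint(frame, telop_mask, 3, cv2.INPAINT_TELEA)

def temporal_interpolation(frame: np.ndarray, telop_mask: np.ndarray,
                           reference_frame: np.ndarray) -> np.ndarray:
    """Fill the masked telop pixels with the same pixels from another frame
    in which the telop is not displayed."""
    out = frame.copy()
    out[telop_mask > 0] = reference_frame[telop_mask > 0]
    return out
```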
  • With the technique of Patent Document 1, when the pixel group of a telop or graphics that is always displayed in the image is removed, the pixel group does not change with time, so temporal interpolation cannot be performed and spatial interpolation is the only option. However, Patent Document 1 does not specifically describe what kind of spatial interpolation is performed.
  • The purpose of the present technology is to fit an interpolated image well into the additional information removal area of an input image.
  • An image processing apparatus including a processing unit that generates an interpolated image to be fitted in a removal region of an input image using a generation neural network and fits the interpolated image into the removal region of the input image to obtain an output image.
  • the processing unit generates an interpolated image to be fitted in the removal area of the input image using a generation neural network, and fits this interpolated image in the removal area of the input image to obtain an output image.
  • In the present technology, for example, the processing unit may include: a segmentation processing unit that performs segmentation processing on the input image to obtain area information of each subject included in the image; a training data area designation unit that designates, on the input image, a data area that can be used for the training data required for learning by the generative neural network, based on the information of the removal area and the area information of each subject; a training data generation unit that generates training data from the input image based on this data area; a generation network learning unit that trains the generative neural network using the training data; an interpolated image generation unit that generates, using the generative neural network, the interpolated image to be fitted into the removal area; and an image integration unit that fits the interpolated image into the removal area of the input image to obtain the output image.
  • In this case, the generation network learning unit may perform further learning with the training data, starting from a generative neural network that has been trained in advance to generate images of the same type as the subject corresponding to the removal region. This makes it possible to efficiently train the generative neural network that generates the interpolated image to be fitted into the removal region with a small amount of training data.
  • an interpolated image to be fitted in the removal region of the input image is generated by using the generating neural network. Therefore, it is possible to fit an interpolated image with reduced visual discomfort in the removal region of the input image without using a special signal for interpolation.
  • a detection unit for obtaining information on the removal region included in the input image may be further provided.
  • the detection unit may obtain information on the removal region based on information from the outside. This makes it possible to improve the accuracy of the information in the removed area.
  • a processing update unit that updates the function of the processing unit based on the evaluation information of the output image may be further provided. This makes it possible to improve the interpolated image generated by the processing unit to a more appropriate one.
  • the processing update unit may update the function of the processing unit based on the external setting information. This makes it possible to make improvements efficiently.
  • FIG. 1 shows a configuration example of the television receiving device 10 as an embodiment.
  • the television receiving device 10 includes a receiving antenna 101, a digital broadcast receiving unit 102, a display unit 103, a recording / reproducing unit 104, an image processing unit 105, a CPU 106, and a user operation unit 107.
  • the CPU 106 controls the operation of each part of the television receiving device 10.
  • the user can perform various operation inputs by the user operation unit 107.
  • the user operation unit 107 includes a remote control unit, a touch panel unit that performs operation input by proximity / touch, a mouse, a keyboard, a gesture input unit that detects operation input with a camera, a voice input unit that is operated by voice, and the like.
  • the digital broadcast receiving unit 102 processes the television broadcast signal input from the receiving antenna 101 to obtain an image signal related to the broadcast content.
  • the recording / reproducing unit 104 records the image signal obtained by the digital broadcasting receiving unit 102 and reproduces it at an appropriate timing.
  • the display unit 103 displays an image based on the image signal obtained by the digital broadcast receiving unit 102 or the image signal reproduced by the recording / reproducing unit 104.
  • The image processing unit 105 performs image processing on the image signal reproduced by the recording/reproducing unit 104 and returns the processed image signal to the recording/reproducing unit 104 so that it is recorded as the processed image signal.
  • The image processing unit 105 detects an area of additional information such as a logo, telop, or graphics in the input image as a removal area, generates an interpolated image to be fitted into the removal area using a generative neural network, and fits this interpolated image into the removal area of the input image to obtain an output image.
  • The processing of the image processing unit 105 is performed together with the operation of reproducing the unprocessed image signal from, and recording the processed image signal to, the recording/reproducing unit 104. At this time, the image based on the reproduced image signal may or may not be displayed on the display unit 103. When it is displayed, it is also possible for the user to selectively designate the additional information to be removed. It is also conceivable that the image processing unit 105 performs image processing in real time on the received image signal obtained by the digital broadcast receiving unit 102.
  • FIG. 2 shows a configuration example of the image processing unit 105.
  • the image processing unit 105 has a detection unit 200, a removal unit 300, and a processing unit 400.
  • the input image signal is supplied to the detection unit 200 and the removal unit 300.
  • the detection unit 200 obtains information on a removal area (for example, an area of additional information such as a logo, telop, graphics, etc.) included in the input image.
  • the information of this removal area is supplied to the removal unit 300 and the processing unit 400.
  • the removal unit 300 refers to the information of the removal area and obtains an image obtained by removing the image of the removal area from the input image.
  • the image signal output from the removal unit 300 is supplied to the processing unit 400.
  • The processing unit 400 performs segmentation processing on the input image to obtain information on the area and type of each subject included in the image, refers to this information and the information on the removal area, generates, using a generative neural network, an interpolated image to be fitted into the removal area of the input image, and fits the interpolated image into the removal area of the input image to obtain an output image.
  • the output image signal is output from the processing unit 400.
  • FIG. 3 shows an example of the image state in each part of the image processing unit 105 shown in FIG.
  • FIG. 3A shows an input image of the detection unit 200.
  • the detection unit 200 detects the logo of the program existing in the upper left of the input image, and the area of the logo is set as the removal area.
  • the removal unit 300 obtains an image obtained by removing the image of the removal region from the input image.
  • the processing unit 400 performs segmentation processing on the input image to obtain information on the area and type of each subject included in the image.
  • the input image is divided into each subject area, and a label indicating the type is given to each area.
  • it is divided into bridge, sky, mountain, and water areas.
  • In the processing unit 400, the information on the area and type of each subject and the information on the removal area are referred to, and, as shown in FIG. 3E, the interpolated image to be fitted into the removal area is generated using the generative neural network. In this example, the processing unit 400 determines that it is desirable to interpolate bridge information in the removal region, and a generative neural network that has learned bridge information in advance is used to generate a plausible bridge texture as the interpolated image.
  • FIG. 3F shows an output image obtained by fitting the interpolated image obtained by the processing unit 400 into the removal region of the input image.
  • Since the pixel information of the surrounding pixels and of the preceding and following frames is not used as it is, but a pattern is generated in consideration of the meaning in the image, unnaturalness and discomfort can be significantly suppressed.
  • An example of the detection policy of the detection unit 200 is shown below (a rough code sketch follows this list).
  - Detect target areas such as telops and graphics. A telop may be determined based on the difference between preceding and following frames.
  - For a broadcast program, it is possible to predict in which area a telop or graphics will be displayed by analyzing past broadcasts, so the analysis result may be used.
  - The viewer's setting information, such as "remove only the program logo without erasing the telop" or "remove the telop and the program logo, but do not remove the graphics", may also be referred to.
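  • The frame-difference and constant-display cues above can be sketched roughly as follows; the thresholds and function names are illustrative assumptions, not values given in this application.
```python
# Rough sketch of the detection cues (thresholds are illustrative assumptions).
import numpy as np

def detect_static_overlay(frames: np.ndarray, var_threshold: float = 5.0) -> np.ndarray:
    """frames: (T, H, W) grayscale frames sampled across the programme.
    Pixels with almost no temporal change are candidates for a constantly
    displayed logo; returns a binary (H, W) mask."""
    temporal_var = frames.astype(np.float32).var(axis=0)
    return (temporal_var < var_threshold).astype(np.uint8)

def detect_transient_telop(prev_frame: np.ndarray, next_frame: np.ndarray,
                           diff_threshold: float = 20.0) -> np.ndarray:
    """The difference between preceding and following frames highlights
    regions where a telop appears or disappears."""
    diff = np.abs(next_frame.astype(np.float32) - prev_frame.astype(np.float32))
    return (diff > diff_threshold).astype(np.uint8)
```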
  • An example of the processing policy of the processing unit 400 is shown below.
  - Insert a new texture or the like into the area from which the telop or the like has been removed.
  - Generate and fit a plausible pattern using a generative neural network.
  • The most probable texture of the area to be interpolated may be estimated, and the generative network, or the training data for the network, may be selected according to the result.
  • the segmentation information of the scene may be used, and the information of the region having the same segmentation result as the region to be interpolated may be used for the learning data. Since learning can be performed based on information that has a higher correlation with the interpolation area, there is a high possibility that a more plausible pattern can be generated.
  • the removal unit 300 is not always necessary. This is because when the interpolated image is fitted into the removal region of the input image in the processing unit 400, the image in the removal region of the input image is substantially removed.
  • each part of the image processing unit 105 can be performed by software processing by a computer.
  • FIG. 4 shows a configuration example of the detection unit 200.
  • the detection unit 200 includes a change area determination unit 201, a chapter information recording unit 202, a change information recording unit 203, a removal area position identification unit 204, a removal information type determination unit 205, and a removal information recording unit 206.
  • the removal area position specifying unit 204 and the removal information type determination unit 205 constitute a removal area determination unit.
  • The change area determination unit 201 refers to a group of input images and determines the areas with change and the areas without change in the image. For this determination, information about one or more chapters assigned according to the content of the input image is referred to: a region that shows no change even across chapters is determined to be a constantly unchanging region, while a region that changes from moment to moment even within a chapter is determined to be a constantly changing region.
  • For example, when detecting a program logo, the following measures can be taken.
  - A logo is often displayed throughout the program, and its area does not change even when the chapter changes. Therefore, an area that does not change across chapters is judged to be a logo (information that is constantly displayed).
  - For a specific program, the area where the logo is displayed can be specified almost uniquely, so the position of the logo can be determined by acquiring information about the logo from the outside.
  • For telops, the presence or absence of a display and the displayed contents change with time depending on what is displayed. For example, if a telop is displayed at the lower-right position, it can be judged that information with high urgency is not being displayed there, and therefore that non-stationary information without urgency is being displayed. Also, for a specific program, the area where a telop is displayed can be specified almost uniquely, so the position of the explanatory information can be determined by acquiring information about the telop from the outside.
  • the chapter information recording unit 202 records information about the chapter.
  • the change information recording unit 203 records information regarding the presence or absence of a change in each area with respect to the input image, which is determined by the change area determination unit 201.
  • The removal area position specifying unit 204 identifies the position of the area to be removed from the input image, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, and outputs it as the position information of the removal area.
  • The removal information type determination unit 205 determines, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, what kind of information the area contains, for example, whether it is a program logo or a telop (explanatory information) and whether or not it is urgent, and outputs the result as the type information of the removal information.
  • the removal information recording unit 206 records information on the removal area of the input image. That is, the removal information recording unit 206 integrates and records the position information of the removal area and the type information of the removal information.
  • the detection unit 200 outputs the information of the removal area recorded in the removal information recording unit 206 as an output signal.
  • FIG. 5 shows a configuration example of the processing unit 400.
  • The processing unit 400 has a segmentation processing unit 401, a segmentation information recording unit 402, a learning data area designation unit 403, a learning data generation unit 404, a learning data recording unit 405, a generation network learning unit 406, a setting parameter recording unit 407, a generation model recording unit 408, an interpolation image generation unit 409, and an image integration unit 410.
  • the learning data generation unit 404 and the generation network learning unit 406 form a learning unit
  • the interpolation image generation unit 409 and the image integration unit 410 constitute an interpolation processing unit.
  • the segmentation processing unit 401 performs segmentation processing on the input image.
  • the segmentation processing unit 401 divides an area for each subject, and then assigns a label indicating what the subject is, that is, the type of the subject, for each area.
  • the segmentation information recording unit 402 records the area information of the subject obtained by the segmentation process and the label information in association with each other.
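  • As one possible concrete form of the segmentation processing unit 401 (the application does not prescribe a particular model), an off-the-shelf semantic segmentation network such as torchvision's DeepLabV3 could produce the per-pixel label map recorded by the segmentation information recording unit 402. The following is a minimal sketch under that assumption.
```python
# Minimal segmentation sketch; torchvision's DeepLabV3 stands in for the
# segmentation processing unit 401 (an assumption, not mandated by the application).
import numpy as np
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def segment(image) -> np.ndarray:
    """image: RGB PIL.Image. Returns an (H, W) label map of per-pixel class ids,
    i.e. the information recorded by the segmentation information recording unit 402."""
    with torch.no_grad():
        out = model(preprocess(image).unsqueeze(0))["out"]  # (1, num_classes, H, W)
    return out.argmax(dim=1)[0].numpy()
```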
  • The learning data area designation unit 403 refers to the information of the removal area supplied from, or referred to in, the detection unit 200, and further to the area information and the label information of each subject obtained by the segmentation processing described above, and designates on the input image the area of data that can be used for the training data, in order to construct the training data necessary for learning by the network that interpolates the removal area.
  • For example, if the segmentation result of the area including the removal area is a bridge, the entire area determined to be a bridge on the input image is designated as data that can be used to construct the training data. In this case, the same image as the one containing the removal region can be used for the training data, so texture generation with extremely high reproducibility can be performed.
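  • A minimal sketch of this designation follows, under the assumption that the segmentation label dominating the removal area (for example, "bridge") identifies the usable area and that training patches are then cropped from it; the helper names and patch sizes are illustrative, not specified in the application.
```python
# Sketch of the training data area designation (helper names and sizes are
# illustrative; the application only states that areas sharing the removal
# area's segmentation result are usable as training data).
import numpy as np

def designate_training_area(label_map: np.ndarray, removal_mask: np.ndarray) -> np.ndarray:
    """label_map: (H, W) per-pixel class ids; removal_mask: (H, W), 1 inside the
    removal area. Marks every pixel that shares the label dominating the removal
    area but lies outside it."""
    dominant = np.bincount(label_map[removal_mask > 0].ravel()).argmax()
    usable = (label_map == dominant) & (removal_mask == 0)
    return usable.astype(np.uint8)

def extract_patches(image: np.ndarray, usable_mask: np.ndarray,
                    patch: int = 64, stride: int = 32):
    """Yield training patches that lie entirely inside the usable area."""
    h, w = usable_mask.shape
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            if usable_mask[y:y + patch, x:x + patch].all():
                yield image[y:y + patch, x:x + patch]
```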
  • the learning data generation unit 404 configures the learning data by referring to the area of the input image that can be used for the learning data specified by the learning data area designation unit 403.
  • In this case, any of the following measures may be taken: composing the training data only from this input image, diverting data widely acquired based on a predetermined label (for example, a bridge), or combining both.
  • the learning data recording unit 405 records the learning data generated by the learning data generation unit 404.
  • the generating network learning unit 406 reads the learning data from the learning data recording unit 405 and learns the generating neural network used for the interpolation processing. In this case, for example, further learning may be performed using the training data based on a generating neural network that has been trained in advance to generate an image of the same type as the subject corresponding to the removal region. This makes it possible to efficiently train the learning of the generating neural network that generates the interpolated image to be fitted in the removal region with a small amount of learning data.
  • For example, a generative adversarial network (GAN) can be used as the generative neural network. A GAN has a structure in which a generator and a discriminator in the network compete with each other in an adversarial manner, and the performance of the generator can be improved by repeatedly pitting the generator against the discriminator. In this technique, both the generator and the discriminator need to be trained.
  • The generator may be trained in advance so that many patterns can be generated, and additional learning may then be performed to specialize it for the input image.
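  • The adversarial training described above can be sketched as a standard GAN loop (PyTorch assumed; the generator and discriminator architectures, and any pre-trained weights to be fine-tuned, are placeholders not specified in this application, and the discriminator is assumed to output one logit per patch).
```python
# Standard GAN training loop sketch (PyTorch assumed; generator/discriminator
# architectures and pre-trained weights are placeholders, not specified here).
import torch
import torch.nn as nn

def train_gan(generator, discriminator, patch_loader,
              epochs=5, z_dim=100, lr=2e-4, device="cpu"):
    bce = nn.BCEWithLogitsLoss()
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=(0.5, 0.999))
    generator.to(device)
    discriminator.to(device)
    for _ in range(epochs):
        for real in patch_loader:              # real: (B, C, H, W) training patches
            real = real.to(device)
            b = real.size(0)
            fake = generator(torch.randn(b, z_dim, device=device))

            # Discriminator step: real patches -> 1, generated patches -> 0.
            d_loss = (bce(discriminator(real), torch.ones(b, 1, device=device)) +
                      bce(discriminator(fake.detach()), torch.zeros(b, 1, device=device)))
            opt_d.zero_grad()
            d_loss.backward()
            opt_d.step()

            # Generator step: try to make the discriminator output 1 for fakes.
            g_loss = bce(discriminator(fake), torch.ones(b, 1, device=device))
            opt_g.zero_grad()
            g_loss.backward()
            opt_g.step()
    return generator
```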
  • the setting parameter recording unit 407 records parameters related to the type of learning data used by the learning data generation unit 404 and the specifications of the learning data, and parameters related to learning such as the number of learning times and the learning rate in the generation network learning unit 406.
  • the generation model recording unit 408 records the network model generated by the generation network learning unit 406. Here, a plurality of network models may be recorded.
  • the interpolated image generation unit 409 selects an appropriate network model from the generation model recording unit 408, and generates a texture most suitable for the removal area as an interpolated image.
  • the image integration unit 410 fits the interpolated image generated by the interpolation image generation unit 409 into the removal area of the input image, integrates it as a single image, and outputs it as an output image.
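  • The integration itself can be as simple as a mask composite, as sketched below; the application does not specify whether any blending is applied at the boundary, so the hard composite here is an assumption.
```python
# Sketch of the image integration step: a hard mask composite of the generated
# texture into the removal area (no edge blending is assumed).
import numpy as np

def integrate(input_image: np.ndarray, interpolated: np.ndarray,
              removal_mask: np.ndarray) -> np.ndarray:
    """input_image, interpolated: (H, W, 3); removal_mask: (H, W), 1 inside the
    removal area. Pixels outside the mask are taken unchanged from the input."""
    m = removal_mask[..., None].astype(np.float32)
    out = input_image.astype(np.float32) * (1.0 - m) + interpolated.astype(np.float32) * m
    return out.astype(input_image.dtype)
```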
  • In the above description, the detection unit 200 is arranged inside the image processing unit 105 like the processing unit 400. However, the same effect can be obtained even if the whole detection unit 200 is arranged outside the image processing unit 105 and an input image whose removal area has already been specified is input to the processing unit 400 arranged inside the image processing unit 105.
  • the flowchart of FIG. 6 shows an example of the processing procedure of the image processing unit 105 shown in FIG. In this example, the process of removing the image in the removal region from the input image in the removal unit 300 is omitted.
  • the processing procedure of the image processing unit 105 is roughly divided into two steps, a detection step and a processing step.
  • the detection step which is the process of the detection unit 200, will be described.
  • the detection step is composed of a change area determination step ST1, a removal area position identification step ST2, and a removal information type determination step ST3.
  • In the change area determination step ST1, the areas with change and the areas without change in the image are determined with reference to a group of input images.
  • For this determination, information about one or more chapters assigned according to the content of the input image is referred to: a region that shows no change even across chapters is determined to be a constantly unchanging region, while a region that changes from moment to moment even within a chapter is determined to be a constantly changing region.
  • In the removal area position identification step ST2, the position of the area to be removed is identified from the input image and output as the removal position information.
  • In the removal information type determination step ST3, it is determined what kind of information the area to be removed contains (for example, whether it is a program logo or a telop, and whether or not it is urgent), and the result is output as the type information.
  • the processing step is composed of a segmentation processing step ST4, a learning data area designation step ST5, a learning data generation step ST6, a generation network learning step ST7, an interpolation image generation step ST8, and an image integration step ST9.
  • In the segmentation processing step ST4, segmentation processing is performed on the input image. After the image is divided into areas for each subject, a label indicating what the subject is, that is, a label indicating its type, is given to each area.
  • In the learning data area designation step ST5, the removal information obtained in the removal area position identification step ST2 and the removal information type determination step ST3, and the area information and label information of each subject obtained in the segmentation processing step ST4, are referred to, and the area of data that can be used for the training data is designated on the input image.
  • In the learning data generation step ST6, the training data is constructed by referring to the area of the input image, designated in the learning data area designation step ST5, that can be used for the training data. In this case, any of the following means may be taken: composing the training data only from this input image, diverting data widely acquired based on a predetermined label, or combining both.
  • In the generation network learning step ST7, the training data generated in the learning data generation step ST6 is read out, and the generative neural network used for the interpolation processing is trained.
  • In the interpolated image generation step ST8, an appropriate generative neural network trained in the generation network learning step ST7 is selected, and the texture most suitable for the removal region is generated as the interpolated image.
  • In the image integration step ST9, the interpolated image generated in the interpolated image generation step ST8 is fitted into the removal area of the input image, integrated as a single image, and output as the output image.
  • the processing unit 400 uses a generating neural network to generate an interpolated image to be fitted in the removal region of the input image. Therefore, it is possible to fit an interpolated image with reduced visual discomfort in the removal region of the input image without using a special signal for interpolation.
  • Further, the generation network learning unit 406 of the processing unit 400 performs learning starting from a generative neural network that has been pre-trained to generate images of the same type as the subject corresponding to the removal region. Therefore, the generative neural network that generates the interpolated image to be fitted into the removal region can be trained efficiently with a small amount of training data.
  • FIG. 7 shows another configuration example of the detection unit 200.
  • the detection unit 200A will be described.
  • the parts corresponding to those in FIG. 4 are designated by the same reference numerals, and detailed description thereof will be omitted as appropriate.
  • The detection unit 200A also has the change area determination unit 201, the chapter information recording unit 202, the change information recording unit 203, the removal area position identification unit 204, the removal information type determination unit 205, and the removal information recording unit 206.
  • the removal area position identification unit 204 and the removal information type determination unit 205 can refer to external information.
  • The removal area position specifying unit 204 identifies the position of the area to be removed from the input image, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, and outputs it as position information.
  • In addition, (1) having the viewer himself or herself specify the area to be removed, (2) referring to information about the program being watched, and (3) referring to information on various events that occurred on that day also help to locate the area to be removed from the input image.
  • For example, regarding (1), when the viewer specifies that there is an area to be removed in the upper-left area of the image, it is possible to determine whether there is removable information in that area. Regarding (2), by acquiring the appearance and display position of the program logo, or the insertion position and size of a telop, from the information on the program being watched, it is possible to determine in which area on the screen there is information that can be a removal target.
  • Regarding (3), from information on events such as incidents, accidents, disasters, and elections that occurred on or around the day the program was broadcast, it can be predicted that breaking news, weather information, and the like will be displayed, and the position on the image where such information is likely to be displayed can be predicted. This makes it possible to accurately estimate the position of the removable region.
  • Further, the removal information type determination unit 205 determines, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, what kind of information the area contains, and outputs the result as type information. By also referring to the above-mentioned information (1) to (3), the type of information displayed in the removable area can be estimated with high accuracy.
  • FIG. 8 shows another configuration example of the processing unit 400.
  • the processing unit 400A will be described.
  • the parts corresponding to those in FIG. 5 are designated by the same reference numerals, and detailed description thereof will be omitted as appropriate.
  • This processing unit 400A also has a segmentation processing unit 401, a segmentation information recording unit 402, a learning data area designation unit 403, a learning data generation unit 404, a learning data recording unit 405, a generation network learning unit 406, a setting parameter recording unit 407, a generation model recording unit 408, an interpolation image generation unit 409, and an image integration unit 410.
  • The processing unit 400A further has an evaluation unit 411 that evaluates the output image, an evaluation information recording unit 412 that records the evaluation information, and a processing update unit 413 that determines, based on the evaluation information, whether or not to update the functions of the processing unit 400A.
  • the image integration unit 410 fits the interpolated image generated by the interpolation image generation unit 409 into the removal area of the input image, integrates it as a single image, and outputs it as an output image.
  • the evaluation unit 411 evaluates whether or not the output image is appropriate, and outputs the evaluation result as evaluation information.
  • As the evaluation method, it is conceivable that the evaluation is directly or explicitly input by the viewer using a remote controller or a terminal, obtained by measuring the viewer's line of sight, emotion, or biological information, or inferred from the viewer's voice information.
  • The processing update unit 413 determines how to update the functions of the segmentation processing unit 401, the learning data area designation unit 403, the learning data generation unit 404, the generation network learning unit 406, the setting parameter recording unit 407, and the like in the processing unit 400A, and then updates those functions if it determines that an update is necessary.
  • For the segmentation processing unit 401, it is conceivable to update the segmentation method and the number of classes. If a region that could not be classified by the conventional segmentation method can be accurately recognized and labeled, a texture with higher reproduction accuracy can be generated and fitted when that region is interpolated.
  • For the learning data area designation unit 403, the accuracy of the area designated as training data is improved, so that mistakenly identified training data can be prevented from being mixed into the training data composed of a specific label, and the learning performance of the generation network learning unit is improved.
  • For the learning data generation unit 404, it is conceivable to switch among the means of composing the training data only from the input image, diverting data widely acquired based on a predetermined label, or combining both, or to correct the labels. For the generation network learning unit 406, it is conceivable to update the network structure.
  • For the setting parameter recording unit 407, it is conceivable to update the parameters related to the type and specifications of the training data used in the learning data generation unit 404, and the parameters related to learning, such as the number of learning iterations and the learning rate, in the generation network learning unit 406.
  • FIG. 9 shows yet another configuration example of the processing unit 400.
  • This processing unit will be described as the processing unit 400B.
  • the parts corresponding to those in FIG. 8 are designated by the same reference numerals, and detailed description thereof will be omitted as appropriate.
  • The basic configuration of the processing unit 400B is the same as that of the processing unit 400A shown in FIG. 8. However, when the processing update unit 413 determines whether or not to update a function in the processing unit 400B, not only the evaluation information but also external setting information is used.
  • As described above, the processing update unit 413 determines, based on the evaluation information, how to update each function in the processing unit 400A (400B), and then updates the function if it determines that an update is necessary.
  • The processing update unit 413 in the processing unit 400B further uses setting information that can be obtained from the outside, so that the functions of the segmentation processing unit 401, the learning data area designation unit 403, the learning data generation unit 404, the generation network learning unit 406, and the setting parameter recording unit 407 in the processing unit 400B can be updated in the same manner. For example, it is conceivable to use the setting information in the processing unit of another viewer having the same functions as the processing unit 400B, together with that viewer's evaluation information. This makes it possible to make improvements efficiently.
  • the present technology can have the following configurations.
  • (1) An image processing apparatus including a processing unit that generates an interpolated image to be fitted into a removal region of an input image using a generative neural network, and fits the interpolated image into the removal region of the input image to obtain an output image.
  • (2) The image processing apparatus according to (1) above, in which the processing unit includes: a segmentation processing unit that performs segmentation processing on the input image to obtain area information of each subject included in the image; a learning data area designation unit that designates, on the input image, a data area that can be used for the training data required for learning by the generative neural network, based on the information of the removal area and the area information of each subject; a learning data generation unit that generates training data from the input image based on the data area; a generation network learning unit that trains the generative neural network using the training data; an interpolated image generation unit that generates the interpolated image to be fitted into the removal region using the generative neural network; and an image integration unit that fits the interpolated image into the removal region of the input image to obtain the output image.
  • (3) The image processing apparatus according to (2) above, in which the generation network learning unit performs further learning with the training data, starting from a generative neural network that has been previously trained to generate images of the same type as the subject corresponding to the removal region.
  • (4) The image processing apparatus according to any one of (1) to (3) above, further comprising a detection unit for obtaining information on the removal region included in the input image.
  • (5) The image processing apparatus according to (4) above, wherein the detection unit obtains information on the removal area based on information from the outside.
  • (6) The image processing apparatus according to any one of (1) to (5) above, further comprising a processing update unit that updates the function of the processing unit based on the evaluation information of the output image.
  • (7) The image processing apparatus according to (6) above, wherein the processing update unit further updates the function of the processing unit based on external setting information.
  • An image processing method comprising a procedure of generating an interpolated image to be fitted in a removal region of an input image using a generation neural network, and fitting the interpolated image into the removal region of the input image to obtain an output image.
  • A program that causes a computer to function as processing means for generating, using a generative neural network, an interpolated image to be fitted into the removal area of an input image, and fitting the interpolated image into the removal area of the input image to obtain an output image.
  • An image processing apparatus including:
    a change area determination unit that determines, with reference to the input image, an area with change and an area without change in the image;
    a chapter information recording unit in which information about chapters of the input image is recorded;
    a change information recording unit that records information regarding the presence or absence of change in each area of the input image;
    a removal area position identification unit that identifies the position of the area to be removed from the input image, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, and outputs it as removal position information;
    a removal information type determination unit that determines, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, what kind of information the area contains, and outputs the result as type information;
    a segmentation processing unit that performs segmentation processing on the input image;
    a segmentation information recording unit that records the area information of each subject obtained by the segmentation processing in association with its label information;
    a learning data area designation unit that designates, on the input image, the area of data that can be used for the training data, in order to construct the training data necessary for learning by the network that interpolates the removal area;
    a learning data generation unit that generates training data by referring to the area of the input image, designated by the learning data area designation unit, that can be used for the training data;
    a learning data recording unit that records the training data generated by the learning data generation unit;
    a generation network learning unit that reads the training data from the learning data recording unit and trains the generative neural network used for the interpolation processing;
    a setting parameter recording unit that records parameters related to the type and specifications of the training data used in the learning data generation unit, and parameters related to learning, such as the number of learning iterations and the learning rate, in the generation network learning unit;
    a generation model recording unit that records the network model generated by the generation network learning unit;
    an interpolated image generation unit that selects an appropriate network model from the generation model recording unit and generates the texture best suited to the removal area as an interpolated image; and
    an image integration unit that fits the interpolated image generated by the interpolated image generation unit into the corresponding area of the input image, integrates it as a single image, and outputs it as an output image.
  • The image processing apparatus according to (10) or (11) above, further comprising: an evaluation unit that evaluates the quality of the image output from the image integration unit; and a processing update unit that determines whether or not to update a function based on the evaluation information obtained by the evaluation unit.
  • An image processing method including:
    a change area determination step of determining, with reference to the input image, an area with change and an area without change in the image;
    a removal area position identification step of identifying the position of the area to be removed from the input image, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, and outputting it as removal position information;
    a removal information type determination step of determining, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, what kind of information the area contains, and outputting the result as type information;
    a segmentation processing step of performing segmentation processing on the input image;
    a learning data area designation step of designating, on the input image, the area of data that can be used for the training data, in order to construct the training data necessary for learning by the network for interpolating the removal area, with reference to the removal information recorded in the removal information recording unit; and
    an image integration step of fitting the interpolated image generated by the interpolated image generation unit into the corresponding region of the input image, integrating it as a single image, and outputting it as an output image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An interpolated image is fitted well into an additional information removal area of an input image. The interpolated image to be fitted into the removal area of the input image is generated using a generative neural network, and the interpolated image is fitted into the removal area of the input image, thereby obtaining an output image. Since the interpolated image to be fitted into the removal area of the input image is generated using the generative neural network, an interpolated image with reduced visual discomfort can be fitted into the removal area of the input image without using a special signal for interpolation.
PCT/JP2020/027931 2019-08-30 2020-07-17 Image processing apparatus, image processing method, and program WO2021039192A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019157826 2019-08-30
JP2019-157826 2019-08-30

Publications (1)

Publication Number Publication Date
WO2021039192A1 (fr) 2021-03-04

Family

ID=74684141

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/027931 WO2021039192A1 (fr) 2019-08-30 2020-07-17 Image processing apparatus, image processing method, and program

Country Status (1)

Country Link
WO (1) WO2021039192A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09200684 (ja) * 1996-01-12 1997-07-31 Sony Corp Video/audio signal recording device
JP2014212434 (ja) * 2013-04-18 2014-11-13 三菱電機株式会社 Video signal processing device and method, program, and recording medium
JP2019079114 (ja) * 2017-10-20 2019-05-23 キヤノン株式会社 Image processing apparatus, image processing method, and program
WO2019159424 (fr) * 2018-02-16 2019-08-22 新東工業株式会社 Evaluation system, device, method, and program, and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20858523; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20858523; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)