WO2021039192A1 - Image processing apparatus, image processing method, and program - Google Patents

Image processing apparatus, image processing method, and program Download PDF

Info

Publication number
WO2021039192A1
WO2021039192A1 (PCT/JP2020/027931)
Authority
WO
WIPO (PCT)
Prior art keywords
image
unit
information
area
removal
Prior art date
Application number
PCT/JP2020/027931
Other languages
French (fr)
Japanese (ja)
Inventor
高橋 修一
Original Assignee
ソニー株式会社 (Sony Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニー株式会社 (Sony Corporation)
Publication of WO2021039192A1 publication Critical patent/WO2021039192A1/en

Links

Images

Classifications

    • G06T5/77
    • G06T5/60
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/387 Composing, repositioning or otherwise geometrically modifying originals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Definitions

  • The present technology relates to an image processing apparatus, an image processing method, and a program, and more particularly to an image processing apparatus for fitting an interpolated image into an additional information removal area of an input image.
  • In an image, not only the image information itself but also additional information such as logos, telops, and graphics is superimposed and displayed to provide the viewer with explanations and supplementary information. Examples include the name of a person superimposed on the image, the location of a landscape, and the logo mark of a program or a producer.
  • In addition, information requiring immediacy, such as news and weather information, may be superimposed and displayed regardless of the image.
  • Techniques for removing such additional information superimposed and displayed in an image have been provided conventionally. For example, Patent Document 1 discloses a processing device that detects a telop by detecting the pixels constituting its characters, removes those pixels, and then interpolates them either with the information of other pixels in the same frame (spatial interpolation) or with the information of the same pixels in another frame (temporal interpolation).
  • Further, Patent Document 2 discloses a telop erasing method in which, based on the fineness or roughness of the texture in the vicinity of the telop, the pixels in the telop region are interpolated by propagating pixel values from outside the region into it or by repeatedly copying several nearby pixels. Further, Patent Document 3 discloses a video/audio signal recording device that erases and interpolates a telop display inserted into a video signal by using an interpolation signal containing the original video signal information without the telop.
  • In the invention of Patent Document 1, when the pixel group of a telop or graphics that is always displayed in the image is to be removed, that pixel group does not change over time, so temporal interpolation cannot be performed and spatial interpolation must be applied. However, Patent Document 1 does not specifically describe what kind of spatial interpolation is performed.
  • The purpose of the present technology is to fit an interpolated image well into the additional information removal area of an input image.
  • The present technology is an image processing apparatus including a processing unit that generates, using a generative neural network, an interpolated image to be fitted into a removal region of an input image, and fits the interpolated image into the removal region of the input image to obtain an output image.
  • In the present technology, the processing unit generates an interpolated image to be fitted into the removal area of the input image using a generative neural network, and fits this interpolated image into the removal area of the input image to obtain an output image.
  • For example, the processing unit may include: a segmentation processing unit that performs segmentation processing on the input image to obtain area information of each subject included in the image; a training data area designation unit that designates, on the input image, a data area usable as the training data required for training the generative neural network, based on the information of the removal area and the area information of each subject; a training data generation unit that generates training data from the input image based on this data area; a generative network learning unit that trains the generative neural network using this training data; an interpolated image generation unit that generates, using the generative neural network, the interpolated image to be fitted into the removal area; and an image integration unit that fits the interpolated image into the removal area of the input image to obtain the output image.
  • In this case, for example, the generative network learning unit may perform further training with the training data, starting from a generative neural network that has been pre-trained to generate images of the same type as the subject corresponding to the removal region. This makes it possible to efficiently train, with a small amount of training data, the generative neural network that generates the interpolated image to be fitted into the removal region.
  • As described above, in the present technology an interpolated image to be fitted into the removal region of the input image is generated using a generative neural network. Therefore, an interpolated image with reduced visual discomfort can be fitted into the removal region of the input image without using a special signal for interpolation.
  • a detection unit for obtaining information on the removal region included in the input image may be further provided.
  • the detection unit may obtain information on the removal region based on information from the outside. This makes it possible to improve the accuracy of the information in the removed area.
  • a processing update unit that updates the function of the processing unit based on the evaluation information of the output image may be further provided. This makes it possible to improve the interpolated image generated by the processing unit to a more appropriate one.
  • the processing update unit may update the function of the processing unit based on the external setting information. This makes it possible to make improvements efficiently.
  • FIG. 1 shows a configuration example of the television receiving device 10 as an embodiment.
  • the television receiving device 10 includes a receiving antenna 101, a digital broadcast receiving unit 102, a display unit 103, a recording / reproducing unit 104, an image processing unit 105, a CPU 106, and a user operation unit 107.
  • the CPU 106 controls the operation of each part of the television receiving device 10.
  • the user can perform various operation inputs by the user operation unit 107.
  • the user operation unit 107 includes a remote control unit, a touch panel unit that performs operation input by proximity / touch, a mouse, a keyboard, a gesture input unit that detects operation input with a camera, a voice input unit that is operated by voice, and the like.
  • the digital broadcast receiving unit 102 processes the television broadcast signal input from the receiving antenna 101 to obtain an image signal related to the broadcast content.
  • the recording / reproducing unit 104 records the image signal obtained by the digital broadcasting receiving unit 102 and reproduces it at an appropriate timing.
  • the display unit 103 displays an image based on the image signal obtained by the digital broadcast receiving unit 102 or the image signal reproduced by the recording / reproducing unit 104.
  • the image processing unit 105 performs image processing on the image signal reproduced by the recording / reproducing unit 104, returns the processed image signal to the recording / reproducing unit 104, and causes the image signal to be recorded as the processed image signal.
  • The image processing unit 105 detects an area of additional information such as a logo, telop, or graphics in the input image as a removal area, generates an interpolated image to be fitted into the removal area using a generative neural network, and fits this interpolated image into the removal area of the input image to obtain an output image.
  • The processing of the image processing unit 105 is performed along with the reproduction of the pre-processing image signal and the recording of the post-processing image signal in the recording/reproducing unit 104. At this time, the image based on the reproduced image signal may or may not be displayed on the display unit 103. When it is displayed, the user can also selectively specify which additional information should be removed. It is also conceivable that the image processing unit 105 performs image processing in real time on the received image signal obtained by the digital broadcast receiving unit 102. The overall detect, generate, and fit flow is sketched below.
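As a rough orientation only, the following is a minimal sketch of this detect, generate, and fit flow, assuming NumPy image arrays; detect_removal_mask and generate_interpolated_patch are hypothetical stand-ins for the detection unit 200 and the processing unit 400 described below, not functions defined in this document.

```python
import numpy as np

def process_frame(frame: np.ndarray,
                  detect_removal_mask,
                  generate_interpolated_patch) -> np.ndarray:
    """Detect the additional-information area, generate an interpolated
    patch with a generative model, and composite it into the frame.

    detect_removal_mask and generate_interpolated_patch are hypothetical
    callables standing in for the detection unit 200 and the processing
    unit 400 described in the text.
    """
    mask = detect_removal_mask(frame)                   # boolean HxW mask of the removal area
    patch = generate_interpolated_patch(frame, mask)    # full-frame image from the generator
    out = frame.copy()
    out[mask] = patch[mask]                             # fit the interpolated pixels into the removal area
    return out
```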
  • FIG. 2 shows a configuration example of the image processing unit 105.
  • the image processing unit 105 has a detection unit 200, a removal unit 300, and a processing unit 400.
  • the input image signal is supplied to the detection unit 200 and the removal unit 300.
  • the detection unit 200 obtains information on a removal area (for example, an area of additional information such as a logo, telop, graphics, etc.) included in the input image.
  • the information of this removal area is supplied to the removal unit 300 and the processing unit 400.
  • the removal unit 300 refers to the information of the removal area and obtains an image obtained by removing the image of the removal area from the input image.
  • the image signal output from the removal unit 300 is supplied to the processing unit 400.
  • The processing unit 400 performs segmentation processing on the input image to obtain information on the area and type of each subject included in the image, refers to this information and the information on the removal area, generates an interpolated image to be fitted into the removal area of the input image using a generative neural network, and fits the interpolated image into the removal area of the input image to obtain an output image.
  • the output image signal is output from the processing unit 400.
  • FIG. 3 shows an example of the image state in each part of the image processing unit 105 shown in FIG.
  • FIG. 3A shows an input image of the detection unit 200.
  • the detection unit 200 detects the logo of the program existing in the upper left of the input image, and the area of the logo is set as the removal area.
  • the removal unit 300 obtains an image obtained by removing the image of the removal region from the input image.
  • the processing unit 400 performs segmentation processing on the input image to obtain information on the area and type of each subject included in the image.
  • the input image is divided into each subject area, and a label indicating the type is given to each area.
  • it is divided into bridge, sky, mountain, and water areas.
  • In the processing unit 400, the information on the area and type of each subject and the information on the removal area are referred to, and, as shown in FIG. 3(e), the interpolated image to be fitted into the removal area is generated using the generative neural network. In the illustrated example, the processing unit 400 determines that it is desirable to interpolate bridge information in the removal region, and a generative neural network that has learned bridge information in advance is used to generate a plausible bridge texture as the interpolated image.
  • FIG. 3F shows an output image obtained by fitting the interpolated image obtained by the processing unit 400 into the removal region of the input image.
  • Since the pattern is generated in consideration of the meaning within the image, rather than by directly reusing the pixel information of neighboring pixels or of preceding and following frames, unnaturalness and visual discomfort can be significantly suppressed.
  • Examples of policies for detecting target areas such as telops and graphics are shown below. - A telop may be determined based on the frame difference between preceding and following frames. - In the case of a broadcast program, the area in which a telop or graphics will be displayed can be predicted by analyzing past broadcasts, so that analysis result may be used. - The viewer's setting information, such as "remove only the program logo without erasing the telop" or "remove the telop and the program logo, but do not remove the graphics", may also be referred to.
  • An example of the processing policy of the processing unit 400 is shown below. - Fit a new texture or the like into the area from which the telop or the like has been removed. - Generate and fit a plausible pattern using a generative neural network.
  • The most probable texture for the area to be interpolated may be estimated, and the generative network, or the training data for the network, may be selected according to the result.
  • Alternatively, the segmentation information of the scene may be used, and the information of regions having the same segmentation result as the region to be interpolated may be used as training data. Since training can then be performed on information that is more highly correlated with the interpolation area, a more plausible pattern is likely to be generated.
  • Note that the removal unit 300 is not always necessary, because when the processing unit 400 fits the interpolated image into the removal region of the input image, the image in the removal region of the input image is substantially removed.
  • The processing of each part of the image processing unit 105 can also be performed by software processing on a computer.
  • FIG. 4 shows a configuration example of the detection unit 200.
  • the detection unit 200 includes a change area determination unit 201, a chapter information recording unit 202, a change information recording unit 203, a removal area position identification unit 204, a removal information type determination unit 205, and a removal information recording unit 206.
  • the removal area position specifying unit 204 and the removal information type determination unit 205 constitute a removal area determination unit.
  • The change area determination unit 201 refers to a group of input images and determines which areas of the image change and which do not. For this determination, information about one or more chapters assigned according to the content of the input image is referred to: a region that shows no change even across chapter boundaries is judged to be a steadily unchanging region, while a region in which changes are observed over time even within a chapter is judged to be a steadily changing region.
  • For example, when detecting a program logo, the following measures can be taken (a sketch of the chapter-based approach is given after this list). - The logo is often displayed throughout the program, and its area does not change even when the chapter changes; therefore, an area that does not change across chapters is judged to be the logo (information that is constantly displayed). - For a specific program, the area where the logo is displayed can be identified almost uniquely, so the position of the logo can be determined by acquiring information about the logo from the outside.
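The chapter-based idea above can be illustrated as follows. This is a sketch under the assumption that frames are grayscale NumPy arrays and chapter boundaries are known frame indices; the threshold and the one-frame-per-chapter sampling are illustrative choices, not taken from the document.

```python
import numpy as np

def static_region_mask(frames: np.ndarray, chapter_starts: list[int],
                       diff_threshold: float = 2.0) -> np.ndarray:
    """Return a boolean mask of pixels that stay (nearly) unchanged across
    chapter boundaries, i.e. candidates for a constantly displayed logo.

    frames: array of shape (T, H, W) with grayscale frames.
    chapter_starts: frame indices at which each chapter begins.
    """
    # Take one representative frame per chapter.
    samples = frames[chapter_starts].astype(np.float32)   # (C, H, W)
    # A pixel counts as "unchanged" if its maximum deviation from the
    # first sampled chapter frame stays below the threshold.
    deviation = np.abs(samples - samples[0:1]).max(axis=0)
    return deviation < diff_threshold
```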
  • In the case of telops, whether a telop is displayed and what it contains change over time depending on the content being shown.
  • For example, for a telop displayed at the lower right, it can be judged that information with high urgency is not being displayed there, and therefore that non-stationary information without urgency is being displayed.
  • In addition, the area where a telop is displayed can be identified almost uniquely, so the position of the explanatory information can be determined by acquiring information about the telop from the outside.
  • the chapter information recording unit 202 records information about the chapter.
  • the change information recording unit 203 records information regarding the presence or absence of a change in each area with respect to the input image, which is determined by the change area determination unit 201.
  • The removal area position specifying unit 204 identifies the position of the area to be removed from the input image based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, and outputs it as the position information of the removal area.
  • The removal information type determination unit 205 determines, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, what kind of information each area contains, for example whether it is a program logo or a telop (explanatory information) and whether it is urgent, and outputs the result as the type information of the removal information.
  • the removal information recording unit 206 records information on the removal area of the input image. That is, the removal information recording unit 206 integrates and records the position information of the removal area and the type information of the removal information.
  • the detection unit 200 outputs the information of the removal area recorded in the removal information recording unit 206 as an output signal.
  • FIG. 5 shows a configuration example of the processing unit 400.
  • The processing unit 400 has a segmentation processing unit 401, a segmentation information recording unit 402, a learning data area designation unit 403, a learning data generation unit 404, a learning data recording unit 405, a generative network learning unit 406, a setting parameter recording unit 407, a generative model recording unit 408, an interpolated image generation unit 409, and an image integration unit 410.
  • The learning data generation unit 404 and the generative network learning unit 406 constitute a learning unit, and the interpolated image generation unit 409 and the image integration unit 410 constitute an interpolation processing unit.
  • the segmentation processing unit 401 performs segmentation processing on the input image.
  • the segmentation processing unit 401 divides an area for each subject, and then assigns a label indicating what the subject is, that is, the type of the subject, for each area.
  • the segmentation information recording unit 402 records the area information of the subject obtained by the segmentation process and the label information in association with each other.
  • The learning data area designation unit 403 refers to the information of the removal area supplied from the detection unit 200, and further to the area information and label information of each subject obtained by the segmentation process described above, and designates, on the input image, the area of data that can be used as training data, in order to construct the training data necessary for training the network that interpolates the removal area.
  • In the illustrated example, the segmentation result containing the removal area is a bridge, so the entire area determined to be a bridge on the input image is designated as the data to be used for constructing the training data.
  • In this case, the very image containing the removal region can be used for the training data, so texture generation with extremely high reproducibility can be performed. (A sketch of gathering such same-label training patches is shown below.)
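The following sketch illustrates one way to gather such same-label training patches; the patch size, stride, and majority-label rule are illustrative assumptions, not details given in the document.

```python
import numpy as np

def same_label_patches(image: np.ndarray, seg_labels: np.ndarray,
                       removal_mask: np.ndarray, patch: int = 64,
                       stride: int = 32) -> list[np.ndarray]:
    """Collect training patches from regions sharing the segmentation
    label of the removal area (e.g. 'bridge'), excluding the removal
    area itself.

    image: (H, W, 3) input frame.
    seg_labels: (H, W) integer label map from the segmentation step.
    removal_mask: (H, W) boolean mask of the area to be removed.
    """
    # The majority label inside the removal area decides which subject
    # the interpolated texture should imitate.
    target_label = np.bincount(seg_labels[removal_mask]).argmax()
    usable = (seg_labels == target_label) & ~removal_mask

    patches = []
    h, w = seg_labels.shape
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            # Keep a patch only if it lies entirely in the usable region.
            if usable[y:y + patch, x:x + patch].all():
                patches.append(image[y:y + patch, x:x + patch])
    return patches
```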
  • the learning data generation unit 404 configures the learning data by referring to the area of the input image that can be used for the learning data specified by the learning data area designation unit 403.
  • Any of the following measures may be taken: composing the training data only from this input image, reusing data widely acquired based on a predetermined label (for example, "bridge"), or combining both.
  • the learning data recording unit 405 records the learning data generated by the learning data generation unit 404.
  • The generative network learning unit 406 reads the training data from the learning data recording unit 405 and trains the generative neural network used for the interpolation processing. In this case, for example, further training may be performed using the training data, starting from a generative neural network that has been pre-trained to generate images of the same type as the subject corresponding to the removal region. This makes it possible to efficiently train, with a small amount of training data, the generative neural network that generates the interpolated image to be fitted into the removal region.
  • As the generative neural network, for example, a generative adversarial network (GAN) can be used.
  • A GAN has a structure in which a generator and a discriminator within the network compete with each other in an adversarial manner. Repeatedly pitting the generator against the discriminator makes it possible to improve the performance of the generator. In this technique, both the generator and the discriminator need to be trained.
  • For example, the generator may be trained in advance so that many patterns can be generated, and additional training may then be performed to specialize it to the input image. (A minimal fine-tuning sketch follows.)
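As an illustration of such fine-tuning, the following PyTorch-style sketch adversarially updates a pre-trained generator and discriminator on patches taken from the input image (see the earlier sketch). The network interfaces, batch size, and hyper-parameters are assumptions, not specified in the document.

```python
import torch
import torch.nn as nn

def finetune_gan(generator: nn.Module, discriminator: nn.Module,
                 patches: torch.Tensor, steps: int = 500,
                 z_dim: int = 128, lr: float = 2e-4) -> None:
    """Adversarially fine-tune a pre-trained generator/discriminator pair
    on patches cut from the input image.

    patches: tensor of shape (N, 3, P, P), values scaled to [-1, 1].
    The generator is assumed to map a latent vector to a (3, P, P) patch,
    and the discriminator to output one logit per patch.
    """
    bce = nn.BCEWithLogitsLoss()
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=(0.5, 0.999))

    for _ in range(steps):
        real = patches[torch.randint(len(patches), (16,))]
        z = torch.randn(16, z_dim)
        fake = generator(z)

        # Discriminator: real patches -> 1, generated patches -> 0.
        d_loss = (bce(discriminator(real), torch.ones(16, 1))
                  + bce(discriminator(fake.detach()), torch.zeros(16, 1)))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

        # Generator: try to make the discriminator predict 1 for fakes.
        g_loss = bce(discriminator(fake), torch.ones(16, 1))
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
```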
  • The setting parameter recording unit 407 records parameters related to the type and specifications of the training data used by the learning data generation unit 404, and parameters related to training, such as the number of training iterations and the learning rate, in the generative network learning unit 406.
  • the generation model recording unit 408 records the network model generated by the generation network learning unit 406. Here, a plurality of network models may be recorded.
  • the interpolated image generation unit 409 selects an appropriate network model from the generation model recording unit 408, and generates a texture most suitable for the removal area as an interpolated image.
  • The image integration unit 410 fits the interpolated image generated by the interpolated image generation unit 409 into the removal area of the input image, integrates the result as a single image, and outputs it as the output image (a compositing sketch follows).
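A minimal compositing sketch is shown below; the feathered mask edge is an illustrative choice for reducing visible seams and is not required by the document.

```python
import numpy as np

def integrate(input_image: np.ndarray, interpolated: np.ndarray,
              removal_mask: np.ndarray, feather: int = 2) -> np.ndarray:
    """Composite the generated texture into the removal area.

    A soft (feathered) mask edge is used purely to reduce visible seams;
    the document only requires that the interpolated pixels replace the
    removal area.
    """
    mask = removal_mask.astype(np.float32)
    # Simple box blur of the mask edge (illustrative feathering).
    for _ in range(feather):
        mask = (mask
                + np.roll(mask, 1, 0) + np.roll(mask, -1, 0)
                + np.roll(mask, 1, 1) + np.roll(mask, -1, 1)) / 5.0
    mask = np.clip(mask, 0.0, 1.0)[..., None]            # (H, W, 1)
    out = (1.0 - mask) * input_image + mask * interpolated
    return out.astype(input_image.dtype)
```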
  • In the above description, the detection unit 200 is arranged within the image processing unit 105 together with the processing unit 400; however, the same effect can be obtained by arranging the entire detection unit 200 outside the image processing unit 105 and inputting the input image, with its removal area already specified, to the processing unit 400 arranged inside the image processing unit 105.
  • the flowchart of FIG. 6 shows an example of the processing procedure of the image processing unit 105 shown in FIG. In this example, the process of removing the image in the removal region from the input image in the removal unit 300 is omitted.
  • the processing procedure of the image processing unit 105 is roughly divided into two steps, a detection step and a processing step.
  • the detection step which is the process of the detection unit 200, will be described.
  • the detection step is composed of a change area determination step ST1, a removal area position identification step ST2, and a removal information type determination step ST3.
  • In the change area determination step ST1, the areas with and without change in the image are determined with reference to a group of input images. Information about one or more chapters assigned according to the content of the input image is referred to: a region that shows no change even across chapter boundaries is determined to be a steadily unchanging region, while a region in which changes are observed over time even within a chapter is determined to be a steadily changing region.
  • In the removal area position identification step ST2, the position of the area to be removed is identified from the input image and output as the removal position information.
  • In the removal information type determination step ST3, it is determined what kind of information the area to be removed contains (for example, whether it is a program logo or a telop, and whether it is urgent), and the result is output as the type information.
  • the processing step is composed of a segmentation processing step ST4, a learning data area designation step ST5, a learning data generation step ST6, a generation network learning step ST7, an interpolation image generation step ST8, and an image integration step ST9.
  • In the segmentation processing step ST4, segmentation processing is performed on the input image: after the image is divided into regions for each subject, each region is given a label indicating what the subject is, that is, its type.
  • In the learning data area designation step ST5, the removal information obtained in the removal area position identification step ST2 and the removal information type determination step ST3, and the area information and label information of each subject obtained in the segmentation processing step ST4, are referred to, and the area of data usable as training data is designated on the input image.
  • In the learning data generation step ST6, the training data is constructed by referring to the area of the input image, designated in the learning data area designation step ST5, that can be used for the training data. As described above, any of the following may be done: composing the training data only from this input image, reusing data widely acquired based on a predetermined label, or combining both.
  • In the generative network learning step ST7, the training data generated in the learning data generation step ST6 is read out, and the generative neural network used for the interpolation processing is trained.
  • In the interpolated image generation step ST8, an appropriate generative neural network trained in the generative network learning step ST7 is selected, and the texture most suitable for the removal region is generated as an interpolated image.
  • In the image integration step ST9, the interpolated image generated in the interpolated image generation step ST8 is fitted into the removal area of the input image, integrated as one image, and output as the output image. (An illustrative chain of these steps is sketched below.)
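To make the ordering of steps ST4 to ST9 concrete, the following sketch chains the illustrative helpers from the earlier sketches (same_label_patches, finetune_gan, integrate); segment and synthesize are hypothetical callables (a segmentation model and a step that turns the fine-tuned generator into a full-frame texture), and none of this is meant as the document's definitive implementation.

```python
import numpy as np
import torch

def processing_steps(frame: np.ndarray, removal_mask: np.ndarray,
                     segment, generator, discriminator, synthesize) -> np.ndarray:
    """Illustrative chain of steps ST4 to ST9 using the helpers sketched above.

    segment:    callable returning an integer label map for the frame (ST4).
    synthesize: callable turning the fine-tuned generator into a
                full-frame texture image for the removal area (ST8).
    """
    labels = segment(frame)                                     # ST4: segmentation
    patches = same_label_patches(frame, labels, removal_mask)   # ST5/ST6: training data
    patch_tensor = torch.from_numpy(np.ascontiguousarray(
        np.stack(patches).transpose(0, 3, 1, 2))).float() / 127.5 - 1.0
    finetune_gan(generator, discriminator, patch_tensor)        # ST7: train the GAN
    interpolated = synthesize(generator, frame.shape)           # ST8: generate texture
    return integrate(frame, interpolated, removal_mask)         # ST9: integrate
```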
  • As described above, the processing unit 400 uses a generative neural network to generate the interpolated image to be fitted into the removal region of the input image. Therefore, an interpolated image with reduced visual discomfort can be fitted into the removal region of the input image without using a special signal for interpolation.
  • Further, the generative network learning unit 406 of the processing unit 400 performs training starting from a generative neural network that has been pre-trained to generate images of the same type as the subject corresponding to the removal region. Therefore, the generative neural network that generates the interpolated image to be fitted into the removal region can be trained efficiently with a small amount of training data.
  • FIG. 7 shows another configuration example of the detection unit 200.
  • the detection unit 200A will be described.
  • the parts corresponding to those in FIG. 4 are designated by the same reference numerals, and detailed description thereof will be omitted as appropriate.
  • Like the detection unit 200 shown in FIG. 4, the detection unit 200A has the change area determination unit 201, the chapter information recording unit 202, the change information recording unit 203, the removal area position identification unit 204, the removal information type determination unit 205, and the removal information recording unit 206.
  • the removal area position identification unit 204 and the removal information type determination unit 205 can refer to external information.
  • The removal area position specifying unit 204 identifies the position of the area to be removed from the input image based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, and outputs it as the position information.
  • In addition, (1) having the viewer specify the area to be removed, (2) referring to information about the program being watched, and (3) referring to information about various events that occurred on that day also help to locate the area to be removed from the input image.
  • For example, regarding (1), when the viewer specifies that there is an area to be removed in the upper-left area of the image, it is possible to determine whether or not there is removable information in that area. Further, regarding (2), by acquiring, from information about the program being watched, the appearance and display position of the program logo or the insertion position and size of telops, it is possible to determine in which area of the screen there is information that can be a removal target.
  • Further, regarding (3), from information on events such as incidents, accidents, disasters, and elections that occurred on or around the day the program was broadcast, it is possible to predict whether breaking news, weather information, and the like will be displayed and at which position in the image such information is likely to appear. This makes it possible to accurately estimate the position of the removable region.
  • The removal information type determination unit 205 determines, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit 203, what kind of information each area contains, and outputs the result as the type information.
  • By also referring to the above-mentioned information (1) to (3), the type of information displayed in the removable area can be estimated with high accuracy.
  • FIG. 8 shows another configuration example of the processing unit 400.
  • the processing unit 400A will be described.
  • the parts corresponding to those in FIG. 5 are designated by the same reference numerals, and detailed description thereof will be omitted as appropriate.
  • Like the processing unit 400 shown in FIG. 5, this processing unit 400A has a segmentation processing unit 401, a segmentation information recording unit 402, a learning data area designation unit 403, a learning data generation unit 404, a learning data recording unit 405, a generative network learning unit 406, a setting parameter recording unit 407, a generative model recording unit 408, an interpolated image generation unit 409, and an image integration unit 410.
  • In addition, the processing unit 400A has an evaluation unit 411 that evaluates the output image, an evaluation information recording unit 412 that records the evaluation information, and a processing update unit 413 that determines, based on the evaluation information, whether or not to update the functions of the processing unit 400A.
  • the image integration unit 410 fits the interpolated image generated by the interpolation image generation unit 409 into the removal area of the input image, integrates it as a single image, and outputs it as an output image.
  • the evaluation unit 411 evaluates whether or not the output image is appropriate, and outputs the evaluation result as evaluation information.
  • Possible methods of evaluation include having the viewer directly and explicitly input the evaluation using a remote controller or a terminal, obtaining it by measuring the viewer's line of sight, emotion, or biological information, and inferring it from the viewer's voice information.
  • Based on the evaluation information, the processing update unit 413 determines how the functions of the segmentation processing unit 401, the learning data area designation unit 403, the learning data generation unit 404, the generative network learning unit 406, the setting parameter recording unit 407, and the like in the processing unit 400A should be updated, and then updates those functions if an update is determined to be necessary.
  • For the segmentation processing unit 401, it is conceivable to update the segmentation method and the number of classes. If a region that could not be classified by the previous segmentation method can now be recognized and labeled accurately, a texture with higher reproduction accuracy can be generated and fitted when interpolating that region.
  • For the learning data area designation unit 403, improving the accuracy of the area designated as training data makes it possible to avoid mixing mistakenly identified training data into training data composed of a specific label, and the learning performance in the generative network learning unit is improved.
  • For the learning data generation unit 404, it is conceivable to change among the means of composing the training data only from the input image, reusing data widely acquired based on a predetermined label, or combining both, or to correct the labels used. For the generative network learning unit 406, it is conceivable to update the network structure.
  • For the setting parameter recording unit 407, it is conceivable to update the parameters related to the type and specifications of the training data used in the learning data generation unit 404, and the parameters related to training, such as the number of training iterations and the learning rate, in the generative network learning unit 406.
  • FIG. 9 shows yet another configuration example of the processing unit 400.
  • Hereinafter, this configuration will be described as the processing unit 400B.
  • the parts corresponding to those in FIG. 8 are designated by the same reference numerals, and detailed description thereof will be omitted as appropriate.
  • the basic configuration of the processing unit 400B is the same as that of the processing unit 400A shown in FIG.
  • However, when the processing update unit 413 determines whether or not to update the functions in the processing unit 400B, not only the evaluation information but also external setting information is used.
  • As described above, the processing update unit 413 determines, based on the evaluation information, how each function in the processing unit 400A (400B) should be updated, and then updates the function if an update is determined to be necessary.
  • In the processing unit 400B, the processing update unit 413 further uses setting information that can be obtained from the outside, so that the functions of the segmentation processing unit 401, the learning data area designation unit 403, the learning data generation unit 404, the generative network learning unit 406, and the setting parameter recording unit 407 in the processing unit 400B can be updated in the same manner. For example, it is conceivable to use the setting information of another viewer's processing unit having the same functions as the processing unit 400B, together with that viewer's evaluation information. This makes it possible to make improvements efficiently.
  • the present technology can have the following configurations.
  • (1) An image processing apparatus including a processing unit that generates, using a generative neural network, an interpolated image to be fitted into a removal region of an input image, and fits the interpolated image into the removal region of the input image to obtain an output image.
  • (2) The image processing apparatus according to (1), wherein the processing unit includes: a segmentation processing unit that performs segmentation processing on the input image to obtain area information of each subject included in the image; a learning data area designation unit that designates, on the input image, a data area usable as the training data required for training by the generative neural network, based on the information of the removal area and the area information of each subject; a learning data generation unit that generates training data from the input image based on the data area; a generative network learning unit that trains the generative neural network using the training data; an interpolated image generation unit that generates, using the generative neural network, the interpolated image to be fitted into the removal region; and an image integration unit that fits the interpolated image into the removal region of the input image to obtain the output image.
  • (3) The image processing apparatus according to (2), wherein the generative network learning unit performs further training with the training data, starting from a generative neural network that has been pre-trained to generate an image of the same type as the subject corresponding to the removal region.
  • (4) The image processing apparatus according to any one of (1) to (3) above, further comprising a detection unit for obtaining information on the removal region included in the input image.
  • (5) The image processing apparatus according to (4), wherein the detection unit obtains the information on the removal region based on information from the outside.
  • (6) The image processing apparatus according to any one of (1) to (5) above, further comprising a processing update unit that updates the function of the processing unit based on evaluation information of the output image.
  • (7) The image processing apparatus according to (6), wherein the processing update unit further updates the function of the processing unit based on external setting information.
  • An image processing method comprising a procedure of generating an interpolated image to be fitted in a removal region of an input image using a generation neural network, and fitting the interpolated image into the removal region of the input image to obtain an output image.
  • A program that causes a computer to function as processing means that generates, using a generative neural network, an interpolated image to be fitted into the removal area of an input image, and fits the interpolated image into the removal area of the input image to obtain an output image.
  • A change area determination unit that determines, with reference to the input image, an area with change and an area without change in the image; a chapter information recording unit in which information about chapters of the input image is recorded; a change information recording unit that records information regarding the presence or absence of change in each region of the input image; a removal area position identification unit that identifies the position of the area to be removed from the input image based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, and outputs it as removal position information; and a removal information type determination unit that determines, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, what kind of information the area has, and outputs it as type information.
  • a segmentation processing unit that performs segmentation processing on the input image
  • a segmentation information recording unit that records the area information of the subject obtained by the segmentation process and the label information in association with each other.
  • A learning data area designation unit that designates, on the input image, the area of data that can be used as training data, in order to construct the training data necessary for training by the network that interpolates the removal area; and a learning data generation unit that generates training data by referring to the input image area, designated by the learning data area designation unit, that can be used for the training data.
  • A learning data recording unit that records the training data generated by the learning data generation unit; and a generative network learning unit that reads the training data from the learning data recording unit and trains a generative neural network used for interpolation processing.
  • A setting parameter recording unit that records parameters related to the type and specifications of the training data used in the learning data generation unit, and parameters related to training, such as the number of training iterations and the learning rate, in the generative network learning unit.
  • A generative model recording unit that records the network model generated by the generative network learning unit.
  • An interpolated image generator that selects an appropriate network model from the generated model recording unit and generates the texture that best fits the removal area as an interpolated image.
  • An image processing apparatus including an image integrating unit that fits the interpolated image generated by the interpolated image generation unit into a corresponding area of an input image, integrates the interpolated image as a single image, and outputs the output image.
  • The image processing apparatus according to (10) or (11), further comprising: an evaluation unit that evaluates the quality of the image output from the image integration unit; and a processing update unit that determines whether or not to update the functions based on the evaluation information obtained by the evaluation unit.
  • An image processing method comprising: a change area determination step of determining, with reference to the input image, an area with change and an area without change in the image; a removal area position identification step of identifying the position of the area to be removed from the input image based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, and outputting it as removal position information; a removal information type determination step of determining, based on the information regarding the presence or absence of change for each area recorded in the change information recording unit, what kind of information the area has, and outputting it as type information; a segmentation processing step of performing segmentation processing on the input image; a learning data area designation step of designating, on the input image, the area of data that can be used as training data, by referring to the removal information recorded in the removal information recording unit, in order to construct the training data necessary for training by the network that interpolates the removal area; and an image integration step of fitting an interpolated image generated by the interpolated image generation unit into the corresponding region of the input image, integrating it as a single image, and outputting it as an output image.

Abstract

An interpolated image is fitted well into an additional information removal area of an input image. The interpolated image to be fitted into the removal area of the input image is generated using a generative neural network, and the interpolated image is fitted into the removal area of the input image, thereby obtaining an output image. Since the interpolated image to be fitted into the removal area of the input image is generated using the generative neural network, an interpolated image with reduced visual discomfort can be fitted into the removal area of the input image without using a special signal for the interpolation.

Description

Image processing apparatus, image processing method, and program
The present technology relates to an image processing apparatus, an image processing method, and a program, and more particularly to an image processing apparatus for fitting an interpolated image into an additional information removal area of an input image.
In an image, not only the image information itself but also additional information such as logos, telops, and graphics is superimposed and displayed to provide the viewer with explanations and supplementary information. Examples include the name of a person superimposed on the image, the location of a landscape, and the logo mark of a program or a producer. In addition, information requiring immediacy, such as news and weather information, may be superimposed and displayed regardless of the image.
Techniques for removing such additional information superimposed and displayed in an image have been provided conventionally. For example, Patent Document 1 discloses a processing device that detects a telop by detecting the pixels constituting its characters, removes those pixels, and then interpolates them either with the information of other pixels in the same frame (spatial interpolation) or with the information of the same pixels in another frame (temporal interpolation).
Further, Patent Document 2 discloses a telop erasing method in which, based on the fineness or roughness of the texture in the vicinity of the telop, the pixels in the telop region are interpolated by propagating pixel values from outside the region into it or by repeatedly copying several nearby pixels. Further, Patent Document 3 discloses a video/audio signal recording device that erases and interpolates a telop display inserted into a video signal by using an interpolation signal containing the original video signal information without the telop.
Japanese Unexamined Patent Publication No. 2014-212434; Japanese Unexamined Patent Publication No. 2006-148263; Japanese Unexamined Patent Publication No. 9-200684
In the invention shown in Patent Document 1, when the pixel group of a telop or graphics that is always displayed in the image is to be removed, that pixel group does not change over time, so temporal interpolation cannot be performed and spatial interpolation must be applied. However, Patent Document 1 does not specifically describe what kind of spatial interpolation is performed.
In the invention shown in Patent Document 2, interpolation uses only the pixel values in the vicinity of the region determined to be a telop, so what was originally displayed in that region is not considered. Only information with similar pixel values is interpolated, and the interpolated content itself becomes unrelated to what was originally displayed. Moreover, depending on the pattern in the vicinity of the region to be interpolated, an unnatural pattern may appear due to the propagation of pixel values or repeated copying.
In the invention shown in Patent Document 3, a special signal called an interpolation signal is required to remove the telop, but such an interpolation signal usually does not exist, so this approach is not realistic.
The purpose of the present technology is to fit an interpolated image well into the additional information removal area of an input image.
The concept of the present technology lies in an image processing apparatus including a processing unit that generates, using a generative neural network, an interpolated image to be fitted into a removal region of an input image, and fits the interpolated image into the removal region of the input image to obtain an output image.
In the present technology, the processing unit generates an interpolated image to be fitted into the removal area of the input image using a generative neural network, and fits this interpolated image into the removal area of the input image to obtain an output image.
For example, the processing unit may include: a segmentation processing unit that performs segmentation processing on the input image to obtain area information of each subject included in the image; a learning data area designation unit that designates, on the input image, a data area usable as the training data required for training the generative neural network, based on the information of the removal area and the area information of each subject; a learning data generation unit that generates training data from the input image based on this data area; a generative network learning unit that trains the generative neural network using this training data; an interpolated image generation unit that generates, using the generative neural network, the interpolated image to be fitted into the removal area; and an image integration unit that fits the interpolated image into the removal area of the input image to obtain the output image.
In this case, for example, the generative network learning unit may perform further training with the training data, starting from a generative neural network that has been pre-trained to generate images of the same type as the subject corresponding to the removal region. This makes it possible to efficiently train, with a small amount of training data, the generative neural network that generates the interpolated image to be fitted into the removal region.
As described above, in the present technology an interpolated image to be fitted into the removal region of the input image is generated using a generative neural network. Therefore, an interpolated image with reduced visual discomfort can be fitted into the removal region of the input image without using a special signal for interpolation.
In the present technology, for example, a detection unit that obtains information on the removal region included in the input image may further be provided. In this case, for example, the detection unit may obtain the information on the removal region based on information from the outside. This makes it possible to improve the accuracy of the removal area information.
Further, in the present technology, for example, a processing update unit that updates the functions of the processing unit based on evaluation information of the output image may further be provided. This makes it possible to improve the interpolated image generated by the processing unit so that it becomes more appropriate. In this case, for example, the processing update unit may further update the functions of the processing unit based on external setting information. This makes it possible to make improvements efficiently.
実施の形態としてのテレビ受信装置の構成例を示すブロック図である。It is a block diagram which shows the structural example of the television receiving apparatus as an embodiment. 画像処理部の構成例を示すブロック図である。It is a block diagram which shows the structural example of the image processing part. 画像処理部の各部における画像状態の一例を示す図である。It is a figure which shows an example of the image state in each part of an image processing part. 検出部の構成例を示すブロック図である。It is a block diagram which shows the structural example of the detection part. 処理部の構成例を示すブロック図である。It is a block diagram which shows the structural example of a processing part. 画像処理部の処理手順の一例を示すフローチャートである。It is a flowchart which shows an example of the processing procedure of an image processing unit. 検出部の他の構成例を示すブロック図である。It is a block diagram which shows the other structural example of a detection part. 処理部の他の構成例を示すブロック図である。It is a block diagram which shows the other structural example of a processing part. 処理部の他の構成例を示すブロック図である。It is a block diagram which shows the other structural example of a processing part.
 以下、発明を実施するための形態(以下、「実施の形態」とする)について説明する。なお、説明は以下の順序で行う。
 1.実施の形態
 2.変形例
Hereinafter, embodiments for carrying out the invention (hereinafter referred to as “embodiments”) will be described. The explanation will be given in the following order.
1. 1. Embodiment 2. Modification example
 <1.実施の形態>
 [テレビ受信装置の構成]
 図1は、実施の形態としてのテレビ受信装置10の構成例を示している。このテレビ受信装置10は、受信アンテナ101と、デジタル放送受信部102と、表示部103と、記録/再生部104と、画像処理部105と、CPU106と、ユーザ操作部107を有している。
<1. Embodiment>
[TV receiver configuration]
FIG. 1 shows a configuration example of the television receiving device 10 as an embodiment. The television receiving device 10 includes a receiving antenna 101, a digital broadcast receiving unit 102, a display unit 103, a recording / reproducing unit 104, an image processing unit 105, a CPU 106, and a user operation unit 107.
 CPU106は、テレビ受信装置10の各部の動作を制御する。ユーザは、ユーザ操作部107により種々の操作入力を行うことができる。このユーザ操作部107は、リモートコントロール部、近接/タッチにより操作入力を行うタッチパネル部、マウス、キーボード、カメラで操作入力を検出するジェスチャ入力部、音声で操作する音声入力部などである。 The CPU 106 controls the operation of each part of the television receiving device 10. The user can perform various operation inputs by the user operation unit 107. The user operation unit 107 includes a remote control unit, a touch panel unit that performs operation input by proximity / touch, a mouse, a keyboard, a gesture input unit that detects operation input with a camera, a voice input unit that is operated by voice, and the like.
 デジタル放送受信部102は、受信アンテナ101から入力されたテレビ放送信号を処理して、放送コンテンツに係る画像信号を得る。記録/再生部104は、デジタル放送受信部102でえられた画像信号を記録し、適宜なタイミングで再生する。表示部103は、デジタル放送受信部102で得られた画像信号、あるいは記録/再生部104で再生された画像信号に基づいて、画像を表示する。 The digital broadcast receiving unit 102 processes the television broadcast signal input from the receiving antenna 101 to obtain an image signal related to the broadcast content. The recording / reproducing unit 104 records the image signal obtained by the digital broadcasting receiving unit 102 and reproduces it at an appropriate timing. The display unit 103 displays an image based on the image signal obtained by the digital broadcast receiving unit 102 or the image signal reproduced by the recording / reproducing unit 104.
 画像処理部105は、記録/再生部104で再生された画像信号に対して画像処理を行い、処理後の画像信号を記録/再生部104に戻して処理後の画像信号として記録させる。画像処理部105は、入力画像から例えばロゴ、テロップ、グラフィクスなどの付加的情報の領域を除去領域として検出し、生成系ニューラルネットワークを用いてその除去領域に嵌め込む補間画像を生成し、この補間画像を入力画像の除去領域に嵌め込んで出力画像を得る、という処理をする。 The image processing unit 105 performs image processing on the image signal reproduced by the recording / reproducing unit 104, returns the processed image signal to the recording / reproducing unit 104, and causes the image signal to be recorded as the processed image signal. The image processing unit 105 detects an area of additional information such as a logo, telop, and graphics from the input image as a removal area, generates an interpolation image to be fitted in the removal area using a generation neural network, and performs this interpolation. The process of fitting the image into the removal area of the input image to obtain the output image is performed.
The processing of the image processing unit 105 is performed together with the reproduction of the unprocessed image signal and the recording of the processed image signal in the recording/reproducing unit 104. At this time, the image based on the reproduced image signal may or may not be displayed on the display unit 103. When it is displayed, the user can selectively designate, by operation, the additional information to be removed. It is also conceivable that the image processing unit 105 processes the received image signal obtained by the digital broadcast receiving unit 102 in real time.
FIG. 2 shows a configuration example of the image processing unit 105. The image processing unit 105 has a detection unit 200, a removal unit 300, and a processing unit 400. The input image signal is supplied to the detection unit 200 and the removal unit 300. The detection unit 200 obtains information on the removal area included in the input image (for example, an area of additional information such as a logo, telop, or graphics). This removal area information is supplied to the removal unit 300 and the processing unit 400.
The removal unit 300 refers to the removal area information and obtains an image in which the image of the removal area has been removed from the input image. The image signal output from the removal unit 300 is supplied to the processing unit 400. The processing unit 400 performs segmentation processing on the input image to obtain area and type information for each subject contained in the image, refers to this information and to the removal area information, generates an interpolated image to be fitted into the removal area using a generative neural network, and fits that interpolated image into the removal area of the input image to obtain an output image. The output image signal is output from the processing unit 400.
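As a rough illustration of this fitting operation, the following is a minimal sketch assuming the input image, the generated patch, and the removal area are held as NumPy arrays; the function name and data layout are illustrative assumptions, not part of the embodiment.

```python
import numpy as np

def composite_interpolation(input_image, interpolated_patch, removal_mask):
    """Fit an interpolated patch into the removal area of the input image.

    input_image:        H x W x C source frame
    interpolated_patch: H x W x C image produced by the generative network
    removal_mask:       H x W boolean array, True inside the removal area
    """
    output = input_image.copy()
    # Pixels inside the removal area come from the generated patch;
    # all other pixels are passed through unchanged.
    output[removal_mask] = interpolated_patch[removal_mask]
    return output
```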
FIG. 3 shows an example of the image state in each part of the image processing unit 105 shown in FIG. 2. FIG. 3(a) shows the input image of the detection unit 200. As shown in FIG. 3(b), the detection unit 200 detects the program logo present in the upper left of the input image, and the area of the logo is set as the removal area. Then, as shown in FIG. 3(c), the removal unit 300 obtains an image in which the image of the removal area has been removed from the input image.
In the processing unit 400, as shown in FIG. 3(d), segmentation processing is performed on the input image to obtain area and type information for each subject contained in the image. In this case, the input image is divided into an area for each subject, and each area is given a label indicating its type. In the illustrated example, it is divided into bridge, sky, mountain, and water areas.
The processing unit 400 also refers to the area and type information of each subject and to the removal area information and, as shown in FIG. 3(e), generates the interpolated image to be fitted into the removal area using the generative neural network.
In this case, since the removal area lies within the bridge area, the processing unit 400 judges that it is desirable to interpolate bridge information into the removal area, and generates a plausible bridge texture as the interpolated image using a generative neural network that has learned bridge information in advance. FIG. 3(f) shows the output image obtained by fitting the interpolated image obtained by the processing unit 400 into the removal area of the input image.
Since it is impossible to know what the original image looked like, a perfectly correct image cannot be generated and fitted into the removal area. However, instead of directly copying information from surrounding pixels or from preceding and following frames as in conventional methods, a pattern is generated in consideration of the meaning within the image, so unnaturalness and visual discomfort can be greatly suppressed.
Here, an example of the processing policy of the detection unit 200 is shown below.
- Detect target areas such as telops and graphics.
- Detect telops and graphics that are displayed constantly in a specific area (the upper left, upper right, or lower part of the image) or displayed non-constantly over the entire screen.
- Compare images across multiple chapters; an on-screen area that does not change even across chapters is judged to be, for example, a program logo that is constantly displayed in that area.
- The presence or absence of a telop may be determined from the difference between preceding and following frames (a sketch of such a test follows this list).
- For a broadcast program, analyzing past broadcasts makes it possible to predict in which areas telops and graphics will appear, so that analysis result may also be used.
- The viewer's setting information may also be referenced, such as "remove only the program logo without erasing telops" or "remove telops and the program logo, but not graphics".
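As one possible realization of the frame-difference test mentioned above, the following is a minimal sketch; the grayscale NumPy frames, the region format, and the threshold value are assumptions.

```python
import numpy as np

def region_changed(prev_frame, curr_frame, region, threshold=8.0):
    """Return True if the region (y0, y1, x0, x1) differs noticeably between
    two consecutive grayscale frames, suggesting that a telop appeared,
    disappeared, or was updated there."""
    y0, y1, x0, x1 = region
    diff = np.abs(curr_frame[y0:y1, x0:x1].astype(np.float32)
                  - prev_frame[y0:y1, x0:x1].astype(np.float32))
    return float(diff.mean()) > threshold
```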
An example of the processing policy of the processing unit 400 is shown below.
- Fit a new texture or similar content into the area from which the telop or other information has been removed.
- Generate a plausible pattern using a generative neural network and fit it in.
- When interpolating, do not directly use temporal information (information from preceding and following frames) or spatial information (information from neighboring or other areas of the same frame) in the area to be filled.
- For training the generative neural network, use other areas of the image being processed that are not the interpolation target as training data, because areas within the same frame are likely to contain spatially similar information.
- The segmentation information of the scene may be used to estimate which texture the area to be interpolated is most likely to contain, and the generative network or its training data may be selected according to the result (a sketch of such a selection follows this list).
- The segmentation information of the scene may also be used so that areas having the same segmentation result as the area to be interpolated are used as training data. Since learning is then based on information more strongly correlated with the interpolation area, a more plausible pattern is likely to be generated.
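The label-driven selection described in the last two items could look roughly like the following sketch; the registry dictionaries, the "generic" fallback key, and the label names are illustrative assumptions.

```python
def select_generator_and_data(removal_label, model_registry, data_registry):
    """Pick the generative model and the training data collection that
    correspond to the segmentation label of the area to be interpolated
    (e.g. 'bridge'). Falls back to a generic entry for unknown labels."""
    model = model_registry.get(removal_label, model_registry["generic"])
    data = data_registry.get(removal_label, data_registry["generic"])
    return model, data
```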
Note that the removal unit 300 is not strictly necessary in the image processing unit 105 shown in FIG. 2, because the image in the removal area of the input image is substantially removed when the processing unit 400 fits the interpolated image into that area.
Part or all of the processing of each part of the image processing unit 105 can also be performed by software processing on a computer.
FIG. 4 shows a configuration example of the detection unit 200. The detection unit 200 has a change area determination unit 201, a chapter information recording unit 202, a change information recording unit 203, a removal area position identification unit 204, a removal information type determination unit 205, and a removal information recording unit 206. Here, the removal area position identification unit 204 and the removal information type determination unit 205 constitute a removal area determination unit.
The change area determination unit 201 refers to a group of input images and determines which areas in the image change and which do not. For this determination, it refers to information on one or more chapters assigned according to the content of the input images: an area that does not change even across chapters is judged to be a constantly unchanging area, while an area that changes over time even within a chapter is judged to be a constantly changing area.
For example, when detecting a program logo, the following measures can be taken.
- A logo is usually displayed throughout the program, and its area does not change even when the chapter changes. An area that does not change across chapters is therefore judged to be a logo (constantly displayed information).
- For a specific program, the area in which the logo is displayed can be identified almost uniquely, so the position of the logo can be determined by obtaining information about the logo from outside.
Also, for example, when detecting a telop (explanatory information), the following measures can be taken.
- For a telop, whether it is displayed and what it shows change over time depending on the displayed content. However, considering its position, for example the lower right of the screen, it can be judged that highly urgent information is not being displayed there, and therefore that non-urgent, non-stationary information is being displayed.
- For a specific program, the area in which telops are displayed can be identified almost uniquely, so the position of the explanatory information can be determined by obtaining information about the telops from outside.
The chapter information recording unit 202 records information about chapters. The change information recording unit 203 records the information, determined by the change area determination unit 201, on the presence or absence of change in each area of the input image.
The removal area position identification unit 204 identifies the position of the area to be removed from the input image based on the per-area change information recorded in the change information recording unit 203, and outputs it as removal area position information. The removal information type determination unit 205 determines, based on the same per-area change information, what kind of information the area contains (for example, whether it is a program logo or a telop (explanatory information), and whether or not it is urgent), and outputs the result as type information of the removal information.
The removal information recording unit 206 records the removal area information of the input image; that is, it integrates and records the removal area position information and the removal information type information. The detection unit 200 outputs the removal area information recorded in the removal information recording unit 206 as its output signal.
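As a minimal sketch of the chapter-based change determination performed by the change area determination unit 201, the following compares one representative grayscale frame per chapter; the per-pixel deviation test and the threshold are assumptions.

```python
import numpy as np

def stationary_region_mask(chapter_frames, threshold=2.0):
    """Mark pixels whose value barely changes across chapters.

    chapter_frames: list of H x W grayscale frames, one representative
                    frame per chapter.
    Returns an H x W boolean array where True marks a stationary pixel,
    i.e. a candidate for constantly displayed information such as a logo.
    """
    stack = np.stack([f.astype(np.float32) for f in chapter_frames], axis=0)
    per_pixel_std = stack.std(axis=0)
    return per_pixel_std < threshold
```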
FIG. 5 shows a configuration example of the processing unit 400. The processing unit 400 has a segmentation processing unit 401, a segmentation information recording unit 402, a learning data area designation unit 403, a learning data generation unit 404, a learning data recording unit 405, a generative network learning unit 406, a setting parameter recording unit 407, a generative model recording unit 408, an interpolated image generation unit 409, and an image integration unit 410. Here, the learning data generation unit 404 and the generative network learning unit 406 constitute a learning unit, and the interpolated image generation unit 409 and the image integration unit 410 constitute an interpolation processing unit.
The segmentation processing unit 401 performs segmentation processing on the input image. It divides the image into an area for each subject and then assigns to each area a label indicating what the subject is, that is, the type of the subject. The segmentation information recording unit 402 records the subject area information obtained by the segmentation processing in association with the label information.
The learning data area designation unit 403 refers to the removal area information supplied from (or referenced in) the detection unit 200, as well as to the subject area information and label information obtained by the segmentation processing described above, and designates on the input image the data areas that can be used as training data, in order to construct the training data needed for training the network that interpolates the removal area.
Specifically, in the example shown in FIG. 3, it is identified that the segment containing the removal area is a bridge, and the entire area judged to be a bridge on the input image is designated as data to be used for constructing the training data. Since this allows the very image that contains the removal area to be used as training data, textures can be generated with extremely high fidelity. The data need not come from a still image: by using information from consecutive frames of a moving image, the amount of training data increases and textures can be generated with even higher fidelity.
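A minimal sketch of collecting training patches from the areas of the frame that share the segmentation label of the removal area is shown below; the patch size, patch count, and attempt limit are assumptions.

```python
import numpy as np

def sample_training_patches(frame, label_map, target_label, removal_mask,
                            patch_size=64, num_patches=200, rng=None):
    """Collect square patches that lie entirely inside the target
    segmentation label (e.g. the 'bridge' label) and outside the removal
    area, for use as training data. Assumes the frame is larger than the
    patch size."""
    rng = rng or np.random.default_rng()
    h, w = label_map.shape
    patches, attempts = [], 0
    while len(patches) < num_patches and attempts < num_patches * 50:
        attempts += 1
        y = int(rng.integers(0, h - patch_size))
        x = int(rng.integers(0, w - patch_size))
        lbl = label_map[y:y + patch_size, x:x + patch_size]
        msk = removal_mask[y:y + patch_size, x:x + patch_size]
        # Accept only patches fully covered by the target label and not
        # overlapping the removal area.
        if (lbl == target_label).all() and not msk.any():
            patches.append(frame[y:y + patch_size, x:x + patch_size])
    return patches
```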
The learning data generation unit 404 constructs the training data by referring to the areas of the input image designated as usable for training data by the learning data area designation unit 403. The training data may be constructed only from this input image, may reuse data collected widely on the basis of a given label (for example, "bridge"), or may combine both.
The learning data recording unit 405 records the training data generated by the learning data generation unit 404. The generative network learning unit 406 reads the training data from the learning data recording unit 405 and trains the generative neural network used for the interpolation processing. In this case, for example, a generative neural network trained in advance to generate images of the same type as the subject corresponding to the removal area may be used as a base and further trained with this training data. This makes it possible to train the generative neural network that produces the interpolated image to be fitted into the removal area efficiently, with a small amount of training data.
A generative adversarial network (GAN) is a well-known representative example of such a generative neural network. Although a detailed description is omitted, a GAN has a structure in which a generator and a discriminator within the network compete with each other adversarially, and the performance of the generator can be improved by making the generator and the discriminator compete repeatedly. In the present technology, both the generator and the discriminator need to be trained. However, it is sufficient to train the generator in advance so that it can generate many patterns, and to carry out additional training only to specialize it to the input image.
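A highly simplified sketch of such additional adversarial training is shown below, written with PyTorch under the assumption that pre-trained generator and discriminator modules (the generator exposing a latent_dim attribute) and a data loader of real patches already exist; it is not the actual training procedure of the embodiment.

```python
import torch
import torch.nn.functional as F

def finetune_gan(generator, discriminator, patch_loader,
                 steps=500, lr=1e-4, device="cpu"):
    """Adversarially fine-tune a pre-trained generator on patches taken
    from the current input image so that it specializes in the textures
    around the removal area."""
    g_opt = torch.optim.Adam(generator.parameters(), lr=lr)
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=lr)
    data_iter = iter(patch_loader)
    for _ in range(steps):
        try:
            real = next(data_iter).to(device)
        except StopIteration:
            data_iter = iter(patch_loader)
            real = next(data_iter).to(device)

        noise = torch.randn(real.size(0), generator.latent_dim, device=device)
        fake = generator(noise)

        # Discriminator step: distinguish real patches from generated ones.
        d_real = discriminator(real)
        d_fake = discriminator(fake.detach())
        d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
                  + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
        d_opt.zero_grad()
        d_loss.backward()
        d_opt.step()

        # Generator step: try to make generated patches look real.
        g_out = discriminator(fake)
        g_loss = F.binary_cross_entropy_with_logits(g_out, torch.ones_like(g_out))
        g_opt.zero_grad()
        g_loss.backward()
        g_opt.step()
    return generator
```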
The setting parameter recording unit 407 records parameters relating to the type and specifications of the training data used by the learning data generation unit 404, and learning-related parameters such as the number of training iterations and the learning rate used by the generative network learning unit 406.
The generative model recording unit 408 records the network models generated by the generative network learning unit 406; a plurality of network models may be recorded. The interpolated image generation unit 409 selects an appropriate network model from the generative model recording unit 408 and generates, as the interpolated image, the texture that best fits the removal area. The image integration unit 410 fits the interpolated image generated by the interpolated image generation unit 409 into the removal area of the input image, integrates them into a single image, and outputs it as the output image.
In the above description, the detection unit 200 is arranged inside the image processing unit 105 together with the processing unit 400, but the same effect can be obtained even if the entire detection unit 200 is arranged outside the image processing unit 105 and the input image in which the removal area has been identified is supplied to the processing unit 400 arranged inside the image processing unit 105.
The flowchart of FIG. 6 shows an example of the processing procedure of the image processing unit 105 shown in FIG. 2. In this example, the processing in the removal unit 300 for removing the image of the removal area from the input image is omitted. The processing procedure of the image processing unit 105 is roughly divided into two steps: a detection step and a processing step.
First, the detection step, which is the processing of the detection unit 200, will be described. The detection step consists of a change area determination step ST1, a removal area position identification step ST2, and a removal information type determination step ST3.
In the change area determination step ST1, a group of input images is referenced to determine which areas in the image change and which do not. For this determination, information on one or more chapters assigned according to the content of the input images is referenced: an area that does not change even across chapters is judged to be a constantly unchanging area, while an area that changes over time even within a chapter is judged to be a constantly changing area.
In the removal area position identification step ST2, the position of the area to be removed is identified from the input image and output as removal position information. In the removal information type determination step ST3, it is determined what kind of information the area to be removed contains (for example, whether it is a program logo or a telop, and whether or not it is urgent), and the result is output as type information.
Next, the processing step, which is the processing of the processing unit 400, will be described. The processing step consists of a segmentation processing step ST4, a learning data area designation step ST5, a learning data generation step ST6, a generative network learning step ST7, an interpolated image generation step ST8, and an image integration step ST9.
In the segmentation processing step ST4, segmentation processing is performed on the input image. The image is divided into an area for each subject, and each area is given a label indicating what the subject is, that is, its type.
In the learning data area designation step ST5, the removal information obtained in the removal area position identification step ST2 and the removal information type determination step ST3, and the area information and label information of each subject obtained in the segmentation processing step ST4, are referenced, and the data areas that can be used as training data are designated on the input image in order to construct the training data needed for training the network that interpolates the removal area.
In the learning data generation step ST6, the training data is constructed by referring to the areas of the input image designated as usable for training data in the learning data area designation step ST5. The training data may be constructed only from this input image, may reuse data collected widely on the basis of a given label, or may combine both.
In the generative network learning step ST7, the training data generated in the learning data generation step ST6 is read out, and the generative neural network used for the interpolation processing is trained. In the interpolated image generation step ST8, an appropriate generative neural network trained in the generative network learning step ST7 is selected, and the texture that best fits the removal area is generated as the interpolated image. In the image integration step ST9, the interpolated image generated in the interpolated image generation step ST8 is fitted into the removal area of the input image, integrated into a single image, and output as the output image.
As described above, in the image processing unit 105 shown in FIG. 2, the processing unit 400 generates the interpolated image to be fitted into the removal area of the input image using a generative neural network. It is therefore possible to fit into the removal area of the input image an interpolated image with reduced visual discomfort, without using any special signal for interpolation.
Also, in the image processing unit 105 shown in FIG. 2, the generative network learning unit 406 of the processing unit 400 further trains a generative neural network that has been trained in advance to generate images of the same type as the subject corresponding to the removal area. The generative neural network that produces the interpolated image to be fitted into the removal area can therefore be trained efficiently, with a small amount of training data.
<2. Modification examples>
[Another configuration example of the detection unit]
FIG. 7 shows another configuration example of the detection unit 200, described here as a detection unit 200A. In FIG. 7, parts corresponding to those in FIG. 4 are given the same reference numerals, and detailed descriptions are omitted as appropriate. Like the detection unit 200 shown in FIG. 4, the detection unit 200A has a change area determination unit 201, a chapter information recording unit 202, a change information recording unit 203, a removal area position identification unit 204, a removal information type determination unit 205, and a removal information recording unit 206.
In the detection unit 200A, the removal area position identification unit 204 and the removal information type determination unit 205 can refer to external information. As described above, the removal area position identification unit 204 identifies the position of the area to be removed from the input image based on the per-area change information recorded in the change information recording unit 203, and outputs it as removal position information.
Besides referencing the input image, the following are also useful for identifying the position of the area to be removed from the input image: (1) having the viewer designate the area to be removed, (2) referencing information about the program being watched, and (3) referencing information about events that occurred on that day.
For example, regarding (1), if the viewer specifies that there is an area to be removed in the upper-left part of the image, it becomes possible to judge whether information that could be a removal target exists in that area. Regarding (2), by obtaining, from information about the program being watched, the appearance and display position of the program logo or the insertion position and size of telops, it becomes possible to judge in which areas of the screen information that could be a removal target exists.
Regarding (3), from information about incidents, accidents, disasters, elections, and other events that occurred on or around the day the program was broadcast, it can be predicted that breaking news, weather information, and the like will be displayed, and the positions on the image where such information is likely to appear can be predicted. This makes it possible to estimate the position of a potential removal area with high accuracy.
As described above, the removal information type determination unit 205 determines, based on the per-area change information recorded in the change information recording unit 203, what kind of information each area contains, and outputs the result as type information. By using the information of (1) to (3) above, the type of information displayed in a potential removal area can also be estimated with high accuracy.
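A small sketch of how such external hints might be combined with the detected candidate areas is shown below; the box format (y0, y1, x0, x1) and the merging rules are assumptions.

```python
def refine_removal_candidates(detected_regions, viewer_region=None,
                              program_logo_region=None):
    """Combine detected candidate regions (a list of (y0, y1, x0, x1) boxes)
    with external hints: a known program-logo box is always added as a
    candidate, and a viewer-designated box restricts the candidates to
    those overlapping it."""
    def overlaps(a, b):
        return not (a[1] <= b[0] or b[1] <= a[0] or a[3] <= b[2] or b[3] <= a[2])

    candidates = list(detected_regions)
    if program_logo_region is not None and program_logo_region not in candidates:
        candidates.append(program_logo_region)
    if viewer_region is not None:
        candidates = [r for r in candidates if overlaps(r, viewer_region)]
    return candidates
```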
[Another configuration example of the processing unit]
FIG. 8 shows another configuration example of the processing unit 400, described here as a processing unit 400A. In FIG. 8, parts corresponding to those in FIG. 5 are given the same reference numerals, and detailed descriptions are omitted as appropriate. Like the processing unit 400 shown in FIG. 5, the processing unit 400A has a segmentation processing unit 401, a segmentation information recording unit 402, a learning data area designation unit 403, a learning data generation unit 404, a learning data recording unit 405, a generative network learning unit 406, a setting parameter recording unit 407, a generative model recording unit 408, an interpolated image generation unit 409, and an image integration unit 410.
The processing unit 400A further has an evaluation unit 411 that evaluates the output image, an evaluation information recording unit 412 that records the evaluation information, and a processing update unit 413 that judges, based on the evaluation information, whether or not to update the functions of the processing unit 400A.
As described above, the image integration unit 410 fits the interpolated image generated by the interpolated image generation unit 409 into the removal area of the input image, integrates them into a single image, and outputs it as the output image. The evaluation unit 411 evaluates whether or not this output image is appropriate, and outputs the evaluation result as evaluation information.
The evaluation may, for example, be input directly and explicitly by the viewer through a remote control or a terminal, be obtained by measuring the viewer's gaze, emotion, or biological information, or be inferred from the viewer's voice.
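For example, heterogeneous feedback of this kind might be fused into a single normalized score along the following lines; the input signals, the weighting, and the neutral default are assumptions.

```python
def normalize_evaluation(remote_rating=None, gaze_on_region_ratio=None,
                         negative_utterance=False):
    """Fuse heterogeneous viewer feedback into a single score in [0, 1].

    remote_rating:         e.g. a 1-5 rating entered on a remote control
    gaze_on_region_ratio:  fraction of viewing time spent staring at the
                           interpolated area (staring suggests it stands out)
    negative_utterance:    True if a complaint was detected in the voice
    """
    scores = []
    if remote_rating is not None:
        scores.append((remote_rating - 1) / 4.0)
    if gaze_on_region_ratio is not None:
        scores.append(1.0 - min(max(gaze_on_region_ratio, 0.0), 1.0))
    if negative_utterance:
        scores.append(0.0)
    return sum(scores) / len(scores) if scores else 0.5  # 0.5 = neutral default
```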
Based on the obtained evaluation information, the processing update unit 413 judges how the functions of the segmentation processing unit 401, the learning data area designation unit 403, the learning data generation unit 404, the generative network learning unit 406, the setting parameter recording unit 407, and so on in the processing unit 400A should be updated, and updates them when it judges that an update is necessary.
For example, for the segmentation processing unit 401, the segmentation method or the number of classes may be updated. If areas that could not be classified by the conventional segmentation method can be recognized and labeled accurately, it becomes possible to generate and fit textures with even higher fidelity when interpolating those areas.
For the learning data area designation unit 403, updating the segmentation method or the number of classes improves the accuracy of the areas designated as training data, prevents misidentified samples from being mixed into training data composed of a specific label, and improves the learning performance of the generative network learning unit.
For the learning data generation unit 404, the choice between constructing the training data only from the input image, reusing data collected widely on the basis of a given label, or combining both may be revised. For the generative network learning unit 406, the structure of the network may be updated.
For the setting parameter recording unit 407, the parameters relating to the type and specifications of the training data used by the learning data generation unit 404 may be updated, as may the learning-related parameters such as the number of training iterations and the learning rate in the generative network learning unit 406.
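A minimal sketch of such an evaluation-driven parameter update is shown below; the score scale, the threshold, and the adjustment rules are assumptions.

```python
def update_learning_parameters(params, evaluation_score, threshold=0.6):
    """Adjust the recorded learning parameters when the viewer's evaluation
    of the output image falls below a threshold. evaluation_score is assumed
    to be normalized to [0, 1]."""
    updated = dict(params)
    if evaluation_score < threshold:
        # Train longer and more cautiously on the next attempt, and allow
        # externally collected data of the same label to be mixed in.
        updated["num_steps"] = int(params.get("num_steps", 500) * 1.5)
        updated["learning_rate"] = params.get("learning_rate", 1e-4) * 0.5
        updated["use_external_data"] = True
    return updated
```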
By having the viewer evaluate the processed image interpolated in this way, the validity of the generative network used for interpolation and of its training data can be evaluated and continuously improved.
[Yet another configuration example of the processing unit]
FIG. 9 shows yet another configuration example of the processing unit 400, described here as a processing unit 400B. In FIG. 9, parts corresponding to those in FIG. 8 are given the same reference numerals, and detailed descriptions are omitted as appropriate. The basic configuration of the processing unit 400B is the same as that of the processing unit 400A shown in FIG. 8. In the processing unit 400B, when the processing update unit 413 judges whether or not to update the functions of the processing unit 400B, it uses not only the evaluation information but also external setting information.
As described for FIG. 8 above, the processing update unit 413 judges, based on the evaluation information, how each function of the processing unit 400A (400B) should be updated, and updates the functions when it judges that an update is necessary.
In the processing unit 400B, the processing update unit 413 can likewise update the functions of the segmentation processing unit 401, the learning data area designation unit 403, the learning data generation unit 404, the generative network learning unit 406, and the setting parameter recording unit 407 by additionally using setting information available from outside. For example, the setting information of the processing units of other viewers having the same functions as the processing unit 400B, or the evaluation information of other viewers, may be used. This makes it possible to carry out improvements efficiently.
Although preferred embodiments of the present disclosure have been described in detail with reference to the accompanying drawings, the technical scope of the present disclosure is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field of the present disclosure can conceive of various changes and modifications within the scope of the technical ideas described in the claims, and it is understood that these also naturally belong to the technical scope of the present disclosure.
The effects described in this specification are merely explanatory or illustrative, not limiting. That is, the technology according to the present disclosure may exhibit other effects that are apparent to those skilled in the art from the description of this specification, in addition to or instead of the above effects.
The present technology can also have the following configurations.
(1) An image processing apparatus including a processing unit that generates an interpolated image to be fitted into a removal area of an input image using a generative neural network, and fits the interpolated image into the removal area of the input image to obtain an output image.
(2) The image processing apparatus according to (1), in which the processing unit includes:
a segmentation processing unit that performs segmentation processing on the input image to obtain area information of each subject included in the image;
a learning data area designation unit that designates, on the input image, a data area that can be used as training data needed for training the generative neural network, based on the information of the removal area and the area information of each subject;
a learning data generation unit that generates training data from the input image based on the data area;
a generative network learning unit that trains the generative neural network using the training data;
an interpolated image generation unit that generates the interpolated image to be fitted into the removal area using the generative neural network; and
an image integration unit that fits the interpolated image into the removal area of the input image to obtain the output image.
(3) The image processing apparatus according to (2), in which the generative network learning unit performs further training with the training data, based on the generative neural network trained in advance to generate images of the same type as the subject corresponding to the removal area.
(4) The image processing apparatus according to any one of (1) to (3), further including a detection unit that obtains information of the removal area included in the input image.
(5) The image processing apparatus according to (4), in which the detection unit obtains the information of the removal area based on information from outside.
(6) The image processing apparatus according to any one of (1) to (5), further including a processing update unit that updates a function of the processing unit based on evaluation information of the output image.
(7) The image processing apparatus according to (6), in which the processing update unit further updates the function of the processing unit based on external setting information.
(8) An image processing method including a procedure of generating an interpolated image to be fitted into a removal area of an input image using a generative neural network, and fitting the interpolated image into the removal area of the input image to obtain an output image.
(9) A program that causes a computer to function as processing means for generating an interpolated image to be fitted into a removal area of an input image using a generative neural network, and fitting the interpolated image into the removal area of the input image to obtain an output image.
(10) An image processing apparatus including:
a change area determination unit that refers to input images and determines areas in the image that change and areas that do not;
a chapter information recording unit in which information about chapters of the input images is recorded;
a change information recording unit that records information on the presence or absence of change in each area of the input image;
a removal area position identification unit that identifies the position of an area to be removed from the input image based on the per-area change information recorded in the change information recording unit, and outputs it as removal position information;
a removal information type determination unit that determines, based on the per-area change information recorded in the change information recording unit, what kind of information the area contains, and outputs the result as type information;
a removal information recording unit that integrates and records the removal position information and the type information;
a segmentation processing unit that performs segmentation processing on the input image;
a segmentation information recording unit that records the subject area information obtained by the segmentation processing in association with label information;
a learning data area designation unit that refers to the removal information recorded in the removal information recording unit and designates, on the input image, data areas that can be used as training data, in order to construct the training data needed for training the network that interpolates the removal area;
a learning data generation unit that generates training data by referring to the areas of the input image designated as usable for training data by the learning data area designation unit;
a learning data recording unit that records the training data generated by the learning data generation unit;
a generative network learning unit that reads the training data from the learning data recording unit and trains a generative neural network used for interpolation processing;
a setting parameter recording unit in which parameters relating to the type and specifications of the training data used by the learning data generation unit, and learning-related parameters such as the number of training iterations and the learning rate in the generative network learning unit, are recorded;
a generative model recording unit in which the network models generated by the generative network learning unit are recorded;
an interpolated image generation unit that selects an appropriate network model from the generative model recording unit and generates, as an interpolated image, the texture that best fits the removal area; and
an image integration unit that fits the interpolated image generated by the interpolated image generation unit into the corresponding area of the input image, integrates them into a single image, and outputs it as an output image.
(11) The image processing apparatus according to (10), in which the removal area position identification unit and the removal information type determination unit refer to information from outside.
(12) The image processing apparatus according to (10) or (11), further including an evaluation unit that evaluates the quality of the image output from the image integration unit, and a processing update unit that judges whether or not to update a function based on the evaluation information obtained by the evaluation unit.
(13) The image processing apparatus according to (12), in which not only the evaluation information but also external setting information is used when the processing update unit judges whether or not to update the function.
(14) An image processing method including:
a change area determination step of referring to input images and determining areas in the image that change and areas that do not;
a removal area position identification step of identifying the position of an area to be removed from the input image based on the per-area change information recorded in the change information recording unit, and outputting it as removal position information;
a removal information type determination step of determining, based on the per-area change information recorded in the change information recording unit, what kind of information the area contains, and outputting the result as type information;
a segmentation processing step of performing segmentation processing on the input image;
a learning data area designation step of referring to the removal information recorded in the removal information recording unit and designating, on the input image, data areas that can be used as training data, in order to construct the training data needed for training the network that interpolates the removal area;
a learning data generation step of generating training data by referring to the areas of the input image designated as usable for training data in the learning data area designation step;
a generative network learning step of reading the training data from the learning data recording unit and training a generative neural network used for interpolation processing;
an interpolated image generation step of selecting an appropriate network model from the generative model recording unit and generating, as an interpolated image, the texture that best fits the removal area; and
an image integration step of fitting the interpolated image generated in the interpolated image generation step into the corresponding area of the input image, integrating them into a single image, and outputting it as an output image.
10 ... Television receiving device
101 ... Receiving antenna
102 ... Digital broadcast receiving unit
103 ... Display unit
104 ... Recording/reproducing unit
105 ... Image processing unit
106 ... CPU
107 ... User operation unit
200, 200A ... Detection unit
201 ... Change area determination unit
202 ... Chapter information recording unit
203 ... Change information recording unit
204 ... Removal area position identification unit
205 ... Removal information type determination unit
206 ... Removal information recording unit
300 ... Removal unit
400, 400A, 400B ... Processing unit
401 ... Segmentation processing unit
402 ... Segmentation information recording unit
403 ... Learning data area designation unit
404 ... Learning data generation unit
405 ... Learning data recording unit
406 ... Generative network learning unit
407 ... Setting parameter recording unit
408 ... Generative model recording unit
409 ... Interpolated image generation unit
410 ... Image integration unit
411 ... Evaluation unit
412 ... Evaluation information recording unit
413 ... Processing update unit

Claims (9)

  1.  生成系ニューラルネットワークを用いて入力画像の除去領域に嵌め込む補間画像を生成し、該補間画像を前記入力画像の除去領域に嵌め込んで出力画像を得る処理部を備える
     画像処理装置。
    An image processing apparatus including a processing unit that generates an interpolated image to be fitted into a removal region of an input image using a generation neural network, and fits the interpolated image into the removal region of the input image to obtain an output image.
  2.  前記処理部は、
     前記入力画像に対するセグメンテーション処理を行って画像内に含まれる各被写体の領域情報を得るセグメンテーション処理部と、
     前記除去領域の情報と前記各被写体の領域情報に基づいて、前記入力画像上で、前記生成系ニューラルネットワークが学習するために必要な学習用データに使用できるデータ領域を指定する学習用データ領域指定部と、
     前記データ領域に基づいて、前記入力画像から学習用データを生成する学習用データ生成部と、
     前記学習用データを用いて前記生成系ニューラルネットワークを学習する生成系ネットワーク学習部と、
     前記生成系ニューラルネットワークを用いて前記除去領域に嵌め込む補間画像を生成する補間画像生成部と、
     前記補間画像を前記入力画像の除去領域に嵌め込んで前記出力画像を得る画像統合部を有する
     請求項1に記載の画像処理装置。
    The processing unit
    A segmentation processing unit that performs segmentation processing on the input image to obtain area information of each subject included in the image.
    Designation of a learning data area for designating a data area that can be used for learning data required for learning by the generation neural network on the input image based on the information of the removal area and the area information of each subject. Department and
    A learning data generation unit that generates learning data from the input image based on the data area,
    A generating network learning unit that learns the generating neural network using the learning data,
    An interpolated image generator that generates an interpolated image to be fitted in the removal region using the generating neural network,
    The image processing apparatus according to claim 1, further comprising an image integration unit for fitting the interpolated image into a removal region of the input image to obtain the output image.
  3. The image processing apparatus according to claim 2, wherein the generative network learning unit performs further learning with the learning data, using as a base the generative neural network that has been trained in advance to generate images of the same type as the subject corresponding to the removal region.
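A hedged sketch of the further learning described in claim 3, assuming a PyTorch generator with the two-argument interface of the earlier sketch that is already pre-trained for the subject type of the removal region and is fine-tuned on the per-image learning data; the L1 reconstruction loss and the optimizer are illustrative choices, not taken from the publication.

```python
import torch

def finetune_generator(generator, patches, masks, epochs=5, lr=1e-4):
    """Further train a pre-trained inpainting generator on the learning data.
    patches: (N, 3, h, w) float tensor in [0, 1]; masks: (N, 1, h, w), 1 inside
    synthetic holes cut into the patches for training."""
    optimizer = torch.optim.Adam(generator.parameters(), lr=lr)
    l1 = torch.nn.L1Loss()
    generator.train()
    for _ in range(epochs):
        masked = patches * (1.0 - masks)
        prediction = generator(masked, masks)
        loss = l1(prediction * masks, patches * masks)  # reconstruct only the holes
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return generator
```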
  4. The image processing apparatus according to claim 1, further comprising a detection unit that obtains information on the removal region included in the input image.
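One conceivable way a detection unit could obtain removal-region information is the temporal-variance heuristic sketched below: a superimposed logo stays fixed while the programme content changes, so pixels that barely vary across frames are flagged. This heuristic is an assumption for illustration and is not stated to be the detection method of the publication.

```python
import numpy as np

def detect_static_overlay(frames: np.ndarray, variance_threshold: float = 4.0) -> np.ndarray:
    """frames: (T, H, W, 3) uint8 clip. Returns an (H, W) boolean mask of pixels
    whose value barely changes over time, i.e. a candidate removal region."""
    luma = frames.astype(np.float32).mean(axis=3)   # rough per-frame luminance
    return luma.var(axis=0) < variance_threshold
```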
  5. The image processing apparatus according to claim 4, wherein the detection unit obtains the information on the removal region on the basis of information provided from outside.
  6. The image processing apparatus according to claim 1, further comprising a processing update unit that updates the function of the processing unit on the basis of evaluation information of the output image.
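A minimal sketch of the update logic of claim 6, assuming that each output image receives an evaluation score in [0, 1] and that an update consists of re-running a fine-tuning routine; the threshold policy and the callable interface are assumptions of this example.

```python
# Sketch of evaluation-driven updating of the processing unit.
def update_if_needed(generator, scores, retrain, threshold=0.7):
    """scores: recent per-frame evaluation values in [0, 1] (higher is better).
    retrain: a callable returning an updated generator, e.g. a fine-tuning run."""
    if scores and sum(scores) / len(scores) < threshold:
        return retrain(generator)   # update the model used by the processing unit
    return generator

# Hypothetical usage with the fine-tuning sketch above:
# generator = update_if_needed(generator, recent_scores,
#                              retrain=lambda g: finetune_generator(g, patches, masks))
```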
  7. The image processing apparatus according to claim 6, wherein the processing update unit further updates the function of the processing unit on the basis of external setting information.
  8. An image processing method comprising a procedure of generating, using a generative neural network, an interpolated image to be fitted into a removal region of an input image, and fitting the interpolated image into the removal region of the input image to obtain an output image.
  9. A program for causing a computer to function as processing means that generates, using a generative neural network, an interpolated image to be fitted into a removal region of an input image, and fits the interpolated image into the removal region of the input image to obtain an output image.
PCT/JP2020/027931 2019-08-30 2020-07-17 Image processing apparatus, image processing method, and program WO2021039192A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-157826 2019-08-30
JP2019157826 2019-08-30

Publications (1)

Publication Number Publication Date
WO2021039192A1 true WO2021039192A1 (en) 2021-03-04

Family

ID=74684141

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/027931 WO2021039192A1 (en) 2019-08-30 2020-07-17 Image processing apparatus, image processing method, and program

Country Status (1)

Country Link
WO (1) WO2021039192A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09200684A (en) * 1996-01-12 1997-07-31 Sony Corp Video/audio signal recording device
JP2014212434A (en) * 2013-04-18 2014-11-13 三菱電機株式会社 Device and method for video signal processing, program, and recording medium
JP2019079114A (en) * 2017-10-20 2019-05-23 キヤノン株式会社 Image processing device, image processing method, and program
WO2019159424A1 (en) * 2018-02-16 2019-08-22 新東工業株式会社 Evaluation system, evaluation device, evaluation method, evaluation program, and recording medium

Similar Documents

Publication Publication Date Title
KR101318459B1 (en) Method of viewing audiovisual documents on a receiver, and receiver for viewing such documents
US20100313214A1 (en) Display system, system for measuring display effect, display method, method for measuring display effect, and recording medium
US10957024B2 (en) Real time tone mapping of high dynamic range image data at time of playback on a lower dynamic range display
CN101377917B (en) Display apparatus
US20120229489A1 (en) Pillarboxing Correction
US8446432B2 (en) Context aware user interface system
CN109379628A (en) Method for processing video frequency, device, electronic equipment and computer-readable medium
US9773523B2 (en) Apparatus, method and computer program
US20160381290A1 (en) Apparatus, method and computer program
KR20110074107A (en) Method for detecting object using camera
JP5116513B2 (en) Image display apparatus and control method thereof
CN101611629A (en) Image processing equipment, moving-image reproducing apparatus and processing method thereof and program
JP2008244811A (en) Frame rate converter and video picture display unit
WO2021039192A1 (en) Image processing apparatus, image processing method, and program
KR20050026965A (en) Method of and system for controlling the operation of a video system
JP2013179563A (en) Video processing device, video display device, video recording device, video processing method, and video processing program
TW201802664A (en) Image output device, image output method, and program
WO2020234939A1 (en) Information processing device, information processing method, and program
CN107995538B (en) Video annotation method and system
JP6080667B2 (en) Video signal processing apparatus and method, program, and recording medium
US20110235997A1 (en) Method and device for creating a modified video from an input video
KR20120098622A (en) Method for adding voice content to video content and device for implementing said method
CN107743710A (en) Display device and its control method
JP6930880B2 (en) Video display device and video production support method
CN102111630B (en) Image processing device and image processing method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20858523; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20858523; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)