WO2019082268A1 - Image processing device, and image processing method - Google Patents

Image processing device, and image processing method

Info

Publication number
WO2019082268A1
WO2019082268A1 (PCT/JP2017/038339, JP2017038339W)
Authority
WO
WIPO (PCT)
Prior art keywords
picture
pictures
block
evaluation value
calculation unit
Prior art date
Application number
PCT/JP2017/038339
Other languages
French (fr)
Japanese (ja)
Inventor
彰 峯澤
守屋 芳美
杉本 和夫
Original Assignee
Mitsubishi Electric Corporation (三菱電機株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mitsubishi Electric Corporation (三菱電機株式会社)
Priority to PCT/JP2017/038339 priority Critical patent/WO2019082268A1/en
Priority to JP2018550470A priority patent/JP6782789B2/en
Priority to PCT/JP2018/013789 priority patent/WO2019082413A1/en
Priority to TW107127425A priority patent/TWI673652B/en
Publication of WO2019082268A1 publication Critical patent/WO2019082268A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142 Detection of scene cut or scene change
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/87 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression

Definitions

  • the present invention relates to an image processing apparatus and an image processing method for determining the presence or absence of a scene change which is a change point of a scene.
  • Patent Document 1 discloses an image processing apparatus that determines the presence or absence of a scene change.
  • This image processing apparatus acquires motion vectors, which are one of the coding parameters of a plurality of pictures, from a coded video stream, and determines the presence or absence of a scene change from the differences of the motion vectors among the plurality of pictures.
  • Since the conventional image processing apparatus is configured as described above, the presence or absence of a scene change can be determined if a motion vector is included as a coding parameter of a picture in the coded video stream. However, when the coding mode of a picture is, for example, an intra-frame coding mode, a motion vector may not be included as a coding parameter of that picture. There has thus been a problem that the presence or absence of a scene change cannot be determined for a picture that does not include a motion vector as a coding parameter.
  • The present invention has been made to solve the above problems, and an object of the present invention is to provide an image processing apparatus and an image processing method capable of determining the presence or absence of a scene change even for a picture that does not include a motion vector.
  • An image processing apparatus according to the present invention includes: a picture selection unit that selects, from among a plurality of pictures, two pictures as pictures to be evaluated for the presence or absence of a scene change, which is a scene change point; a feature amount calculation unit that calculates, for each of the two pictures selected by the picture selection unit, the feature amounts in block units included in the picture; and an evaluation value calculation unit that calculates, from the block-unit feature amounts of the two pictures calculated by the feature amount calculation unit, an evaluation value used to determine the presence or absence of a scene change. A presence/absence determination unit determines the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit with a threshold.
  • According to the present invention, the evaluation value calculation unit that calculates the evaluation value used to determine the presence or absence of a scene change from the block-unit feature amounts of the two pictures calculated by the feature amount calculation unit is provided, and the presence/absence determination unit determines the presence or absence of a scene change between the two pictures by comparing the evaluation value with a threshold; therefore, there is an effect that the presence or absence of a scene change can be determined even for a picture that does not include a motion vector.
  • FIG. 1 is a block diagram showing an image processing apparatus according to Embodiment 1 of the present invention.
  • FIG. 2 is a hardware configuration diagram showing the image processing apparatus according to Embodiment 1 of the present invention.
  • FIG. 3 is a hardware configuration diagram of a computer in the case where the image processing apparatus is realized by software or firmware.
  • FIG. 4 is a flowchart showing an image processing method, which is the processing procedure when the image processing apparatus is realized by software or firmware.
  • FIG. 5 is an explanatory drawing showing an example of selecting an evaluation target picture from among a plurality of pictures included in a coded video stream.
  • FIG. 6 is an explanatory drawing showing an example of selecting an evaluation target picture from among a plurality of pictures included in a coded video stream.
  • FIG. 7 is an explanatory drawing showing a configuration example of the coded video stream in the case where the coding method is AVC/H.264 or HEVC/H.265.
  • FIG. 1 is a block diagram showing an image processing apparatus according to Embodiment 1 of the present invention.
  • FIG. 2 is a hardware configuration diagram showing an image processing apparatus according to Embodiment 1 of the present invention.
  • the picture selection unit 1 is realized by, for example, the picture selection circuit 11 shown in FIG.
  • the picture selection unit 1 executes a process of acquiring a coded video stream including coded data of a plurality of pictures.
  • The picture selection unit 1 determines candidate pictures, for which the presence or absence of a scene change, which is a scene change point, is to be evaluated, from among a plurality of pictures having coded data encoded in coding block units (block units), and executes a process of selecting, from among the candidate pictures, two pictures as the target pictures for which the presence or absence of a scene change is evaluated.
  • the method of determining candidate pictures is assumed to be previously defined.
  • the coded data of a picture is, for example, data coded in coding block units called macroblocks, or in coding block units called CTU (Coding Tree Unit).
  • As a standard of a coding scheme that codes in macroblock units, for example, the AVC/H.264 standard is disclosed in Non-Patent Document 1 below.
  • As a standard of a coding scheme that codes in CTU units, for example, the HEVC/H.265 standard is disclosed in Non-Patent Document 2 below.
  • [Non-Patent Document 1] ISO/IEC 14496-10 / ITU-T H.264 standard
  • [Non-Patent Document 2] ISO/IEC 23008-2 / ITU-T H.265 standard
  • the feature quantity calculation unit 2 is realized by, for example, the feature quantity calculation circuit 12 shown in FIG.
  • the feature amount calculation unit 2 performs, for each of the pictures selected by the picture selection unit 1, a process of calculating a feature amount in units of processing blocks from coded data in units of coding blocks in the picture.
  • A processing block is composed of one or more coding blocks, and its configuration is defined in advance; since a processing block is composed of one or more coding blocks, the smallest definable unit of a processing block is a coding block (processing block = coding block). FIG. 14 shows an example in which a processing block is composed of four coding blocks. As in the example shown in FIG. 15, the numbers of coding blocks in the vertical and horizontal directions constituting a processing block may also differ.
  • the feature amount in processing block units is calculated from the coded data of each coding block in the processing block.
  • In the first embodiment, the coded data of each coding block is decoded, the code amount of each coding block is specified from the decoding result of the coded data, and the sum of the code amounts of the coding blocks belonging to each processing block is taken as the feature amount of that processing block. Note that the processing block sizes of the two pictures selected by the picture selection unit 1 (the configurations of coding blocks in the predefined processing blocks) are the same.
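  • For illustration only, the following is a minimal Python sketch of this aggregation step, not taken from the patent; the function name, the raster-scan input format, and the 2x2 layout default are assumptions chosen to match the example of FIG. 14.

```python
def processing_block_code_amounts(block_code_amounts, blocks_per_row,
                                  pb_cols=2, pb_rows=2):
    """Sum per-coding-block code amounts (macroblocks or CTUs, in bytes
    or bits) into per-processing-block code amounts S_n,t.

    block_code_amounts: code amounts of the coding blocks of one picture,
        in raster scan order.
    blocks_per_row: number of coding blocks per picture row.
    pb_cols, pb_rows: coding blocks per processing block horizontally and
        vertically (2x2 corresponds to the example of FIG. 14).
    """
    sums = {}
    for i, amount in enumerate(block_code_amounts):
        r, c = divmod(i, blocks_per_row)
        key = (r // pb_rows, c // pb_cols)      # processing-block index
        sums[key] = sums.get(key, 0) + amount
    # return S_n,t in raster scan order of the processing blocks
    return [sums[k] for k in sorted(sums)]
```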
  • the evaluation value calculation unit 3 is realized by, for example, the evaluation value calculation circuit 13 shown in FIG.
  • the evaluation value calculation unit 3 executes a process of calculating an evaluation value to be used to determine the presence or absence of a scene change from the feature amounts of the processing block unit included in the two pictures calculated by the feature amount calculation unit 2.
  • the presence / absence determination unit 4 is realized by, for example, the presence / absence determination circuit 14 illustrated in FIG.
  • the presence / absence determination unit 4 executes a process of determining the presence / absence of a scene change between two pictures by comparing the evaluation value calculated by the evaluation value calculation unit 3 with a threshold.
  • In FIG. 1, each of the picture selection unit 1, the feature amount calculation unit 2, the evaluation value calculation unit 3, and the presence/absence determination unit 4, which are components of the image processing apparatus, is assumed to be realized by dedicated hardware as shown in FIG. 2, that is, by the picture selection circuit 11, the feature amount calculation circuit 12, the evaluation value calculation circuit 13, and the presence/absence determination circuit 14.
  • Here, the picture selection circuit 11, the feature amount calculation circuit 12, the evaluation value calculation circuit 13, and the presence/absence determination circuit 14 correspond to, for example, a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a combination thereof.
  • the components of the image processing apparatus are not limited to those realized by dedicated hardware, and the image processing apparatus may be realized by software, firmware, or a combination of software and firmware.
  • the software or firmware is stored as a program in the memory of the computer.
  • A computer here means hardware that executes a program, and corresponds to, for example, a CPU (Central Processing Unit), a central processing device, a processing device, an arithmetic device, a microprocessor, a microcomputer, a processor, a DSP (Digital Signal Processor), and the like.
  • The memory of the computer corresponds to, for example, a non-volatile or volatile semiconductor memory such as a RAM (Random Access Memory), a ROM (Read Only Memory), a flash memory, an EPROM (Erasable Programmable Read Only Memory), or an EEPROM (Electrically Erasable Programmable Read Only Memory), or to a magnetic disk, a flexible disk, an optical disk, a compact disk, a mini disk, a DVD (Digital Versatile Disc), or the like.
  • FIG. 3 is a hardware configuration diagram of a computer when the image processing apparatus is realized by software or firmware.
  • When the image processing apparatus is realized by software or firmware, a program for causing the computer to execute the processing procedures of the picture selection unit 1, the feature amount calculation unit 2, the evaluation value calculation unit 3, and the presence/absence determination unit 4 may be stored in the memory 21, and the processor 22 of the computer may execute the program stored in the memory 21.
  • FIG. 4 is a flowchart showing an image processing method which is a processing procedure when the image processing apparatus is realized by software or firmware.
  • FIG. 2 shows an example in which each component of the image processing apparatus is realized by dedicated hardware, and FIG. 3 shows an example in which the image processing apparatus is realized by software or firmware; however, some components of the image processing apparatus may be realized by dedicated hardware and the remaining components by software, firmware, or the like.
  • the picture selection unit 1 obtains a coded video stream including coded data of a plurality of pictures.
  • The picture selection unit 1 determines candidate pictures from the acquired coded video stream according to the predefined method for determining candidate pictures, and selects, from among the candidate pictures, two pictures as the pictures to be evaluated for the presence or absence of a scene change, which is a scene change point (step ST1 in FIG. 4).
  • Of the two selected pictures, the later picture in reproduction order is referred to as the “evaluation target picture”, and the earlier picture in reproduction order is referred to as the “reference picture”.
  • “Evaluating the presence or absence of a scene change between two pictures” means evaluating whether a scene change occurs anywhere from the picture immediately after the reference picture in reproduction order up to and including the evaluation target picture.
  • FIGS. 5 and 6 are explanatory drawings showing examples of determining candidate pictures from among the pictures whose coded data is included in the coded video stream and of selecting the evaluation target picture and the reference picture from among the candidate pictures. In FIG. 5, all of the pictures whose coded data is included in the coded video stream are set as candidate pictures, and the evaluation target picture is selected from the candidate pictures sequentially in reproduction order, so that eventually all pictures included in the coded video stream are selected as evaluation target pictures.
  • In FIG. 6, the plurality of pictures whose coded data is included in the coded video stream have a GOP (Group Of Picture) structure, and the intra picture at the random access point in each GOP is taken as a candidate picture.
  • In both FIG. 5 and FIG. 6, the reference picture is the candidate picture immediately before the evaluation target picture in reproduction order. Therefore, in the example of FIG. 5, it is evaluated whether there is a scene change between two adjacent pictures, that is, whether the evaluation target picture itself is a scene change point. In the example of FIG. 6, the presence or absence of a scene change between the intra pictures at the random access points of adjacent GOPs is evaluated.
  • FIG. 7 is an explanatory drawing showing a configuration example of the coded video stream in the case where the coding method is AVC/H.264 or HEVC/H.265.
  • a plurality of access units are arranged in coding order, and one access unit indicates data of one picture.
  • Each access unit is composed of a plurality of NAL (Network Abstraction Layer) units.
  • Each NAL unit is classified into header information (non-VCL) and picture data (VCL) which is coded data of a picture.
  • the header information (non-VCL) includes an access unit delimiter, a sequence level header and a picture level header.
  • one or more NAL units constituting one access unit are continuous, and a method of identifying the head position of the access unit is defined so that the division position between adjacent access units on the stream can be known.
  • an access unit delimiter which is an NAL unit indicating the beginning of an access unit is defined.
  • the NAL unit illustrated in FIG. 7 is an example, and the configuration pattern of the NAL unit conforms to the specifications of the respective standards.
  • The head position of each access unit can be identified according to the method of identifying the head position of an access unit defined in the standard.
  • Since the picture selection unit 1 can identify the head position of each access unit according to the identification method defined in the standard, it can identify the division positions between access units, i.e. between pictures, select two pictures from among the plurality of access units, and output the coded data of the selected two pictures to the feature amount calculation unit 2.
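  • As an illustration of how such boundaries can be located, the sketch below scans an Annex B byte stream for start codes and reports NAL units whose type is the access unit delimiter (type 9 in AVC/H.264, type 35 in HEVC/H.265). This is a simplified, assumption-laden example for a byte-stream input, not the identification procedure defined in the standards or prescribed by the patent.

```python
def find_access_unit_delimiters(data: bytes, hevc: bool = False):
    """Return byte offsets of access unit delimiter NAL units in an
    Annex B byte stream (start code prefix 0x000001 / 0x00000001)."""
    aud_type = 35 if hevc else 9
    offsets = []
    i = 0
    while i + 3 < len(data):
        if data[i:i + 3] == b"\x00\x00\x01":        # start code prefix
            nal_header = data[i + 3]
            # H.264: type = low 5 bits; HEVC: type = bits 1..6 of byte 0
            nal_type = ((nal_header >> 1) & 0x3F) if hevc else (nal_header & 0x1F)
            if nal_type == aud_type:
                offsets.append(i)
            i += 3
        else:
            i += 1
    return offsets
```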
  • the feature quantity calculation unit 2 calculates feature quantities in processing block units from the encoded data in coding block units for each of the pictures selected by the picture selection unit 1 (step ST2 in FIG. 4). The process of calculating the feature amount in units of processing blocks by the feature amount calculation unit 2 will be specifically described below.
  • Here, n is a block number identifying a processing block subject to feature amount calculation in the feature amount calculation target picture P_t, numbered 1, 2, ..., NUM_BL_t in raster scan order from the upper-left processing block.
  • The feature amount calculation unit 2 decodes all of the coding parameters included in the coded data of each coding block belonging to the processing block for which the feature amount is to be calculated, and specifies the sum of their code amounts as the code amount S_n,t of that processing block.
  • the coding parameter is a parameter necessary for generating a decoded image of a coding block, and corresponds to, for example, a parameter indicating a coding mode, an intra prediction parameter, an inter prediction parameter, a prediction difference coding parameter, a motion vector, and the like.
  • The feature amount calculation unit 2 identifies the code amount S_ALL,t of the feature amount calculation target picture P_t and, as shown in equation (1), normalizes the code amount S_n,t of each processing block by dividing it by S_ALL,t and multiplies the normalized code amount by M to obtain the feature amount C_n,t of the processing block, i.e. C_n,t = M × S_n,t / S_ALL,t.
  • Here, M is a preset constant.
  • As a way for the feature amount calculation unit 2 to identify the code amount S_ALL,t of the feature amount calculation target picture P_t, for example, calculating the sum of the code amounts S_n,t of all the processing blocks included in the feature amount calculation target picture P_t can be considered.
  • Alternatively, the size of the picture data (VCL) of the feature amount calculation target picture P_t may be identified as the code amount S_ALL,t.
  • The size of the picture data (VCL) can be identified without performing syntax decoding of the feature amount calculation target picture P_t.
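  • A minimal sketch of this normalization (equation (1)), assuming the per-processing-block code amounts S_n,t have already been obtained (for example by the sketch above); the default value of M is hypothetical, since the text only states that M is a preset constant.

```python
def normalized_features(pb_code_amounts, M=1000.0):
    """Compute C_n,t = M * S_n,t / S_ALL,t for one picture.

    pb_code_amounts: per-processing-block code amounts S_n,t.
    S_ALL,t is taken here as the sum of all S_n,t; the text notes that
    the size of the picture data (VCL) may be used instead.
    """
    s_all = sum(pb_code_amounts)
    if s_all == 0:
        return [0.0] * len(pb_code_amounts)
    return [M * s / s_all for s in pb_code_amounts]
```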
  • FIG. 8 is an explanatory diagram of an example of the code amount in processing block units.
  • FIG. 8 shows an example in which the processing block is composed of 2 ⁇ 2 coded blocks and the coded block is a macroblock or CTU, and the code amount of each coded block is indicated by a numeral. Furthermore, at the center of each processing block, the sum of the code amounts of the coding blocks belonging to the processing block is represented as the code amount of the processing block.
  • Although the unit of the code amount of a coding block is expressed in bytes in the example of FIG. 8, this is merely an example, and it may be expressed in bits.
  • The evaluation value calculation unit 3 calculates the evaluation value J used to determine the presence or absence of a scene change from the feature amounts C_n,tp of the processing blocks in the reference picture P_tp and the feature amounts C_n,tc of the processing blocks in the evaluation target picture P_tc calculated by the feature amount calculation unit 2 (step ST3 in FIG. 4). For example, as shown in equation (2), the evaluation value calculation unit 3 calculates, as the evaluation value J, the sum of absolute differences of the processing-block-unit feature amounts between the reference picture P_tp and the evaluation target picture P_tc, i.e. J = Σ_{n=1}^{MIN_NUM_BL} |C_n,tc − C_n,tp|.
  • Here, MIN_NUM_BL is the minimum of NUM_BL_tp and NUM_BL_tc. That is, when the numbers of processing blocks of the reference picture P_tp and the evaluation target picture P_tc differ, only the processing blocks whose feature amounts have been calculated for both pictures are targets of the calculation of the evaluation value J.
  • The decoding processing time also tends to differ greatly from picture to picture; therefore, if the time allocated to decoding the coded data of each picture and to calculating the feature amounts in processing block units is uniformly limited, the numbers of processing blocks that can be processed may differ between the two pictures. Even in such a case, the evaluation value J can be obtained using only the processing blocks whose feature amounts have been calculated for both pictures.
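  • The evaluation value of equation (2) then reduces to a sum of absolute differences over the processing blocks common to both pictures; a sketch, reusing the hypothetical normalized_features output above:

```python
def evaluation_value(features_ref, features_eval):
    """Evaluation value J of equation (2): sum of absolute differences of
    the processing-block feature amounts of the reference picture (C_n,tp)
    and the evaluation target picture (C_n,tc), limited to MIN_NUM_BL
    blocks, i.e. the blocks for which both pictures have a feature amount."""
    min_num_bl = min(len(features_ref), len(features_eval))
    return sum(abs(c_tc - c_tp)
               for c_tp, c_tc in zip(features_ref[:min_num_bl],
                                     features_eval[:min_num_bl]))
```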
  • The presence/absence determination unit 4 compares the evaluation value J calculated by the evaluation value calculation unit 3 with the preset threshold Th (step ST4 in FIG. 4). If the evaluation value J calculated by the evaluation value calculation unit 3 is equal to or greater than the threshold Th (YES at step ST4 in FIG. 4), the presence/absence determination unit 4 determines that there is a scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST5 in FIG. 4). If the evaluation value J calculated by the evaluation value calculation unit 3 is less than the threshold Th (NO at step ST4 in FIG. 4), the presence/absence determination unit 4 determines that there is no scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST6 in FIG. 4).
  • the presence determination unit 4 outputs the determination result of the presence or absence of a scene change.
  • the threshold Th may be a fixed value set in advance, or may be switched or changed according to a specific condition.
  • For example, the following methods are conceivable: a method of preparing and switching the threshold Th for each type of content (drama, news, sports, etc.); a method of preparing and switching the threshold Th for each broadcasting station; a method of preparing and switching the threshold Th for each encoder that generates a stream; a method of adaptively calculating the threshold Th according to the settings of the encoder that generates the stream (a method of calculating the threshold Th with the setting values of the encoder as variables); and a method of adaptively calculating the threshold Th according to the transition of the code amount (a method of preparing a formula that calculates the threshold Th with the temporal change of the code amount, i.e. the 1st to n-th order differences in the time direction, where n is an integer of 1 or more, as a variable).
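  • By way of illustration only, the last of these options (a threshold computed from the temporal change of the picture code amount) could look like the following sketch; the base value and the weighting factor are hypothetical and are not values given in the text.

```python
def adaptive_threshold(picture_code_amounts, base_th=50.0, weight=0.1):
    """Hypothetical threshold Th computed from the first-order temporal
    differences of recent picture code amounts: the larger the typical
    picture-to-picture fluctuation, the larger Th becomes."""
    diffs = [abs(b - a) for a, b in zip(picture_code_amounts,
                                        picture_code_amounts[1:])]
    mean_diff = sum(diffs) / len(diffs) if diffs else 0.0
    return base_th + weight * mean_diff
```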
  • If there is still a candidate picture not yet selected as an evaluation target picture among the candidate pictures determined by the picture selection unit 1 (YES at step ST7 in FIG. 4), the process returns to step ST1, and the processing of steps ST1 to ST7 is repeated. In the repeated processing of step ST1, for the picture whose feature amount has already been calculated among the two pictures selected by the picture selection unit 1, the feature amount calculation process by the feature amount calculation unit 2 may be omitted. In this way, feature amounts need not be recalculated, and the processing time can be shortened.
  • For example, as in the examples of FIG. 5 and FIG. 6, when the evaluation target picture is selected from the candidate pictures in reproduction order and the reference picture is the candidate picture one before the evaluation target picture in reproduction order, the feature amount C_n,tc of the current evaluation target picture becomes the feature amount C_n,tp of the reference picture when the candidate picture one ahead in reproduction order is selected as the next evaluation target picture. Therefore, by sequentially storing the calculated feature amounts C_n,tc and reading them out as the feature amounts C_n,tp, the feature amount calculation process can be omitted.
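  • A sketch of this reuse, assuming evaluation proceeds in reproduction order and reusing the evaluation_value sketch above; the function and parameter names are illustrative only.

```python
def scan_for_scene_changes(candidate_pictures, feature_fn, threshold):
    """Walk candidate pictures in reproduction order, reusing the feature
    amounts of the previous evaluation target picture as the reference
    picture's feature amounts in the next iteration."""
    results = []
    prev_features = None          # C_n,tp of the current reference picture
    for picture in candidate_pictures:
        features = feature_fn(picture)        # C_n,tc of this picture
        if prev_features is not None:
            j = evaluation_value(prev_features, features)
            results.append(j >= threshold)    # True = scene change detected
        prev_features = features              # becomes C_n,tp next time
        # (no recomputation of the reference picture's feature amounts)
    return results
```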
  • Alternatively, the evaluation target picture may be selected in decoding order, and the candidate picture closest to and preceding the evaluation target picture in decoding order may be used as the reference picture when evaluating the presence or absence of a scene change. Also in this case, for a picture whose feature amount has already been calculated, the process may be omitted by reusing the calculation result of the feature amount. If no candidate picture not yet selected as an evaluation target picture remains among the candidate pictures determined by the picture selection unit 1 (NO at step ST7 in FIG. 4), the series of processes ends.
  • As is apparent from the above, according to the first embodiment, the evaluation value calculation unit 3 that calculates the evaluation value used to determine the presence or absence of a scene change from the block-unit feature amounts of the two pictures calculated by the feature amount calculation unit 2 is provided, and the presence/absence determination unit 4 determines the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit 3 with a threshold. Therefore, the presence or absence of a scene change can be determined even for a picture that does not include a motion vector.
  • FIG. 9 is a block diagram showing another image processing apparatus according to Embodiment 1 of the present invention.
  • This image processing apparatus includes a demultiplexer 5, and its input is assumed to be a media transport stream in which the coded video stream is packetized in a media transmission format such as MPEG-2 TS (Transport Stream of ISO/IEC 13818-1 / ITU-T H.222.0) or MMT (ISO/IEC 23008-1).
  • The demultiplexer 5 can know, from the header information of the given media transport stream, the divisions between access units and the types and sizes of the NAL units. If the demultiplexer 5 provides the picture selection unit 1 with information indicating the divisions between access units, the picture selection unit 1 can select a desired picture from the coded video stream based on the information provided from the demultiplexer 5. In addition, if the demultiplexer 5 provides the feature amount calculation unit 2 with information indicating the types and sizes of the NAL units, the feature amount calculation unit 2 can specify the code amount of the evaluation target picture without performing syntax decoding on the evaluation target picture.
  • In the above description, the feature amount calculation unit 2 targets the processing blocks of the entire picture in the process of calculating the code amounts S_n,t of the processing blocks; however, only some of the processing blocks constituting the picture may be targeted, as in the examples of FIGS. 10 to 12.
  • FIG. 10 is an explanatory drawing showing the processing blocks for which the code amount is specified among the processing blocks constituting a picture. In FIG. 10, a processing block marked “1” is a processing block for which the code amount is specified, and a processing block marked “0” is a processing block for which the code amount is not specified.
  • FIG. 11 is an explanatory drawing showing the processing blocks for which the code amount is specified among the processing blocks constituting a picture. In FIG. 11, a processing block marked “1” is a processing block for which the code amount is specified, and a processing block marked “0” is a processing block for which the code amount is not specified.
  • FIG. 12 is an explanatory drawing showing the processing blocks for which the code amount is specified among the processing blocks constituting a picture. In FIG. 12, a processing block marked “1” is a processing block for which the code amount is specified, and a processing block marked “0” is a processing block for which the code amount is not specified.
  • The processing blocks marked “1” and the processing blocks marked “0” may also be reversed.
  • In the above description, an example has been shown in which the feature amount calculation unit 2 decodes the coded data of the coding blocks, specifies the code amount of each processing block from the decoding result of the coded data, and calculates the feature amount of the processing block from that code amount. Alternatively, the feature amount calculation unit 2 may use, as the feature amount C_n,t of a processing block, the number of divided blocks included in the processing block.
  • the number of divided blocks included in the processing block is calculated by adding the number of divided blocks included in each coding block belonging to the processing block.
  • When the coding method is AVC/H.264, the number of divided blocks included in a coding block corresponds to the number of blocks in a macroblock determined by the macroblock type (mb_type).
  • The number of blocks in a macroblock is obtained by performing syntax decoding processing on the evaluation target picture.
  • When the coding method is HEVC/H.265, it corresponds to the number of CUs (Coding Units) in the CTU, which is the coding block.
  • The number of CUs in a CTU can be obtained by performing syntax decoding processing on the evaluation target picture.
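  • As an illustration for the HEVC/H.265 case, the number of CUs in a CTU can be counted from a parsed split structure; the CTUNode representation below is hypothetical and merely stands in for whatever the syntax decoding produces.

```python
class CTUNode:
    """Hypothetical parsed CU-tree node of a CTU: either split into four
    children or a leaf CU."""
    def __init__(self, children=None):
        self.children = children or []   # four children if split, else []


def count_cus(node: CTUNode) -> int:
    """Number of CUs (leaf nodes) in one coding block (CTU)."""
    if not node.children:
        return 1
    return sum(count_cus(child) for child in node.children)
```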
  • When the number of divided blocks included in a processing block is specified as the feature amount of the processing block, the feature amount calculation unit 2 does not need to perform the processing for normalizing the code amount when calculating the feature amount of the processing block; as a result, the processing load of the feature amount calculation unit 2 is reduced.
  • In the above description, an example has been shown in which the feature amount calculation unit 2 decodes the coded data of the coding blocks, specifies the code amount of each processing block from the decoding result of the coded data, and calculates the feature amount of the processing block from that code amount. Alternatively, the feature amount calculation unit 2 may use, as the feature amount C_n,t of a processing block, the deepest hierarchy number of the divided blocks in the processing block.
  • the deepest hierarchy number of divided blocks in the processing block is calculated by obtaining the maximum value of the deepest hierarchy number of divided blocks in each coding block belonging to the processing block.
  • When the coding method is HEVC/H.265, the deepest hierarchy number in a coding block corresponds, as shown in FIG. 13, to the deepest division hierarchy among the plurality of CUs included in the CTU, which is the coding block.
  • FIG. 13 is an explanatory drawing showing the hierarchy numbers of the divided blocks in a coding block. In the example of FIG. 13, since the maximum CU depth is 3, the deepest hierarchy number of the divided blocks in the coding block is 3.
  • the deepest hierarchy number of the divided blocks can be obtained by performing syntax decoding processing on the evaluation target picture.
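  • Reusing the hypothetical CTUNode above, the deepest hierarchy number of FIG. 13 corresponds to the maximum leaf depth of the CU tree, and the per-processing-block feature amount is the maximum over the CTUs belonging to the processing block; again a sketch, not the patent's implementation.

```python
def max_cu_depth(node: CTUNode, depth: int = 0) -> int:
    """Deepest division hierarchy of the CUs in one CTU (3 in the example
    of FIG. 13, where the maximum CU depth is 3)."""
    if not node.children:
        return depth
    return max(max_cu_depth(child, depth + 1) for child in node.children)


def processing_block_max_depth(ctu_trees) -> int:
    """Feature amount C_n,t taken as the deepest hierarchy number over all
    coding blocks (CTUs) belonging to one processing block."""
    return max(max_cu_depth(tree) for tree in ctu_trees)
```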
  • In this case as well, the feature amount calculation unit 2 does not need to perform the processing for normalizing the code amount when calculating the feature amount of the processing block; as a result, the processing load of the feature amount calculation unit 2 is reduced.
  • In the above description, the evaluation value calculation unit 3 calculates the evaluation value from the processing-block-unit feature amounts C_n,tp in the reference picture P_tp and the processing-block-unit feature amounts C_n,tc in the evaluation target picture P_tc calculated by the feature amount calculation unit 2. In the example described below, a secondary reference picture P_tp′ is used in addition; processing is performed according to the flowchart of FIG. 4 as in the first embodiment, but the processing content is changed to the processing described below.
  • the picture selection unit 1 obtains a coded video stream including coded data of a plurality of pictures.
  • The picture selection unit 1 determines candidate pictures from among the plurality of pictures whose coded data is included in the coded video stream, according to the predefined method for determining candidate pictures, and, as in the first embodiment, selects the reference picture P_tp and the evaluation target picture P_tc from among the candidate pictures. The picture selection unit 1 also selects a secondary reference picture P_tp′ from among the candidate pictures. That is, in step ST1 of FIG. 4, the secondary reference picture is selected in addition to the evaluation target picture and the reference picture.
  • The feature amount calculation unit 2 calculates, for each of the pictures selected by the picture selection unit 1, the feature amounts in processing block units from the coded data in coding block units of the picture (step ST2 in FIG. 4). That is, the feature amount calculation unit 2 calculates the feature amounts C_n,tp of the processing blocks included in the reference picture P_tp and the feature amounts C_n,tc of the processing blocks included in the evaluation target picture P_tc. The feature amount calculation unit 2 also calculates the feature amounts C_n,tp′ of the processing blocks included in the secondary reference picture P_tp′.
  • The evaluation value calculation unit 3 calculates the evaluation value J from the processing-block-unit feature amounts C_n,tp of the reference picture P_tp, the processing-block-unit feature amounts C_n,tc of the evaluation target picture P_tc, and the processing-block-unit feature amounts C_n,tp′ of the secondary reference picture P_tp′ calculated by the feature amount calculation unit 2 (step ST3 in FIG. 4).
  • Specifically, the evaluation value calculation unit 3 calculates the sum of absolute differences of the processing-block-unit feature amounts between the reference picture P_tp and the evaluation target picture P_tc, and also calculates the sum of absolute differences of the processing-block-unit feature amounts between the reference picture P_tp and the secondary reference picture P_tp′. The evaluation value calculation unit 3 then calculates, as the evaluation value J, the difference between the two calculated sums of absolute differences, as shown in equation (3).
  • Here, MIN_NUM_BL′ is the minimum of the three values NUM_BL_tp′, NUM_BL_tp, and NUM_BL_tc.
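  • The text describes equation (3) as the difference between two sums of absolute differences (evaluation target vs. reference, and reference vs. secondary reference), restricted to MIN_NUM_BL′ blocks. Assuming that reading, and assuming the outer absolute value (the exact form of equation (3) is not reproduced in the text), a sketch:

```python
def evaluation_value_with_secondary(features_ref, features_eval,
                                    features_secondary):
    """Evaluation value J built from C_n,tp (reference picture),
    C_n,tc (evaluation target picture) and C_n,tp' (secondary reference
    picture): the difference of the two sums of absolute differences,
    taken here as an absolute value (an assumption)."""
    m = min(len(features_ref), len(features_eval), len(features_secondary))
    sad_eval = sum(abs(tc - tp)
                   for tp, tc in zip(features_ref[:m], features_eval[:m]))
    sad_secondary = sum(abs(tp - tps)
                        for tp, tps in zip(features_ref[:m],
                                           features_secondary[:m]))
    return abs(sad_eval - sad_secondary)
```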
  • In the first embodiment, the magnitude of the change in the picture pattern between the reference picture P_tp and the evaluation target picture P_tc is evaluated.
  • Here, by contrast, the magnitude of the difference between the degree of change of the pattern up to the reference picture P_tp and the degree of change of the pattern up to the evaluation target picture P_tc is evaluated. Therefore, a gradual change of the pattern between pictures that is not a scene change is less likely to be erroneously detected as a scene change in which the pattern of the whole screen changes greatly, and the scene change detection accuracy can be improved compared with the first to third embodiments.
  • The method of defining the processing blocks to be targeted for calculation of the feature amounts in processing block units can be defined arbitrarily, as in the first embodiment (all processing blocks in a picture, or some of the processing blocks as shown in FIGS. 10 to 12).
  • The presence/absence determination unit 4 compares the evaluation value J calculated by the evaluation value calculation unit 3 with the preset threshold Th (step ST4 in FIG. 4). As in the first embodiment, if the evaluation value J calculated by the evaluation value calculation unit 3 is equal to or greater than the threshold Th, the presence/absence determination unit 4 determines that there is a scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST5 in FIG. 4). As in the first embodiment, if the evaluation value J calculated by the evaluation value calculation unit 3 is less than the threshold Th, the presence/absence determination unit 4 determines that there is no scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST6 in FIG. 4). The presence/absence determination unit 4 outputs the determination result of the presence or absence of a scene change. The method of setting the threshold Th is also the same as in the first embodiment.
  • If there is still a candidate picture not yet selected as an evaluation target picture among the candidate pictures determined by the picture selection unit 1 (YES at step ST7 in FIG. 4), the process returns to step ST1, and the processing of steps ST1 to ST7 is repeated. In the repeated processing of step ST1, among the three pictures selected by the picture selection unit 1, for the picture whose feature amount has already been calculated, the feature amount calculation process by the feature amount calculation unit 2 may be omitted. In this way, feature amounts need not be recalculated, and the processing time can be shortened. If no candidate picture not yet selected as an evaluation target picture remains among the candidate pictures determined by the picture selection unit 1 (NO at step ST7 in FIG. 4), the series of processes ends.
  • It should be noted that the present invention allows free combination of the embodiments, modification of any component of each embodiment, or omission of any component in each embodiment.
  • the present invention is suitable for an image processing apparatus and an image processing method for determining the presence or absence of a scene change which is a change point of a scene.
  • Reference Signs List: 1 picture selection unit, 2 feature amount calculation unit, 3 evaluation value calculation unit, 4 presence/absence determination unit, 5 demultiplexer, 11 picture selection circuit, 12 feature amount calculation circuit, 13 evaluation value calculation circuit, 14 presence/absence determination circuit, 21 memory, 22 processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

There is provided an evaluation value calculation unit (3) that calculates an evaluation value used to determine the presence or absence of a scene change from the block-unit feature amounts of two pictures calculated by a feature amount calculation unit (2), and a presence/absence determination unit (4) determines the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit (3) with a threshold. This makes it possible to determine the presence or absence of a scene change even for a picture in which no motion vector is included.

Description

Image processing apparatus and image processing method
The present invention relates to an image processing apparatus and an image processing method for determining the presence or absence of a scene change, which is a change point of a scene.
Patent Document 1 below discloses an image processing apparatus that determines the presence or absence of a scene change.
This image processing apparatus acquires motion vectors, which are one of the coding parameters of a plurality of pictures, from a coded video stream, and determines the presence or absence of a scene change from the differences of the motion vectors among the plurality of pictures.
[Patent Document 1] International Publication No. 2010/125757
Since the conventional image processing apparatus is configured as described above, it can determine the presence or absence of a scene change if a motion vector is included in the coded video stream as a coding parameter of a picture. However, when the coding mode of a picture is, for example, an intra-frame coding mode, a motion vector may not be included as a coding parameter of that picture. There has thus been a problem that the presence or absence of a scene change cannot be determined for a picture that does not include a motion vector as a coding parameter.
The present invention has been made to solve the above problems, and an object of the present invention is to provide an image processing apparatus and an image processing method capable of determining the presence or absence of a scene change even for a picture that does not include a motion vector.
An image processing apparatus according to the present invention includes: a picture selection unit that selects, from among a plurality of pictures, two pictures as pictures to be evaluated for the presence or absence of a scene change, which is a scene change point; a feature amount calculation unit that calculates, for each of the two pictures selected by the picture selection unit, the feature amounts in block units included in the picture; and an evaluation value calculation unit that calculates, from the block-unit feature amounts of the two pictures calculated by the feature amount calculation unit, an evaluation value used to determine the presence or absence of a scene change. A presence/absence determination unit determines the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit with a threshold.
According to the present invention, the evaluation value calculation unit that calculates the evaluation value used to determine the presence or absence of a scene change from the block-unit feature amounts of the two pictures calculated by the feature amount calculation unit is provided, and the presence/absence determination unit determines the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit with a threshold. Therefore, there is an effect that the presence or absence of a scene change can be determined even for a picture that does not include a motion vector.
FIG. 1 is a block diagram showing an image processing apparatus according to Embodiment 1 of the present invention.
FIG. 2 is a hardware configuration diagram showing the image processing apparatus according to Embodiment 1 of the present invention.
FIG. 3 is a hardware configuration diagram of a computer in the case where the image processing apparatus is realized by software or firmware.
FIG. 4 is a flowchart showing an image processing method, which is the processing procedure when the image processing apparatus is realized by software or firmware.
FIG. 5 is an explanatory drawing showing an example of selecting an evaluation target picture from among a plurality of pictures included in a coded video stream.
FIG. 6 is an explanatory drawing showing an example of selecting an evaluation target picture from among a plurality of pictures included in a coded video stream.
FIG. 7 is an explanatory drawing showing a configuration example of the coded video stream in the case where the coding method is AVC/H.264 or HEVC/H.265.
FIG. 8 is an explanatory drawing showing an example of the code amount in processing block units.
FIG. 9 is a block diagram showing another image processing apparatus according to Embodiment 1 of the present invention.
FIG. 10 is an explanatory drawing showing the processing blocks for which the code amount is specified among the processing blocks included in the evaluation target picture.
FIG. 11 is an explanatory drawing showing the processing blocks for which the code amount is specified among the processing blocks included in the evaluation target picture.
FIG. 12 is an explanatory drawing showing the processing blocks for which the code amount is specified among the processing blocks included in the evaluation target picture.
FIG. 13 is an explanatory drawing showing the hierarchy numbers of divided blocks in a coding block.
FIG. 14 is an explanatory drawing showing an example of a processing block and coding blocks.
FIG. 15 is an explanatory drawing showing an example of a processing block and coding blocks.
Hereinafter, in order to explain the present invention in more detail, embodiments for carrying out the present invention will be described with reference to the accompanying drawings.
Embodiment 1.
FIG. 1 is a block diagram showing an image processing apparatus according to Embodiment 1 of the present invention. FIG. 2 is a hardware configuration diagram showing the image processing apparatus according to Embodiment 1 of the present invention.
In FIG. 1, the picture selection unit 1 is realized by, for example, the picture selection circuit 11 shown in FIG. 2.
The picture selection unit 1 executes a process of acquiring a coded video stream including coded data of a plurality of pictures.
The picture selection unit 1 determines candidate pictures, for which the presence or absence of a scene change, which is a scene change point, is to be evaluated, from among a plurality of pictures having coded data encoded in coding block units (block units), and executes a process of selecting, from among the candidate pictures, two pictures as the target pictures for which the presence or absence of a scene change is evaluated. However, the method of determining the candidate pictures is assumed to be defined in advance.
The coded data of a picture is, for example, data coded in coding block units called macroblocks or in coding block units called CTUs (Coding Tree Units).
As a standard of a coding scheme that codes in macroblock units, for example, the AVC/H.264 standard is disclosed in Non-Patent Document 1 below.
As a standard of a coding scheme that codes in CTU units, for example, the HEVC/H.265 standard is disclosed in Non-Patent Document 2 below.
[Non-Patent Document 1] ISO/IEC 14496-10 / ITU-T H.264 standard
[Non-Patent Document 2] ISO/IEC 23008-2 / ITU-T H.265 standard
The feature amount calculation unit 2 is realized by, for example, the feature amount calculation circuit 12 shown in FIG. 2.
The feature amount calculation unit 2 performs, for each of the pictures selected by the picture selection unit 1, a process of calculating feature amounts in processing block units from the coded data in coding block units of the picture. A processing block is composed of one or more coding blocks, and its configuration is defined in advance. Since a processing block is composed of one or more coding blocks, the smallest definable unit of a processing block is a coding block (processing block = coding block).
FIG. 14 shows an example in which a processing block is composed of four coding blocks. As in the example shown in FIG. 15, the numbers of coding blocks in the vertical and horizontal directions constituting a processing block may also differ.
The feature amount in processing block units is calculated from the coded data of each coding block in the processing block. In the first embodiment, the coded data of each coding block is decoded, the code amount of each coding block is specified from the decoding result of the coded data, and the sum of the code amounts of the coding blocks belonging to each processing block is taken as the feature amount of that processing block.
Note that the processing block sizes of the two pictures selected by the picture selection unit 1 (the configurations of coding blocks in the predefined processing blocks) are the same.
The evaluation value calculation unit 3 is realized by, for example, the evaluation value calculation circuit 13 shown in FIG. 2.
The evaluation value calculation unit 3 executes a process of calculating an evaluation value used to determine the presence or absence of a scene change from the processing-block-unit feature amounts of the two pictures calculated by the feature amount calculation unit 2.
The presence/absence determination unit 4 is realized by, for example, the presence/absence determination circuit 14 shown in FIG. 2.
The presence/absence determination unit 4 executes a process of determining the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit 3 with a threshold.
In FIG. 1, each of the picture selection unit 1, the feature amount calculation unit 2, the evaluation value calculation unit 3, and the presence/absence determination unit 4, which are components of the image processing apparatus, is assumed to be realized by dedicated hardware as shown in FIG. 2, that is, by the picture selection circuit 11, the feature amount calculation circuit 12, the evaluation value calculation circuit 13, and the presence/absence determination circuit 14.
Here, the picture selection circuit 11, the feature amount calculation circuit 12, the evaluation value calculation circuit 13, and the presence/absence determination circuit 14 correspond to, for example, a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a combination thereof.
The components of the image processing apparatus are not limited to those realized by dedicated hardware; the image processing apparatus may be realized by software, firmware, or a combination of software and firmware.
The software or firmware is stored as a program in the memory of a computer. A computer here means hardware that executes a program, and corresponds to, for example, a CPU (Central Processing Unit), a central processing device, a processing device, an arithmetic device, a microprocessor, a microcomputer, a processor, a DSP (Digital Signal Processor), and the like.
The memory of the computer corresponds to, for example, a non-volatile or volatile semiconductor memory such as a RAM (Random Access Memory), a ROM (Read Only Memory), a flash memory, an EPROM (Erasable Programmable Read Only Memory), or an EEPROM (Electrically Erasable Programmable Read Only Memory), or to a magnetic disk, a flexible disk, an optical disk, a compact disk, a mini disk, a DVD (Digital Versatile Disc), or the like.
FIG. 3 is a hardware configuration diagram of the computer in the case where the image processing apparatus is realized by software, firmware, or the like.
When the image processing apparatus is realized by software, firmware, or the like, a program for causing the computer to execute the processing procedures of the picture selection unit 1, the feature amount calculation unit 2, the evaluation value calculation unit 3, and the presence/absence determination unit 4 is stored in the memory 21, and the processor 22 of the computer executes the program stored in the memory 21.
FIG. 4 is a flowchart showing the image processing method, that is, the processing procedure used when the image processing apparatus is realized by software, firmware, or the like.
FIG. 2 shows an example in which each component of the image processing apparatus is realized by dedicated hardware, and FIG. 3 shows an example in which the image processing apparatus is realized by software, firmware, or the like; however, some components of the image processing apparatus may be realized by dedicated hardware while the remaining components are realized by software, firmware, or the like.
Next, the operation will be described.
The picture selection unit 1 obtains an encoded video stream containing the encoded data of a plurality of pictures.
From the obtained encoded video stream, the picture selection unit 1 determines candidate pictures according to a predefined candidate picture determination method, and selects, from among the candidate pictures, two pictures as the pictures for which the presence or absence of a scene change, i.e., a change point between scenes, is to be evaluated (step ST1 in FIG. 4).
Of the two selected pictures, the later picture in reproduction order is referred to as the "evaluation target picture" and the earlier picture in reproduction order as the "reference picture". "Evaluating the presence or absence of a scene change between the two pictures" means evaluating whether a scene change occurs anywhere from the picture immediately following the reference picture in reproduction order up to and including the evaluation target picture.
FIGS. 5 and 6 are explanatory diagrams showing examples of determining, from among the pictures whose encoded data is contained in the encoded video stream, the candidate pictures that can become evaluation target pictures, and of selecting the evaluation target picture and the reference picture from among the candidate pictures.
FIG. 5 shows an example in which all of the pictures whose encoded data is contained in the encoded video stream are treated as candidate pictures and the evaluation target picture is selected from the candidate pictures in reproduction order, so that eventually every picture in the encoded video stream is selected as an evaluation target picture.
FIG. 6 shows an example in which the pictures whose encoded data is contained in the encoded video stream have a GOP (Group Of Pictures) structure, the intra picture at the random access point of each GOP is treated as a candidate picture, and the evaluation target picture is selected from the candidate pictures in reproduction order.
In AVC/H.264, an IDR (Instantaneous Decoding Refresh) picture is the intra picture at a random access point.
In HEVC/H.265, an IRAP (Intra Random Access Point) picture is the intra picture at a random access point.
In both FIG. 5 and FIG. 6, the reference picture is the candidate picture immediately preceding the evaluation target picture in reproduction order. Therefore, in the example of FIG. 5, the presence or absence of a scene change between two adjacent pictures, that is, whether the evaluation target picture itself is a scene change point, is evaluated. In the example of FIG. 6, the presence or absence of a scene change between the random access point intra pictures of adjacent GOPs is evaluated.
Candidate pictures may also be defined in other arbitrary ways, for example by selecting evaluation target pictures at intervals of N pictures, where N is an integer of 0 or more (when N = 0, this is the same as FIG. 5).
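To make this pairing of candidate pictures concrete, the following Python sketch, which is not part of the publication and uses hypothetical function and variable names, enumerates (reference picture, evaluation target picture) index pairs for a given candidate interval N:

```python
from typing import List, Tuple

def select_evaluation_pairs(num_pictures: int, n_interval: int = 0) -> List[Tuple[int, int]]:
    """Return (reference picture, evaluation target picture) index pairs in reproduction order.

    Candidate pictures are taken every (n_interval + 1) pictures; n_interval = 0 makes
    every picture a candidate, which corresponds to the FIG. 5 example.
    """
    candidates = list(range(0, num_pictures, n_interval + 1))
    # Each evaluation target is paired with the immediately preceding candidate picture.
    return [(candidates[i - 1], candidates[i]) for i in range(1, len(candidates))]

if __name__ == "__main__":
    # Ten pictures with N = 2: candidates are 0, 3, 6, 9.
    print(select_evaluation_pairs(10, 2))  # [(0, 3), (3, 6), (6, 9)]
```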
FIG. 7 is an explanatory diagram showing a configuration example of the encoded video stream when the encoding scheme is AVC/H.264 or HEVC/H.265.
In FIG. 7, a plurality of access units are arranged in coding order, and one access unit represents the data of one picture.
Each access unit is composed of a plurality of NAL (Network Abstraction Layer) units.
Each NAL unit is classified as either header information (non-VCL) or picture data (VCL), the latter being the encoded data of a picture.
The header information (non-VCL) includes an access unit delimiter, a sequence-level header, and a picture-level header.
In AVC/H.264 and HEVC/H.265, the one or more NAL units constituting one access unit are contiguous, and a method of identifying the start position of an access unit is defined so that the boundary between adjacent access units in the stream can be determined. As one example, the access unit delimiter, a NAL unit indicating the beginning of an access unit, is defined.
The NAL units shown in FIG. 7 are only an example, and the configuration pattern of NAL units conforms to the specification of the respective standard. An access unit may contain no access unit delimiter; even in that case, the start position of the access unit can be identified according to the identification method defined in the standard.
Therefore, by identifying the start position of each access unit according to the identification method defined in the standard, the picture selection unit 1 can identify the boundaries of the access units, i.e., of the pictures.
Since the picture selection unit 1 can identify the access unit boundaries, it can select two pictures from among the plurality of access units, and it outputs the encoded data of the two selected pictures to the feature amount calculation unit 2.
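As an illustration only, the following Python sketch shows one way such access unit boundaries could be located in an Annex-B byte stream by searching for access unit delimiter NAL units (nal_unit_type 9 in AVC/H.264, 35 in HEVC/H.265). It assumes a byte-stream format with start-code prefixes and that delimiters are actually present; as noted above, real streams may omit them, in which case the standard's general identification method must be used.

```python
def find_access_unit_delimiters(stream: bytes, hevc: bool = False) -> list:
    """Return byte offsets of access unit delimiter NAL units in an Annex-B byte stream."""
    offsets = []
    i = 0
    while True:
        # A 4-byte start code (00 00 00 01) ends with the same 3-byte prefix searched for here.
        i = stream.find(b"\x00\x00\x01", i)
        if i < 0 or i + 3 >= len(stream):
            break
        header = stream[i + 3]
        if hevc:
            is_aud = ((header >> 1) & 0x3F) == 35  # HEVC/H.265: AUD_NUT = 35
        else:
            is_aud = (header & 0x1F) == 9          # AVC/H.264: access unit delimiter = 9
        if is_aud:
            offsets.append(i)
        i += 3
    return offsets

# Example with two dummy AVC access unit delimiters.
print(find_access_unit_delimiters(b"\x00\x00\x00\x01\x09\x10" + b"\x00\x00\x01\x09\x30"))  # [1, 6]
```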
The feature amount calculation unit 2 calculates, for each picture selected by the picture selection unit 1, a feature amount for each processing block from the encoded data of the coding block units (step ST2 in FIG. 4).
The calculation of the per-processing-block feature amounts by the feature amount calculation unit 2 is described concretely below.
Here, the feature amount of a processing block contained in the picture P_t with picture number t selected by the picture selection unit 1 is denoted C_{n,t}; the picture number of the reference picture is t = tp and the picture number of the evaluation target picture is t = tc.
n is a block number identifying the processing block for which the feature amount is calculated within the picture, numbered 1, 2, ..., NUM_BL_t in raster scan order starting from the upper-left processing block. NUM_BL_t denotes the total number of processing blocks subject to feature amount calculation in the picture P_t. For example, if the processing block size is 64×64 pixels and the picture resolution is 3840×2160 pixels, then, when the whole picture is targeted, NUM_BL_t = 60×34 = 2040, where the blocks in the bottom row have a block size of 64×48 pixels.
The feature amounts C_{n,tp} and C_{n,tc} at t = tp and t = tc are then calculated as follows.
The feature amount calculation unit 2 performs syntax decoding on the feature-amount calculation target picture P_t and, from the result of the syntax decoding, specifies the code amount S_{n,t} (n = 1, 2, ..., NUM_BL_t) of each processing block.
For example, the feature amount calculation unit 2 decodes all of the coding parameters contained in the encoded data of each coding block belonging to the processing block for which the feature amount is being calculated, and specifies the sum of the code amounts of all the decoded coding parameters as the code amount S_{n,t}.
The coding parameters are the parameters required to generate the decoded image of a coding block, and include, for example, a parameter indicating the coding mode, intra prediction parameters, inter prediction parameters, prediction difference coding parameters, and motion vectors.
Next, the feature amount calculation unit 2 specifies the code amount S_{ALL,t} of the feature-amount calculation target picture P_t, normalizes the per-processing-block code amount S_{n,t} by dividing it by S_{ALL,t} as shown in equation (1) below, and multiplies the normalized code amount by M to obtain the per-processing-block feature amount C_{n,t}, where M is a preset constant.

C_{n,t} = M \times \frac{S_{n,t}}{S_{\mathrm{ALL},t}} \qquad (1)
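A minimal Python sketch of equation (1), assuming the per-block code amounts S_{n,t} are already available and using an arbitrarily chosen constant M (names are hypothetical):

```python
from typing import List, Sequence

def block_features(code_amounts: Sequence[float], m: float = 1000.0) -> List[float]:
    """Equation (1): C_n,t = M * S_n,t / S_ALL,t.

    code_amounts holds the per-processing-block code amounts S_n,t (e.g. in bytes);
    here the picture code amount S_ALL,t is taken as the sum over the processing blocks.
    """
    s_all = sum(code_amounts)
    if s_all == 0:
        return [0.0] * len(code_amounts)
    return [m * s / s_all for s in code_amounts]

# The relative distribution is preserved regardless of the picture's total code amount.
print(block_features([120, 60, 20, 0]))   # [600.0, 300.0, 100.0, 0.0]
print(block_features([240, 120, 40, 0]))  # [600.0, 300.0, 100.0, 0.0]
```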
Here, an example has been described in which the feature amount calculation unit 2 specifies the code amount S_{ALL,t} of the picture P_t; one conceivable way is to calculate, as S_{ALL,t}, the sum of the code amounts S_{n,t} of all the processing blocks contained in the picture P_t.
However, this is only an example; the size of the picture data (VCL) of the picture P_t may instead be used as the code amount S_{ALL,t}. The size of the picture data (VCL) can be determined without performing syntax decoding of the picture P_t.
The per-processing-block feature amount C_{n,t} calculated by equation (1) represents the ratio of the code amount of each processing block to the total, and is therefore a feature amount that suppresses the influence of variations in the total code amount from picture to picture.
FIG. 8 is an explanatory diagram showing an example of code amounts in processing block units.
FIG. 8 shows an example in which a processing block is composed of 2×2 coding blocks and each coding block is a macroblock or a CTU; the code amount of each coding block is indicated by a number. In addition, at the center of each processing block, the sum of the code amounts of the coding blocks belonging to that processing block is shown as the code amount of the processing block.
In the example of FIG. 8, the code amount of each coding block is expressed in bytes, but this is only an example and it may be expressed in bits.
The evaluation value calculation unit 3 calculates the evaluation value J used to determine the presence or absence of a scene change from the per-processing-block feature amounts C_{n,tp} of the reference picture P_tp and the per-processing-block feature amounts C_{n,tc} of the evaluation target picture P_tc calculated by the feature amount calculation unit 2 (step ST3 in FIG. 4).
For example, as the evaluation value J, the evaluation value calculation unit 3 calculates the sum of absolute differences of the per-processing-block feature amounts between the reference picture P_tp and the evaluation target picture P_tc, as shown in equation (2) below.

J = \sum_{n=1}^{\mathrm{MIN\_NUM\_BL}} \left| C_{n,tc} - C_{n,tp} \right| \qquad (2)

The larger the change in picture content between the reference picture P_tp and the evaluation target picture P_tc, the larger the difference in the distribution of per-processing-block code amounts between the two pictures, and hence the larger the evaluation value J becomes.
MIN_NUM_BL is the minimum of NUM_BL_tp and NUM_BL_tc. That is, when the numbers of processing blocks of the reference picture P_tp and the evaluation target picture P_tc differ, only the processing blocks for which feature amounts have been calculated in both pictures are used to calculate the evaluation value J.
For example, when the sizes of the encoded data of the reference picture P_tp and the evaluation target picture P_tc differ greatly, the decoding times also tend to differ greatly. In such a case, if the time allotted to decoding the encoded data of each picture and calculating the per-processing-block feature amounts is uniformly limited, the number of processing blocks that can be processed ends up differing between the two pictures. The evaluation value J is then obtained using only the processing blocks for which feature amounts have been calculated in both pictures. Note that by determining the processing time T that may be consumed per picture in this way, the feature amount calculation can be performed at a desired processing speed (frame rate 1/T).
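A minimal Python sketch of equation (2), restricted to the MIN_NUM_BL blocks for which both pictures have feature amounts (names are hypothetical):

```python
from typing import Sequence

def evaluation_value(feat_ref: Sequence[float], feat_eval: Sequence[float]) -> float:
    """Equation (2): sum of absolute differences of the per-block feature amounts,
    taken over the MIN_NUM_BL blocks available in both pictures."""
    min_num_bl = min(len(feat_ref), len(feat_eval))
    return sum(abs(feat_eval[n] - feat_ref[n]) for n in range(min_num_bl))

# Similar pictures give a small J; a redistribution of code amounts gives a large J.
print(evaluation_value([600.0, 300.0, 100.0], [590.0, 310.0, 100.0]))  # 20.0
print(evaluation_value([600.0, 300.0, 100.0], [100.0, 300.0, 600.0]))  # 1000.0
```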
The presence/absence determination unit 4 compares the evaluation value J calculated by the evaluation value calculation unit 3 with a threshold Th set in advance (step ST4 in FIG. 4).
If the evaluation value J is equal to or greater than the threshold Th (step ST4 in FIG. 4: YES), the presence/absence determination unit 4 determines that there is a scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST5 in FIG. 4).
If the evaluation value J is less than the threshold Th (step ST4 in FIG. 4: NO), the presence/absence determination unit 4 determines that there is no scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST6 in FIG. 4).
The presence/absence determination unit 4 outputs the result of the scene change determination.
The threshold Th may be a fixed value set in advance, or may be switched or varied according to specific conditions. Conceivable methods include, for example: preparing and switching a threshold Th for each type of content (drama, news, sports, etc.); preparing and switching a threshold Th for each broadcasting station; preparing and switching a threshold Th for each encoder that generates the stream; adaptively calculating the threshold Th according to the settings of the encoder that generates the stream (preparing a formula for Th with the encoder settings as variables); and adaptively calculating the threshold Th according to the evolution of the code amount of the stream (preparing a formula for Th with the temporal change of the code amount, i.e., its first- to n-th-order derivatives in the time direction, where n is an integer of 1 or more, as variables).
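Putting steps ST4 to ST6 together, a hedged Python sketch of the decision; the threshold value used in the example is arbitrary:

```python
from typing import Sequence

def has_scene_change(feat_ref: Sequence[float], feat_eval: Sequence[float], threshold: float) -> bool:
    """Steps ST4 to ST6: a scene change is reported when the evaluation value J >= Th."""
    min_num_bl = min(len(feat_ref), len(feat_eval))
    j = sum(abs(feat_eval[n] - feat_ref[n]) for n in range(min_num_bl))
    return j >= threshold

# With an arbitrarily chosen threshold of 500:
print(has_scene_change([600.0, 300.0, 100.0], [590.0, 310.0, 100.0], 500.0))  # False
print(has_scene_change([600.0, 300.0, 100.0], [100.0, 300.0, 600.0], 500.0))  # True
```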
If any candidate picture determined by the picture selection unit 1 has not yet been selected as the evaluation target picture (step ST7 in FIG. 4: YES), the processing returns to step ST1, and steps ST1 to ST7 are repeated.
In the repeated processing of step ST1, for whichever of the two pictures selected by the picture selection unit 1 already has calculated feature amounts, the feature amount calculation by the feature amount calculation unit 2 may be omitted. In this way, recalculation of feature amounts is avoided and the processing time can be shortened. For example, as described with reference to FIGS. 5 and 6, when the evaluation target picture is selected from the candidate pictures in reproduction order and the reference picture is the candidate picture immediately preceding the evaluation target picture in reproduction order, the feature amounts C_{n,tc} of the current evaluation target picture become the feature amounts C_{n,tp} of the reference picture when the next candidate picture in reproduction order is selected as the evaluation target picture. Therefore, by storing the calculated feature amounts C_{n,tc} successively and reading C_{n,tp} from the stored feature amounts, the feature amount calculation can be omitted.
When the decoding order and the reproduction order differ, the evaluation target picture may be selected in decoding order, and the presence or absence of a scene change may be evaluated using, as the reference picture, the candidate picture that precedes the evaluation target picture in decoding order and is closest to it in reproduction order. In this case as well, for a picture whose feature amounts have already been calculated, the stored calculation result may be reused and the calculation omitted.
If no candidate picture determined by the picture selection unit 1 remains unselected as the evaluation target picture (step ST7 in FIG. 4: NO), the series of processing ends.
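The following Python sketch, with hypothetical names, illustrates the ST1 to ST7 loop of FIG. 4 for the FIG. 5 case (every picture is a candidate), including the reuse of stored feature amounts so that each picture's features are computed only once:

```python
from typing import List, Sequence, Tuple

def detect_scene_changes(per_picture_code_amounts: Sequence[Sequence[float]],
                         threshold: float, m: float = 1000.0) -> List[Tuple[int, bool]]:
    """Sketch of the ST1-ST7 loop for the FIG. 5 case: every picture is a candidate and
    each picture's features are cached so they are computed only once."""
    results = []
    prev_features = None                      # features of the previous candidate (the reference)
    for t, code_amounts in enumerate(per_picture_code_amounts):
        s_all = sum(code_amounts)
        features = [m * s / s_all for s in code_amounts] if s_all else [0.0] * len(code_amounts)
        if prev_features is not None:
            num = min(len(prev_features), len(features))
            j = sum(abs(features[n] - prev_features[n]) for n in range(num))  # equation (2)
            results.append((t, j >= threshold))                               # steps ST4 to ST6
        prev_features = features              # reused as C_n,tp in the next iteration
    return results

print(detect_scene_changes([[100, 100], [110, 90], [10, 190]], threshold=150.0))
# [(1, False), (2, True)]
```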
As is apparent from the above, according to the first embodiment, the evaluation value calculation unit 3 is provided to calculate, from the per-block feature amounts of the two pictures calculated by the feature amount calculation unit 2, the evaluation value used to determine the presence or absence of a scene change, and the presence/absence determination unit 4 determines the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit 3 with a threshold. Therefore, the presence or absence of a scene change can be determined even for pictures that contain no motion vectors.
The first embodiment has described an example in which the encoded video stream is supplied to the picture selection unit 1 of the image processing apparatus, but a media transport stream carrying the encoded video stream may instead be supplied to the image processing apparatus.
In that case, as shown in FIG. 9, a demultiplexer 5 may be provided that extracts the encoded video stream from the media transport stream and outputs it to the picture selection unit 1.
FIG. 9 is a block diagram showing another image processing apparatus according to the first embodiment of the present invention.
As the media transport stream, for example, a stream packetized in a media transport format such as MPEG-2 TS (the Transport Stream of ISO/IEC 13818-1 / ITU-T H.222.0) or MMT (ISO/IEC 23008-1) is assumed.
From the header information of the supplied media transport stream, the demultiplexer 5 can determine the boundaries between access units and the type and size of each NAL unit.
If the demultiplexer 5 supplies information indicating the boundaries between access units to the picture selection unit 1, the picture selection unit 1 can select the desired pictures from the encoded video stream based on that information.
In addition, if the demultiplexer 5 supplies information indicating the type and size of each NAL unit to the feature amount calculation unit 2, the feature amount calculation unit 2 can specify the code amount of the evaluation target picture without performing syntax decoding of the evaluation target picture.
In the first embodiment, an example has been shown in which the feature amount calculation unit 2 targets the processing blocks of the whole picture when calculating the code amounts S_{n,t} of the processing blocks.
This is only an example; to reduce the processing load of the feature amount calculation unit 2, only some of the processing blocks in the picture may be targeted in the calculation of the code amounts S_{n,t}.
Specifically, as shown in FIG. 10, only every other processing block in both the row direction and the column direction may be targeted.
FIG. 10 is an explanatory diagram showing, among the processing blocks constituting a picture, the processing blocks whose code amounts are specified.
In FIG. 10, processing blocks marked "1" are processing blocks whose code amounts are specified, and processing blocks marked "0" are processing blocks whose code amounts are not specified.
Alternatively, as shown in FIG. 11, only the processing blocks of every other row, or of every other column, may be targeted.
FIG. 11 is an explanatory diagram showing, among the processing blocks constituting a picture, the processing blocks whose code amounts are specified.
In FIG. 11, processing blocks marked "1" are processing blocks whose code amounts are specified, and processing blocks marked "0" are processing blocks whose code amounts are not specified.
As shown in FIG. 12, the calculation of the code amounts S_{n,t} may also target only the processing blocks from the upper-left processing block of the picture up to a specific processing block in raster scan order, leaving the remaining processing blocks out.
FIG. 12 is an explanatory diagram showing, among the processing blocks constituting a picture, the processing blocks whose code amounts are specified.
In FIG. 12, processing blocks marked "1" are processing blocks whose code amounts are specified, and processing blocks marked "0" are processing blocks whose code amounts are not specified.
In FIGS. 10 to 12, the processing blocks marked "1" and the processing blocks marked "0" may be interchanged.
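The block-subsampling patterns of FIGS. 10 to 12 could be expressed as masks such as in the Python sketch below; since the figures themselves are not reproduced here, the exact patterns are assumptions (for instance, FIG. 10 is read here as a checkerboard, but the actual figure may show a different subsampling):

```python
from typing import List

def checkerboard_mask(rows: int, cols: int) -> List[List[int]]:
    """One reading of FIG. 10: every other processing block, alternating by row and column."""
    return [[1 if (r + c) % 2 == 0 else 0 for c in range(cols)] for r in range(rows)]

def alternate_row_mask(rows: int, cols: int) -> List[List[int]]:
    """One reading of FIG. 11: only the processing blocks of every other row."""
    return [[1 if r % 2 == 0 else 0 for _ in range(cols)] for r in range(rows)]

def raster_prefix_mask(rows: int, cols: int, num_blocks: int) -> List[List[int]]:
    """FIG. 12: only the first num_blocks processing blocks in raster scan order."""
    return [[1 if r * cols + c < num_blocks else 0 for c in range(cols)] for r in range(rows)]

print(raster_prefix_mask(3, 4, 6))  # [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
```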
Embodiment 2.
In the first embodiment, an example has been described in which the feature amount calculation unit 2 decodes the encoded data of the coding blocks, specifies the code amount of each processing block from the decoding result, and calculates the feature amount of the processing block from its code amount.
In the second embodiment, the feature amount calculation unit 2 may instead use, as the feature amount C_{n,t} of each processing block of the picture P_t with picture number t selected by the picture selection unit 1, the number of divided blocks contained in that processing block.
The number of divided blocks contained in a processing block is calculated by adding up the numbers of divided blocks contained in the coding blocks belonging to that processing block.
When the encoding scheme is AVC/H.264, for example, the number of divided blocks contained in a coding block corresponds to the number of blocks within a macroblock determined by the macroblock type (mb_type). The number of blocks within a macroblock is obtained by performing syntax decoding on the evaluation target picture.
When the encoding scheme is HEVC/H.265, it corresponds to the number of CUs (Coding Units) within the CTU, which is the coding block. The number of CUs within a CTU is obtained by performing syntax decoding on the evaluation target picture.
When the number of divided blocks contained in a processing block is used as its feature amount, the feature amount calculation unit 2 does not need to normalize code amounts when calculating the feature amounts, so the processing load of the feature amount calculation unit 2 is reduced.
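A minimal Python sketch of the second embodiment's feature amount, assuming the per-coding-block split counts (e.g., CUs per CTU) have already been obtained by syntax decoding; names are hypothetical:

```python
from typing import List, Sequence

def split_count_features(split_counts: Sequence[Sequence[int]]) -> List[int]:
    """Embodiment 2 sketch: split_counts[n] holds, for processing block n, the number of
    divided blocks of each coding block it contains (CUs per CTU in HEVC/H.265, or
    sub-blocks per macroblock as given by mb_type in AVC/H.264); the feature C_n,t is
    their sum, so no normalization by the picture code amount is needed."""
    return [sum(per_coding_block) for per_coding_block in split_counts]

# A processing block made of 2x2 CTUs split into 1, 4, 4 and 16 CUs -> feature 25.
print(split_count_features([[1, 4, 4, 16], [1, 1, 1, 1]]))  # [25, 4]
```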
Embodiment 3.
In the first embodiment, an example has been described in which the feature amount calculation unit 2 decodes the encoded data of the coding blocks, specifies the code amount of each processing block from the decoding result, and calculates the feature amount of the processing block from its code amount.
In the third embodiment, the feature amount calculation unit 2 may instead use, as the feature amount C_{n,t} of each processing block of the picture P_t with picture number t selected by the picture selection unit 1, the deepest division depth of the divided blocks in that processing block.
The deepest division depth of the divided blocks in a processing block is calculated as the maximum of the deepest division depths of the divided blocks in the coding blocks belonging to that processing block.
When the encoding scheme is HEVC/H.265, for example, the deepest division depth of the divided blocks in a coding block corresponds to the greatest division depth among the CUs contained in the CTU, which is the coding block, as shown in FIG. 13.
FIG. 13 is an explanatory diagram showing the division depths of the divided blocks in a coding block.
In the example of FIG. 13, the maximum CU depth is 3, so the deepest division depth of the divided blocks in the coding block is 3.
The deepest division depth of the divided blocks is obtained by performing syntax decoding on the evaluation target picture.
When the deepest division depth of the divided blocks in a processing block is used as its feature amount, the feature amount calculation unit 2 does not need to normalize code amounts when calculating the feature amounts, so the processing load of the feature amount calculation unit 2 is reduced.
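A minimal Python sketch of the third embodiment's feature amount, assuming the deepest CU depth of each coding block has already been obtained by syntax decoding; names are hypothetical:

```python
from typing import List, Sequence

def max_depth_features(deepest_depths: Sequence[Sequence[int]]) -> List[int]:
    """Embodiment 3 sketch: deepest_depths[n] holds, for processing block n, the deepest
    division depth observed in each coding block (CTU) it contains; the feature C_n,t is
    the maximum of these values, again with no normalization required."""
    return [max(per_coding_block) for per_coding_block in deepest_depths]

# A processing block whose CTUs have deepest CU depths 0, 2 and 3 -> feature 3.
print(max_depth_features([[0, 2, 3], [1, 1, 0]]))  # [3, 1]
```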
Embodiment 4.
In the above embodiments, an example has been described in which the evaluation value calculation unit 3 calculates the evaluation value J from the per-processing-block feature amounts C_{n,tp} of the reference picture P_tp and the per-processing-block feature amounts C_{n,tc} of the evaluation target picture P_tc calculated by the feature amount calculation unit 2.
In the fourth embodiment, an example is described in which the evaluation value calculation unit 3 calculates the evaluation value J from the per-processing-block feature amounts C_{n,tp} of the reference picture P_tp with picture number t = tp, the per-processing-block feature amounts C_{n,tc} of the evaluation target picture P_tc with picture number t = tc, and the per-processing-block feature amounts C_{n,tp'} of the secondary reference picture P_tp' with picture number t = tp', which is the candidate picture immediately preceding the reference picture P_tp in reproduction time.
Note that the secondary reference picture P_tp' is the reference picture of the reference picture P_tp, that is, the picture that served as the reference picture when the reference picture P_tp was itself the evaluation target picture. The positional relationship of the three pictures in reproduction time is therefore tp' < tp < tc, with tc - tp = tp - tp'.
Next, the operation will be described.
In the fourth embodiment, processing follows the flowchart of FIG. 4 as in the first embodiment, but the content of the processing is changed as described below.
As in the first embodiment, the picture selection unit 1 obtains an encoded video stream containing the encoded data of a plurality of pictures.
From among the pictures whose encoded data is contained in the encoded video stream, the picture selection unit 1 determines candidate pictures according to the predefined candidate picture determination method and, as in the first embodiment, selects the reference picture P_tp and the evaluation target picture P_tc from among the candidate pictures.
In addition, the picture selection unit 1 selects the secondary reference picture P_tp' from among the candidate pictures. That is, in step ST1 of FIG. 4, the secondary reference picture is selected in addition to the evaluation target picture and the reference picture.
As in the first embodiment, the feature amount calculation unit 2 calculates, for each picture selected by the picture selection unit 1, the per-processing-block feature amounts from the encoded data of the coding block units contained in that picture (step ST2 in FIG. 4).
That is, the feature amount calculation unit 2 calculates the per-processing-block feature amounts C_{n,tp} of the reference picture P_tp and the per-processing-block feature amounts C_{n,tc} of the evaluation target picture P_tc.
The feature amount calculation unit 2 also calculates the per-processing-block feature amounts C_{n,tp'} of the secondary reference picture P_tp'.
The evaluation value calculation unit 3 calculates the evaluation value J from the per-processing-block feature amounts C_{n,tp} of the reference picture P_tp, the per-processing-block feature amounts C_{n,tc} of the evaluation target picture P_tc, and the per-processing-block feature amounts C_{n,tp'} of the secondary reference picture P_tp' calculated by the feature amount calculation unit 2 (step ST3 in FIG. 4).
Specifically, the evaluation value calculation unit 3 calculates the sum of absolute differences of the per-processing-block feature amounts between the reference picture P_tp and the evaluation target picture P_tc.
The evaluation value calculation unit 3 also calculates the sum of absolute differences of the per-processing-block feature amounts between the reference picture P_tp and the secondary reference picture P_tp'.
As the evaluation value J, the evaluation value calculation unit 3 then calculates the difference between the two sums of absolute differences, as shown in equation (3) below.

J = \sum_{n=1}^{\mathrm{MIN\_NUM\_BL'}} \left| C_{n,tc} - C_{n,tp} \right| - \sum_{n=1}^{\mathrm{MIN\_NUM\_BL'}} \left| C_{n,tp} - C_{n,tp'} \right| \qquad (3)

Here, MIN_NUM_BL' is the minimum of the three values NUM_BL_tp', NUM_BL_tp, and NUM_BL_tc. That is, when the numbers of processing blocks of the secondary reference picture P_tp', the reference picture P_tp, and the evaluation target picture P_tc are not all the same, only the processing blocks for which feature amounts have been calculated in all three pictures are used to calculate the evaluation value J.
The evaluation value J calculated by the evaluation value calculation unit 3 becomes large when the picture content changes little between the secondary reference picture P_tp' and the reference picture P_tp, so that the change in the per-processing-block feature amounts is small, and the picture content changes greatly between the reference picture P_tp and the evaluation target picture P_tc, so that the change in the per-processing-block feature amounts is large.
Thus, whereas the first to third embodiments evaluate the magnitude of the change in picture content between the reference picture P_tp and the evaluation target picture P_tc, the fourth embodiment evaluates the magnitude of the difference between the degree of change in picture content at the reference picture P_tp and the degree of change at the evaluation target picture P_tc. This reduces the possibility that a gradual change in picture content between pictures that is not a scene change is erroneously detected as a scene change in which the content of the whole screen changes greatly, and therefore improves the scene change detection accuracy compared with the first to third embodiments.
The processing blocks for which the per-processing-block feature amounts are calculated can be defined arbitrarily, as in the first embodiment (all processing blocks in the picture, some of the processing blocks as in FIGS. 10 to 12, and so on).
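A minimal Python sketch of equation (3), again restricted to the blocks for which all three pictures have feature amounts (names are hypothetical):

```python
from typing import Sequence

def evaluation_value_3pic(feat_tp2: Sequence[float], feat_tp: Sequence[float],
                          feat_tc: Sequence[float]) -> float:
    """Equation (3): SAD(evaluation target, reference) minus SAD(reference, secondary
    reference), over the MIN_NUM_BL' blocks available in all three pictures."""
    min_num_bl = min(len(feat_tp2), len(feat_tp), len(feat_tc))
    sad_current = sum(abs(feat_tc[n] - feat_tp[n]) for n in range(min_num_bl))
    sad_previous = sum(abs(feat_tp[n] - feat_tp2[n]) for n in range(min_num_bl))
    return sad_current - sad_previous

# A gradual drift (left) scores much lower than an abrupt change after a stable stretch (right).
print(evaluation_value_3pic([500, 500], [550, 450], [600, 400]))  # 0
print(evaluation_value_3pic([550, 450], [550, 450], [100, 900]))  # 900
```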
As in the first embodiment, the presence/absence determination unit 4 compares the evaluation value J calculated by the evaluation value calculation unit 3 with the threshold Th set in advance (step ST4 in FIG. 4).
As in the first embodiment, if the evaluation value J is equal to or greater than the threshold Th, the presence/absence determination unit 4 determines that there is a scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST5 in FIG. 4).
As in the first embodiment, if the evaluation value J is less than the threshold Th, the presence/absence determination unit 4 determines that there is no scene change between the reference picture P_tp and the evaluation target picture P_tc (step ST6 in FIG. 4).
The presence/absence determination unit 4 outputs the result of the scene change determination.
The method of setting the threshold Th is the same as in the first embodiment.
If any candidate picture determined by the picture selection unit 1 has not yet been selected as the evaluation target picture (step ST7 in FIG. 4: YES), the processing returns to step ST1, and steps ST1 to ST7 are repeated.
In the repeated processing of step ST1, for whichever of the three pictures selected by the picture selection unit 1 already has calculated feature amounts, the feature amount calculation by the feature amount calculation unit 2 may be omitted. In this way, recalculation of feature amounts is avoided and the processing time can be shortened.
If no candidate picture determined by the picture selection unit 1 remains unselected as the evaluation target picture (step ST7 in FIG. 4: NO), the series of processing ends.
Within the scope of the invention, the embodiments may be freely combined, any component of any embodiment may be modified, and any component may be omitted from any embodiment.
The present invention is suitable for an image processing apparatus and an image processing method that determine the presence or absence of a scene change, which is a change point between scenes.
Reference Signs List: 1 picture selection unit, 2 feature amount calculation unit, 3 evaluation value calculation unit, 4 presence/absence determination unit, 5 demultiplexer, 11 picture selection circuit, 12 feature amount calculation circuit, 13 evaluation value calculation circuit, 14 presence/absence determination circuit, 21 memory, 22 processor.

Claims (7)

1. An image processing apparatus comprising:
    a picture selection unit for selecting, from among a plurality of pictures, two pictures as pictures for which the presence or absence of a scene change, which is a change point between scenes, is to be evaluated;
    a feature amount calculation unit for calculating, for each of the two pictures selected by the picture selection unit, feature amounts of the blocks included in the picture;
    an evaluation value calculation unit for calculating, from the per-block feature amounts of the two pictures calculated by the feature amount calculation unit, an evaluation value used to determine the presence or absence of a scene change; and
    a presence/absence determination unit for determining the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit with a threshold.
2. The image processing apparatus according to claim 1, wherein the feature amount calculation unit specifies, for each block of the pictures selected by the picture selection unit, the code amount of the block, and calculates the feature amount of the block from the code amount of the block.
3. The image processing apparatus according to claim 1, wherein the feature amount calculation unit specifies, for each block of the pictures selected by the picture selection unit, the number of divided blocks included in the block as the feature amount of the block.
4. The image processing apparatus according to claim 1, wherein the feature amount calculation unit specifies, for each block of the pictures selected by the picture selection unit, the deepest division depth of the divided blocks included in the block as the feature amount of the block.
5. The image processing apparatus according to claim 1, wherein the evaluation value calculation unit calculates, as the evaluation value used to determine the presence or absence of a scene change, the sum of absolute differences of the per-block feature amounts between the two pictures calculated by the feature amount calculation unit.
6. The image processing apparatus according to claim 1, wherein
    the picture selection unit selects three pictures as the pictures for which the presence or absence of a scene change is to be evaluated,
    the feature amount calculation unit calculates, for each of the three pictures selected by the picture selection unit, the feature amounts of the blocks included in the picture,
    the evaluation value calculation unit calculates the sum of absolute differences of the per-block feature amounts between the second and third pictures in reproduction order among the three pictures whose per-block feature amounts have been calculated by the feature amount calculation unit, calculates the sum of absolute differences of the per-block feature amounts between the first and second pictures in reproduction order, and calculates, as the evaluation value used to determine the presence or absence of a scene change, the difference between the two calculated sums of absolute differences, and
    the presence/absence determination unit determines the presence or absence of a scene change between the second picture and the third picture in reproduction order by comparing the evaluation value calculated by the evaluation value calculation unit with a threshold.
7. An image processing method comprising:
    selecting, by a picture selection unit, two pictures from among a plurality of pictures as pictures for which the presence or absence of a scene change is to be evaluated;
    calculating, by a feature amount calculation unit, for each of the two pictures selected by the picture selection unit, feature amounts of the blocks included in the picture;
    calculating, by an evaluation value calculation unit, from the per-block feature amounts of the two pictures calculated by the feature amount calculation unit, an evaluation value used to determine the presence or absence of a scene change; and
    determining, by a presence/absence determination unit, the presence or absence of a scene change between the two pictures by comparing the evaluation value calculated by the evaluation value calculation unit with a threshold.
PCT/JP2017/038339 2017-10-24 2017-10-24 Image processing device, and image processing method WO2019082268A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/JP2017/038339 WO2019082268A1 (en) 2017-10-24 2017-10-24 Image processing device, and image processing method
JP2018550470A JP6782789B2 (en) 2017-10-24 2018-03-30 Image processing device and image processing method
PCT/JP2018/013789 WO2019082413A1 (en) 2017-10-24 2018-03-30 Image processing device, and image processing method
TW107127425A TWI673652B (en) 2017-10-24 2018-08-07 Image processing device and image processing method

Publications (1)

Publication Number Publication Date
WO2019082268A1 true WO2019082268A1 (en) 2019-05-02


Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/JP2017/038339 WO2019082268A1 (en) 2017-10-24 2017-10-24 Image processing device, and image processing method
PCT/JP2018/013789 WO2019082413A1 (en) 2017-10-24 2018-03-30 Image processing device, and image processing method

Country Status (3)

Country Link
JP (1) JP6782789B2 (en)
TW (1) TWI673652B (en)
WO (2) WO2019082268A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0636026A (en) * 1992-07-13 1994-02-10 Toshiba Corp Moving picture processor
JPH06133305A (en) * 1992-09-07 1994-05-13 Fujitsu Ltd Picture data coding method, picture data coder using the method, picture data decoding method, picture data decoder using the method, scene change detection method, scene change detection method, scene change detector, scene change recorder, and picture data scene change recording and reproducing device
JPH07152779A (en) * 1993-11-29 1995-06-16 Toshiba Corp Processing method for detecting moving picture index and moving picture processor having moving picture index detection processing function
JPH10294923A (en) * 1997-04-18 1998-11-04 Matsushita Electric Ind Co Ltd Scene change detection method and scene change detector
JP2001251630A (en) * 2000-03-06 2001-09-14 Nec Corp Image signal coder and its method
WO2010125757A1 (en) * 2009-04-28 2010-11-04 三菱電機株式会社 Video/audio reproduction device, video/audio recording/ reproduction device, video/audio reproduction method, and video/audio recording/reproduction method
JP2017011587A (en) * 2015-06-24 2017-01-12 日本放送協会 Scene change detector, scene change detection method and program

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0636026A (en) * 1992-07-13 1994-02-10 Toshiba Corp Moving picture processor
JPH06133305 (en) * 1992-09-07 1994-05-13 Fujitsu Ltd Picture data coding method, picture data coder using the method, picture data decoding method, picture data decoder using the method, scene change detection method, scene change detector, scene change recorder, and picture data scene change recording and reproducing device
JPH07152779A (en) * 1993-11-29 1995-06-16 Toshiba Corp Processing method for detecting moving picture index and moving picture processor having moving picture index detection processing function
JPH10294923A (en) * 1997-04-18 1998-11-04 Matsushita Electric Ind Co Ltd Scene change detection method and scene change detector
JP2001251630A (en) * 2000-03-06 2001-09-14 Nec Corp Image signal coder and its method
WO2010125757A1 (en) * 2009-04-28 2010-11-04 三菱電機株式会社 Video/audio reproduction device, video/audio recording/reproduction device, video/audio reproduction method, and video/audio recording/reproduction method
JP2017011587A (en) * 2015-06-24 2017-01-12 日本放送協会 Scene change detector, scene change detection method and program

Also Published As

Publication number Publication date
TWI673652B (en) 2019-10-01
TW201917631A (en) 2019-05-01
JP6782789B2 (en) 2020-11-11
JPWO2019082413A1 (en) 2019-11-21
WO2019082413A1 (en) 2019-05-02

Similar Documents

Publication Publication Date Title
US10142632B2 (en) Moving image coding apparatus and moving image decoding apparatus
EP3389276B1 (en) Hash-based encoder decisions for video coding
CN107529063B (en) Encoding method, encoding device, decoding method, and decoding device
US9800896B2 (en) Method for depth lookup table signaling
JP6005157B2 (en) Depth map encoding and decoding
US9167260B2 (en) Apparatus and method for video processing
JP2018529255A (en) Image prediction method and apparatus
US10178405B2 (en) Enhanced coding and decoding using intra block copy mode
US20240107008A1 (en) Video coding and decoding
TW201720154A (en) Chroma quantization in video coding
KR20170021337A (en) Encoder decisions based on results of hash-based block matching
US9681126B2 (en) Motion vector coding apparatus, method and program for coding motion vector, motion vector decoding apparatus, and method and program for decoding motion vector
EP2960855B1 (en) Method and device for determining a set of modifiable elements in a group of pictures
WO2010052837A1 (en) Image decoding device, image decoding method, integrated circuit, and program
JP2007124408A (en) Motion vector detector and motion vector detecting method
CN113014926B (en) Video transcoding method and device, electronic equipment and storage medium
KR101552664B1 (en) Method and device for encoding images, method and device for decoding images, and programs therefor
JP2013229674A (en) Image coding device, image decoding device, image coding method, image decoding method, image coding program, and image decoding program
CN114402608A (en) Coordinated early termination of BDOF and DMVR in video coding and decoding
US9762916B2 (en) Encoding apparatus and method
US20080112631A1 (en) Method of obtaining a motion vector in block-based motion estimation
JP5178616B2 (en) Scene change detection device and video recording device
US11849138B2 (en) Video coding and decoding
WO2019082268A1 (en) Image processing device, and image processing method
JP2009260494A (en) Image coding apparatus and its control method

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17930075

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 17930075

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP