CN1717058A

CN1717058A - Method and apparatus for coding images and method and apparatus for decoding the images

Info

Publication number: CN1717058A
Application number: CNA2005100814941A
Authority: CN
Inventors: 渡边刚; 冈田茂之
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2004-06-29
Filing date: 2005-06-29
Publication date: 2006-01-04
Anticipated expiration: 2025-06-29
Also published as: CN100442854C; JP2006014121A

Abstract

A region of interest is set within an image, the region of interest is tracked along motion of an object marked out within the image, and coding is performed in a manner that image quality differs between the region of interest and a region other than the region of interest. A wavelet transform unit applies a low-pass filter and a high-pass filter in the respective x and y directions of an original image, and divides the image into four frequency sub-bands so as to carry out a wavelet transform. A quantization unit quantizes, with a predetermined quantizing width, the wavelet transform coefficients outputted from the wavelet transform unit. A motion detector detects the motion of an object. A ROI setting unit moves a ROI region according to this motion of an object. In the case of moving images where a viewpoint changes, the background may be separated from the object and then the ROI region may be moved according to the motion of the object and the motion of the background.

Description

Image coding method and apparatus, and image decoding method and apparatus

Technical field

The present invention relates to an image coding method, an image coding apparatus, and an image pickup apparatus. In particular, it relates to a coding apparatus, an image display method, and an image pickup apparatus that can code an image with image quality that differs from region to region, and to an image decoding method and apparatus that can adjust image quality while decoding.

Background art

At ISO/ITU-T, JPEG2000, which uses the discrete wavelet transform (DWT), has been standardized as the successor to JPEG (Joint Photographic Experts Group), the standard technique for compression coding of still images. JPEG2000 can code, with high performance, a wide range of image quality from low bit rates up to lossless compression, and it also makes it easy to implement a scalability function that improves image quality progressively. In addition, JPEG2000 offers a variety of functions that the existing JPEG standard lacks.

One of the standardized functions of JPEG2000 is region-of-interest (ROI) coding, in which a region of interest of an image is coded and transmitted with priority over other regions. With ROI coding, when the coding rate has an upper limit, the reproduced image quality of the region of interest can be given priority and kept high. In addition, when the coded stream is decoded sequentially, the region of interest can be reproduced at high quality at an early stage.

Patent Document 1 discloses a technique for automatically recognizing a plurality of ROI regions in image data.

As described in paragraph 0079 of Patent Document 1, in the moving-image shooting mode the captured images are compared and ROI regions are set automatically. However, when a plurality of moving bodies are recognized in an image, Patent Document 1 may set ROI regions even on moving bodies that the photographer does not intend. Although the document states that a preferred ROI region can be selected from the plurality of ROI regions, for a moving image it is cumbersome to make that selection in every frame, and the selection cannot be made while the moving image is being shot. Moreover, performing ROI recognition in every frame increases the amount of computation and the signal processing load.

Patent Document 2 discloses a technique that, when decoding a compression-coded image, applies image processing such as noise removal or edge enhancement in order to improve image quality. Specifically, the transform coefficients contained in the sub-bands other than the LL sub-band are set to 0 to form a reference image. For a transform coefficient in such a sub-band, the corresponding region on the reference image is determined, and the mean of the pixel values in that region is computed. If this mean or a similar value is less than a prescribed threshold, threshold processing is applied to that transform coefficient.

In Patent Document 2, because the above processing is applied to the transform coefficients in the sub-bands other than the LL sub-band, the amount of computation increases greatly. In addition, it is difficult to create differences in image quality within an image to the extent of making a particular object stand out.

[Patent Document 1] Japanese Unexamined Patent Publication No. 2004-72655

[Patent Document 2] Japanese Unexamined Patent Publication No. 2002-135593

Summary of the invention

The present invention has been made in view of the above problems. One object, that of the first invention, is to provide an image coding method, an image coding apparatus, and an image pickup apparatus that can reduce the code amount of a moving image while maintaining the image quality of the object the user is interested in at the level the user desires.

Another object, that of the second invention, is to provide an image decoding method, an image decoding apparatus, and an image pickup apparatus that can easily make an object of interest stand out.

To solve the above problems, an image coding method according to one aspect of the first invention (execution mode 1) sets a region of interest in an image, makes the region of interest follow the motion of an object of interest in the image, and codes the region of interest and the remaining region with different image qualities. The initial region of interest may be set by a user operation.

According to this aspect, the image quality of the region of interest can be maintained at the level the user desires while the image quality of the region that is not of interest is lowered, so the code amount can be reduced. The image quality of the region of interest can also be lowered deliberately.

An image coding apparatus according to another aspect of the first invention has: a region-of-interest setting unit that sets a region of interest in an image; a motion detection unit that detects the motion of an object of interest in the image; and a coding unit that codes the region of interest and the remaining region with different image qualities. The region-of-interest setting unit makes the region of interest follow the motion of the object. The "motion of the object" may be detected as a motion vector.

According to this aspect, the image quality of the region of interest can be maintained at the level the user desires while the image quality of the region that is not of interest is lowered, so the code amount can be reduced. The image quality of the region of interest can also be lowered deliberately. Furthermore, the object can be tracked automatically without recognizing the region of interest, or having the user set it, in every frame.

The apparatus may further have an image quality setting unit that sets the image quality of the region outside the region of interest according to the allocated code amount. The "allocated code amount" may be the code amount allocated to each frame or the code amount allocated to the moving image as a whole. The "image quality setting unit" may adjust the image quality dynamically during coding. Even under the constraint of an allocated code amount, the image quality of the region of interest can be maintained at the level the user desires by adjusting the code amount of the non-interest region.

The apparatus may further have an object extraction unit that separates the object from the background in a moving image whose viewpoint changes, and the region-of-interest setting unit may make the region of interest follow the motion of the object while taking the motion of the background into account. Cancelling out the motion of the background in this way gives accuracy equal to that obtained with a fixed viewpoint.

Another aspect of the first invention is an image pickup apparatus. This apparatus has an image pickup unit that acquires an image, sets a region of interest in the image, makes the region of interest follow the motion of an object of interest in the image, and codes the region of interest and the remaining region with different image qualities.

According to this aspect, the image quality of the region of interest can be maintained at the level the user desires while the image quality of the region that is not of interest is lowered, so the code amount can be reduced. The image quality of the region of interest can also be lowered deliberately.

Yet another aspect of the first invention is also an image pickup apparatus. This apparatus has: an image pickup unit that acquires an image; a region-of-interest setting unit that sets a region of interest in the image; a motion detection unit that detects the motion of an object of interest in the image; and a coding unit that codes the region of interest and the remaining region with different image qualities. The region-of-interest setting unit makes the region of interest follow the motion of the object. The initial region of interest may be set by a user operation.

According to this aspect, the image quality of the region of interest can be maintained at the level the user desires while the image quality of the region that is not of interest is lowered, so the code amount can be reduced. The image quality of the region of interest can also be lowered deliberately. Furthermore, an image pickup apparatus is obtained that can track the object automatically without recognizing the region of interest, or having the user set it, in every frame.

Any combination of the above constituent elements, and any conversion of the expression of the present invention between a method, an apparatus, a system, a computer program, a recording medium, and the like, are also effective as aspects of the present invention.

An image decoding method according to the second invention (execution mode 2) sets a region of interest in an image, makes the region of interest follow the motion of an object of interest in the image, and decodes a moving image with different image qualities in the region of interest and in the remaining region. According to this aspect, the object of interest can easily be made to stand out.

Another aspect of the second invention is an image decoding apparatus. This apparatus has: a region-of-interest setting unit that sets a region of interest in an image; a motion detection unit that detects the motion of an object of interest in the image; and a decoding unit that decodes a moving image with different image qualities in the region of interest and in the remaining region. The region-of-interest setting unit makes the region of interest follow the motion of the object. The initial "region of interest" may be set by a user operation. According to this aspect, the object of interest can easily be made to stand out. In addition, the amount of computation during decoding can be reduced.

The apparatus may further have an image quality setting unit that sets the image quality of at least one of the region of interest and the region outside it with reference to the state of the apparatus. The "state of the apparatus" may include the remaining battery capacity or the playback speed of the apparatus. According to this aspect, the image can be decoded in a manner suited to the state of the apparatus.

Yet another aspect of the second invention is an image pickup apparatus. This apparatus has an image pickup unit that acquires an image. It sets a region of interest in the image, makes the region of interest follow the motion of an object of interest in the image, and displays a moving image with different image qualities in the region of interest and in the remaining region. According to this aspect, the object of interest in the captured image can easily be made to stand out.

Still another aspect of the second invention is also an image pickup apparatus. This apparatus has: an image pickup unit that acquires an image; a region-of-interest setting unit that sets a region of interest in the image; a motion detection unit that detects the motion of an object of interest in the image; a coding unit that codes a moving image with different image qualities in the region of interest and in the remaining region; and a decoding unit that decodes the image data coded by the coding unit. The region-of-interest setting unit makes the region of interest follow the motion of the object.

According to this aspect, the object of interest in the captured image can easily be made to stand out. In addition, the code amount of the coded image can be reduced.

Any combination of the above constituent elements, and any conversion of the expression of execution mode 2 between a method, an apparatus, a system, a computer program, a recording medium, and the like, are also effective as aspects of the present invention.

Description of drawings

Fig. 1 shows the configuration of the image coding apparatus according to embodiment 1 of execution mode 1.

Fig. 2 (a) shows wavelet transform coefficients, (b) shows a state in which only the ROI coefficients have been scaled up by 5 bits, and (c) shows a state in which the quantized values of the scaled-up wavelet transform coefficients are scanned in order from the high-order bit plane.

Fig. 3 (a) shows a state in which a region of interest has been selected on the original image, (b) shows the first-level transformed image obtained by applying the wavelet transform to the original image once, and (c) shows the second-level transformed image obtained by further applying the wavelet transform to sub-band LL1 of the transformed image of (b).

Fig. 4 (a) shows wavelet transform coefficients, (b) shows a state in which the S bits on the LSB side of the non-ROI transform coefficients have been replaced with zero, and (c) shows a state in which the wavelet transform coefficients, comprising the ROI transform coefficients and the zero-substituted non-ROI transform coefficients, are scanned in order from the high-order bit plane.

Fig. 5 (a) shows wavelet transform coefficients of 5 bit planes consisting only of non-ROI transform coefficients, (b) shows wavelet transform coefficients in which the lower 2 bit planes on the LSB side have been replaced with zero, and (c) shows a state in which the upper 3 bit planes of the zero-substituted wavelet transform coefficients are entropy-coded from the top.

Fig. 6 shows the configuration of the image coding apparatus according to embodiment 2 of execution mode 1.

Fig. 7 shows the configuration of the image coding apparatus according to embodiment 3 of execution mode 1.

Fig. 8 (a) shows the previous frame, (b) shows the current frame, and (c) shows the difference image.

Fig. 9 shows the configuration of the image pickup apparatus according to embodiment 4 of execution mode 1.

Fig. 10 (a) shows a state in which the object the user is interested in is specified in the image, (b) shows a state in which the ROI region is set in the image, (c) shows a state in which the object has moved out of the ROI region, and (d) shows a state in which the ROI region follows the motion of the object.

Fig. 11 (a) shows a state in which the user sets the ROI region in the image, (b) shows a state in which the object the user is interested in is specified within the ROI region, and (c) shows a state in which the ROI region follows the motion of the object.

Fig. 12 (a) shows a state in which the range within which the ROI region tracks is set, (b) shows a state in which the ROI region is set, and (c) shows a state in which the object has moved outside the large frame.

Fig. 13 shows the configuration of the image decoding apparatus according to embodiment 1 of execution mode 2.

Fig. 14 (a) shows a state in which a region of interest has been selected on the original image, (b) shows the first-level transformed image obtained by applying the wavelet transform to the original image once, and (c) shows the second-level transformed image obtained by further applying the wavelet transform to sub-band LL1 of the transformed image of (b).

Fig. 15 (a) shows wavelet transform coefficients of the decoded image, (b) shows ROI transform coefficients and non-ROI transform coefficients, and (c) shows a state in which the lower 2 bits of the non-ROI transform coefficients have been replaced with zero.

Fig. 16 shows the configuration of the image decoding apparatus according to embodiment 2 of execution mode 2.

Fig. 17 shows the configuration of the image pickup apparatus according to embodiment 3 of execution mode 2.

Fig. 18 shows the configuration of the coding block according to embodiment 3 of execution mode 2.

Fig. 19 (a) shows wavelet transform coefficients, (b) shows a state in which only the ROI coefficients have been scaled up by 5 bits, and (c) shows a state in which the quantized values of the scaled-up wavelet transform coefficients are scanned in order from the high-order bit plane.

Fig. 20 (a) shows wavelet transform coefficients, (b) shows a state in which the S bits on the LSB side of the non-ROI transform coefficients have been replaced with zero, and (c) shows a state in which the wavelet transform coefficients, comprising the ROI transform coefficients and the zero-substituted non-ROI transform coefficients, are scanned in order from the high-order bit plane.

Fig. 21 (a) shows a state in which the object the user is interested in is specified in the image, (b) shows a state in which the ROI region is set in the image, (c) shows a state in which the object has moved out of the ROI region, and (d) shows a state in which the ROI region follows the motion of the object.

Fig. 22 (a) shows a state in which the user sets the ROI region in the image, (b) shows a state in which the object the user is interested in is specified within the ROI region, and (c) shows a state in which the ROI region follows the motion of the object.

Fig. 23 (a) shows a state in which the range within which the ROI region tracks is set, (b) shows a state in which the ROI region is set, and (c) shows a state in which the object has moved outside the large frame.

Embodiment

First, execution mode 1 is described.

(embodiment 1 of execution mode 1)

Fig. 1 is a block diagram of the image coding apparatus 100 according to embodiment 1 of execution mode 1. In hardware terms, the configuration of the image coding apparatus 100 can be realized by the CPU, memory, and other LSIs of any computer. In software terms, it can be realized by a program with coding functions loaded into memory. What is depicted here, however, are functional blocks realized by the cooperation of these. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms: by hardware alone, by software alone, or by a combination of the two.

As an example, the image coding apparatus 100 compression-codes the input original image using the JPEG2000 scheme. The original image input to the image coding apparatus 100 is a frame of a moving image. The image coding apparatus 100 can code each frame of the moving image in succession with the JPEG2000 scheme and thereby generate a coded stream of the moving image.

The wavelet transform unit 10 divides the input original image into sub-bands, calculates the wavelet transform coefficients of each sub-band image, and generates hierarchized wavelet transform coefficients. Specifically, the wavelet transform unit 10 applies a low-pass filter and a high-pass filter in each of the x and y directions of the original image, dividing it into four frequency sub-bands to perform the wavelet transform. These sub-bands are: the LL sub-band, which has low-frequency components in both the x and y directions; the HL and LH sub-bands, which have low-frequency components in one of the x and y directions and high-frequency components in the other; and the HH sub-band, which has high-frequency components in both the x and y directions. The number of pixels of each sub-band in the vertical and horizontal directions is 1/2 that of the image before processing, so one pass of filtering yields sub-band images whose resolution, that is, image size, is 1/4 of the original.
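The four-way split described above can be sketched as follows. This is only an illustration: the averaging Haar filter pair, the function names, and the convention that HL means high-pass in x and low-pass in y are assumptions made here, not the filters or naming fixed by the text (JPEG2000 itself uses longer filters).

```python
def haar_1d(row):
    """Split an even-length sequence into low-pass and high-pass halves (Haar, assumed)."""
    low = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    high = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def filter_x(image):
    """Filter every row; return (low, high) images of half the width."""
    pairs = [haar_1d(row) for row in image]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def filter_y(image):
    """Filter every column; return (low, high) images of half the height."""
    low_t, high_t = filter_x(transpose(image))
    return transpose(low_t), transpose(high_t)


def dwt2_level(image):
    """One decomposition level: four sub-bands, each half as wide and half as tall."""
    lo_x, hi_x = filter_x(image)
    ll, lh = filter_y(lo_x)  # low in x, then low / high in y
    hl, hh = filter_y(hi_x)  # high in x, then low / high in y
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}


bands = dwt2_level([[8, 8, 8, 8] for _ in range(4)])
print(bands["LL"])  # a flat image keeps all its energy in LL
```

Each returned sub-band has half the rows and half the columns of the input, which matches the 1/4 image size stated above.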

The wavelet transform unit 10 filters the LL sub-band obtained in this way once again, further dividing it into the four sub-bands LL, HL, LH, and HH to perform the wavelet transform. The wavelet transform unit 10 repeats this filtering a prescribed number of times, hierarchizes the original image into sub-band images, and outputs the wavelet transform coefficients of each sub-band. The quantization unit 12 quantizes the wavelet transform coefficients output from the wavelet transform unit 10 with a prescribed quantization width.
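As a rough sketch of the repeated splitting and the quantization step, the snippet below lists the sub-bands that remain after a given number of levels and quantizes a coefficient with a fixed width. The function names are hypothetical, and the sign-magnitude rounding toward zero is an assumption; the text only says that a prescribed quantization width is used.

```python
import math


def subband_sizes(width, height, levels):
    """List (name, width, height) of every sub-band left after `levels` splits of LL."""
    out = []
    for level in range(1, levels + 1):
        width, height = width // 2, height // 2
        out += [(f"{band}{level}", width, height) for band in ("HL", "LH", "HH")]
    out.append((f"LL{levels}", width, height))
    return out


def quantize(coeff, step):
    """Quantize one coefficient with width `step`, rounding the magnitude toward zero (assumed)."""
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step)


for name, w, h in subband_sizes(640, 480, 2):
    print(name, w, h)
print(quantize(-7.9, 2), quantize(7.9, 2), quantize(1.5, 2))
```

Two levels leave seven sub-bands in total: three from the first level and four from the second.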

The motion detection unit 18 detects the position of the specified object and outputs it to the ROI setting unit 20. The object may be specified by the user, or may be recognized automatically by the motion detection unit 18 within an ROI region specified by the user. It may also be recognized automatically from the image as a whole. A plurality of objects may be specified.

In the case of a moving image, the position of the object can be expressed by a motion vector. Concrete examples of motion vector detection methods are described below. First, the motion detection unit 18 has a memory such as SRAM or SDRAM, and when the object is specified, it stores the image of the frame in which the specification was made in this memory as a reference image. It suffices for the reference image to hold a block of prescribed size containing the specified position. The motion detection unit 18 detects the motion vector by comparing the reference image with the image of the current frame. The motion vector may be calculated after the contour component of the object has been identified using the high-frequency components of the wavelet transform coefficients. Alternatively, the MSB (Most Significant Bit) bit plane, or several bit planes on the MSB side, of the quantized wavelet transform coefficients may be used.
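A minimal sketch of the first method, comparing a stored reference block with the current frame, is shown below. The text does not name a matching criterion; the sum of absolute differences (SAD), the exhaustive search window, and all function and parameter names are assumptions chosen for illustration.

```python
def sad(block, frame, top, left):
    """Sum of absolute differences between `block` and the frame area at (top, left)."""
    return sum(
        abs(block[r][c] - frame[top + r][left + c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )


def find_motion_vector(ref_block, ref_top, ref_left, frame, search=2):
    """Return (dx, dy) of the best match within +/- `search` pixels of the reference position."""
    block_h, block_w = len(ref_block), len(ref_block[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            top, left = ref_top + dy, ref_left + dx
            # Skip candidate positions that fall outside the frame.
            if top < 0 or left < 0:
                continue
            if top + block_h > len(frame) or left + block_w > len(frame[0]):
                continue
            cost = sad(ref_block, frame, top, left)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


# Reference block stored at (top=1, left=1); in the current frame it sits at (top=2, left=3).
current = [[0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (3, 4):
        current[r][c] = 9
print(find_motion_vector([[9, 9], [9, 9]], 1, 1, current))
```

The same search could be run on a sub-band or on MSB-side bit planes instead of pixel values, as the paragraph above allows.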

Second, the motion detection unit 18 compares the current frame with an earlier frame, for example the immediately preceding frame, to detect the motion vector of the object. Third, instead of comparing frame images, the wavelet transform coefficients obtained by the wavelet transform are compared to detect the motion vector. Any of the LL, HL, LH, and HH sub-bands may be used for the wavelet transform coefficients. The target of comparison with the current frame may be the reference image registered at the time of specification, or a reference image registered from an earlier frame, for example the immediately preceding frame.

Fourth, the motion detection unit 18 uses a plurality of sets of wavelet transform coefficients to detect the motion vector of the object. For example, motion vectors may be detected in the LL, LH, and HH sub-bands, and either the average of these three motion vectors is taken or the one closest to the motion vector of the previous frame is selected. This improves the accuracy of motion detection for the object.
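The two ways of combining per-sub-band vectors in the fourth method can be sketched as below. The function names and the squared Euclidean distance used to judge "closest" are assumptions for illustration.

```python
def average_vector(vectors):
    """Component-wise mean of the motion vectors found in the individual sub-bands."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def closest_to_previous(vectors, previous):
    """Pick the candidate nearest to the previous frame's motion vector."""
    return min(
        vectors,
        key=lambda v: (v[0] - previous[0]) ** 2 + (v[1] - previous[1]) ** 2,
    )


candidates = [(4, 2), (5, 2), (9, -1)]  # e.g. vectors from LL, LH and HH
print(average_vector(candidates))
print(closest_to_previous(candidates, previous=(5, 3)))
```

Selecting the candidate nearest to the previous vector rejects an outlier such as (9, -1), whereas averaging lets it pull the result.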

The user may specify in advance, in the motion detection unit 18, the range of the image within which the motion vector is detected. For example, when this image coding apparatus is used in a surveillance camera in a shop such as a convenience store, processing such as the following is possible: objects such as persons who come within a certain range of the cash register are attended to, while the motion of objects that have left this range is not.

The ROI setting unit 20 obtains position information such as the motion vector of the object from the motion detection unit 18 and moves the ROI region accordingly. Depending on the detection method of the motion detection unit 18, it calculates either the amount of movement from the position of the initially set ROI region or the amount of movement from the previous frame, and thereby determines the position of the ROI region in the current frame.
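Moving the ROI region by a motion vector can be sketched as follows for a rectangular region. The `Roi` type, the function name, and the clamping of the rectangle to the frame boundary are assumptions added here; the text only says that the ROI region is moved according to the detected motion.

```python
from dataclasses import dataclass


@dataclass
class Roi:
    left: int
    top: int
    width: int
    height: int


def move_roi(roi, vector, frame_w, frame_h):
    """Shift the ROI by (dx, dy), keeping it inside the frame (clamping is an assumption)."""
    dx, dy = vector
    left = min(max(roi.left + dx, 0), frame_w - roi.width)
    top = min(max(roi.top + dy, 0), frame_h - roi.height)
    return Roi(left, top, roi.width, roi.height)


print(move_roi(Roi(10, 20, 50, 40), (5, -3), 640, 480))
print(move_roi(Roi(600, 0, 50, 40), (30, -5), 640, 480))
```

If the vector is measured from the initially set position rather than from the previous frame, the same function would be applied to the initial `Roi` instead of the previous one.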

The user sets the position, size, image quality, and so on of the ROI region in the ROI setting unit 20 as initial values. When the user has specified an object, or when the motion detection unit 18 has recognized one automatically, the ROI setting unit 20 may automatically set a prescribed range containing that object as the ROI region.

The ROI region may be rectangular, circular, or of some other, more complex shape. Although the shape of the ROI region itself is fixed in principle, it may be made to differ between the central and peripheral parts of the image, and it may also be changed dynamically by user operation. A plurality of ROI regions may be set.

The ROI setting unit 20 outputs the ROI setting information to the quantization unit 12 and the coded data generation unit 16, and ROI coding is performed. One ROI coding technique is the max-shift method, which scales up the bit planes of the wavelet transform coefficients corresponding to the region of interest of the image (hereinafter "ROI transform coefficients") by the maximum number of bits of the bit planes of the wavelet transform coefficients corresponding to the non-interest region (hereinafter "non-ROI transform coefficients"). With this method, all bit planes of the ROI transform coefficients are coded before any bit plane of the non-ROI transform coefficients.

First, an example of ROI coding using the max-shift method is described. Fig. 2(a) shows quantized wavelet transform coefficients 50, comprising five bit planes from the most significant bit (MSB) to the least significant bit (LSB).

The ROI setting unit 20 sets the region of interest on the original image based on the position information of the ROI region, and generates an ROI mask for identifying the wavelet transform coefficients corresponding to this region of interest, that is, the ROI transform coefficients. In the wavelet transform coefficients 50 of Fig. 2(a), the ROI transform coefficients are indicated by hatching.

Using the ROI mask, the quantization unit 12 scales up the quantized ROI transform coefficients by S bits. That is, it shifts the values of the ROI transform coefficients left by S bits. Here the scale-up amount S is a natural number larger than the number of bits of the maximum quantized value of the wavelet transform coefficients corresponding to the non-interest region, that is, the non-ROI transform coefficients. Fig. 2(b) shows wavelet transform coefficients 52 in which the ROI transform coefficients have been scaled up by 5 bits. In the scaled-up wavelet transform coefficients 52, the bit positions newly created by the scaling are filled with zeros.
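The left shift of the ROI transform coefficients can be sketched as below, using S = 5 as in Fig. 2(b). The flat list layout, the function names, and the restriction to non-negative magnitudes are assumptions for illustration; the sample values are made up.

```python
def max_shift(coeffs, roi_mask, s):
    """Shift ROI coefficient magnitudes left by `s` bits; non-ROI values are left unchanged."""
    return [c << s if in_roi else c for c, in_roi in zip(coeffs, roi_mask)]


def planes_are_separated(coeffs, roi_mask, s):
    """True if every non-ROI magnitude fits below bit `s`, so no bit plane is shared."""
    non_roi_max = max((c for c, m in zip(coeffs, roi_mask) if not m), default=0)
    return non_roi_max < (1 << s)


coeffs = [19, 7, 31, 1]              # hypothetical quantized magnitudes, 5 bits each
mask = [True, False, False, True]    # True marks an ROI transform coefficient
print(max_shift(coeffs, mask, 5))
print(planes_are_separated(coeffs, mask, 5), planes_are_separated(coeffs, mask, 4))
```

The check shows why S must be tied to the largest non-ROI value: with too small a shift, ROI and non-ROI bits would land in the same bit planes.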

As indicated by the arrows in Fig. 2(c), the entropy coding unit 14 entropy-codes the quantized values of the scaled-up wavelet transform coefficients 52 while scanning them in order from the high-order bit plane.
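The scan order from the high-order bit plane can be illustrated as follows. Only the ordering is shown; the entropy coder itself is omitted, and the function name and sample values are assumptions.

```python
def bit_planes(coeffs):
    """Return the bit planes of non-negative coefficients, most significant plane first."""
    top = max(coeffs).bit_length()
    return [[(c >> p) & 1 for c in coeffs] for p in range(top - 1, -1, -1)]


# Hypothetical values after a 5-bit scale-up: positions 0 and 3 are ROI coefficients.
planes = bit_planes([608, 7, 31, 32])
for plane in planes:
    print(plane)
```

In this example the first five planes carry bits only at the ROI positions and the last five only at the non-ROI positions, so a stream cut off partway loses non-ROI detail first.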

Coded data generating unit 16 obtains position or ROI set information such as amplification quantity in proportion from ROI configuration part 20, and from the information that entropy coding portion 14 obtains first-born one-tenth usefulness such as quantization width, generates head.In addition, the data that entropy coding is crossed are carried out fluidisation, and coded image is outputed to recording medium or network.At this, in recording medium, can adopt SDRAM or flash hard disk drive etc.

As described above, if utilize the maximum displacement method to carry out the ROI coding, then in order to cut down size of code, even only encode halfway, also because bit plane that can priority encoding ROI zone, so can make the picture quality in ROI zone become the high picture quality of picture quality than non-ROI zone.

Next, an example of performing ROI coding by cutting down bit planes is described. The ROI configuration part 20 sets the region of interest on the original image based on the positional information of the ROI region, and generates an ROI mask for identifying the wavelet transform coefficients that correspond to this region of interest, that is, the ROI transform coefficients. When the region of interest is selected as a rectangle, the positional information of the ROI region is given by the pixel coordinates of the upper-left corner of the rectangular region and its numbers of pixels in the vertical and horizontal directions.

Fig. 3 (a) to (c) are diagrams explaining the ROI mask generated by the ROI configuration part 20. As shown in Fig. 3 (a), suppose that the ROI configuration part 20 has selected a region of interest 90 on the original image 80. In each sub-band, the ROI configuration part 20 identifies the wavelet transform coefficients required to restore the selected region of interest 90 on the original image 80.

Fig. 3 (b) shows the 1st-layer transformed image 82 obtained by applying the wavelet transform to the original image 80 only once. The 1st-layer transformed image 82 consists of the four 1st-level sub-bands LL1, HL1, LH1 and HH1. In each of the 1st-level sub-bands LL1, HL1, LH1 and HH1, the ROI configuration part 20 identifies the wavelet transform coefficients on the 1st-layer transformed image 82 that are required to restore the region of interest 90 on the original image 80, namely the ROI transform coefficients 91 to 94.

Fig. 3 (c) shows the 2nd-layer transformed image 84 obtained by applying a further wavelet transform to the sub-band LL1, the lowest-frequency component of the transformed image 82 of Fig. 3 (b). As shown in the figure, the 2nd-layer transformed image 84 contains, in addition to the three 1st-level sub-bands HL1, LH1 and HH1, the four 2nd-level sub-bands LL2, HL2, LH2 and HH2. In each of the 2nd-level sub-bands LL2, HL2, LH2 and HH2, the ROI configuration part 20 identifies the wavelet transform coefficients on the 2nd-layer transformed image 84 that are required to restore the ROI transform coefficients 91 of the 1st-layer transformed image 82, namely the ROI transform coefficients 95 to 98.

Likewise, by recursively identifying, in each layer, the ROI transform coefficients corresponding to the region of interest 90 as many times as the wavelet transform is applied, all the ROI transform coefficients required to restore the region of interest 90 can be identified in the transformed image of the final layer. The ROI configuration part 20 generates an ROI mask for identifying the positions of the finally identified ROI transform coefficients on the transformed image of the final layer. For example, when the wavelet transform is applied only twice, it generates an ROI mask that can identify the seven ROI transform coefficients 92 to 98 shown hatched in Fig. 3 (c).
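
The layer-by-layer identification described above can be sketched in Python as follows. This is a minimal illustration, not the procedure of the embodiment: it assumes a dyadic decomposition in which the rectangle simply halves at each level, and it ignores the widening that the taps of a real wavelet filter would require. The function names and the exclusive-end rectangle convention are hypothetical.

```python
def halve_rect(rect):
    """Map a rectangle (x0, y0, x1, y1), end-exclusive, into the
    coordinates of a sub-band one decomposition level further down."""
    x0, y0, x1, y1 = rect
    # Floor the start and ceil the end so no contributing coefficient is lost.
    return (x0 // 2, y0 // 2, (x1 + 1) // 2, (y1 + 1) // 2)


def roi_rects_per_level(x, y, width, height, levels):
    """Return the ROI rectangle in sub-band coordinates for each level,
    starting from the upper-left corner and size given on the original image."""
    rect = (x, y, x + width, y + height)
    rects = []
    for _ in range(levels):
        rect = halve_rect(rect)
        rects.append(rect)
    return rects


def masked_subbands(levels):
    """List the sub-bands present in the final transformed image; the
    ROI mask marks one rectangle in each of them."""
    names = []
    for level in range(1, levels + 1):
        names += [f"HL{level}", f"LH{level}", f"HH{level}"]
    names.append(f"LL{levels}")
    return names
```

With two transforms, `masked_subbands(2)` lists seven sub-bands, which corresponds to the seven hatched ROI transform coefficients 92 to 98.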

In accordance with the relative priority in the picture quality setting, the quantization unit 12 adjusts, after quantization, the number of low-order bits that are replaced with zero in the bit strings of the wavelet transform coefficients corresponding to the non-region of interest. Referring to the ROI mask generated by the ROI configuration part 20, it sets to zero only S bits, counted from the least significant bit, in the bit strings of the non-ROI transform coefficients that are not covered by the ROI mask. Here, the zero-substitution bit count S is an arbitrary natural number whose upper limit is the maximum number of bits of the quantized values in the non-region of interest. By varying this zero-substitution bit count S, the degradation of the reproduced picture quality of the non-region of interest relative to the region of interest can be adjusted continuously.

Fig. 4 (a) to (c) are diagrams explaining how the quantization unit 12 zero-substitutes the low-order bits of the wavelet transform coefficients 60 of the original image. Fig. 4 (a) shows the quantized wavelet transform coefficients 60, which comprise five bit planes; the ROI transform coefficients are shown hatched.

As shown in Fig. 4 (b), the quantization unit 12 sets to zero the S bits on the LSB side of the non-ROI transform coefficients that are not covered by the ROI mask. In this example S=2 and, as indicated by reference numeral 64, wavelet transform coefficients 62 are obtained in which the two bits on the LSB side of the non-ROI transform coefficients have been replaced with zero.
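
As a rough sketch of this zero substitution, the following Python function clears the S low-order bits of every coefficient outside the ROI mask. It assumes that the quantized coefficients are signed integers whose magnitude is what gets bit-plane coded; the function name and the list-of-lists layout are hypothetical.

```python
def zero_low_bits(coeffs, roi_mask, s):
    """Clear the s least significant magnitude bits of non-ROI coefficients.

    coeffs   -- 2D list of signed integer quantized coefficients
    roi_mask -- 2D list of booleans, True where the coefficient is an ROI coefficient
    s        -- zero-substitution bit count S
    """
    result = []
    for row, mask_row in zip(coeffs, roi_mask):
        new_row = []
        for value, in_roi in zip(row, mask_row):
            if in_roi:
                new_row.append(value)  # ROI coefficients are left untouched
            else:
                magnitude = (abs(value) >> s) << s  # drop the s low-order bits
                new_row.append(-magnitude if value < 0 else magnitude)
        result.append(new_row)
    return result
```

With S=2, a non-ROI coefficient of 7 (binary 00111) becomes 4 (binary 00100), while an ROI coefficient keeps all five bits.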

As shown by the arrows in Fig. 4 (c), the entropy coding portion 14 entropy-codes the wavelet transform coefficients 62, which comprise the ROI transform coefficients and the zero-substituted non-ROI transform coefficients, while scanning them in order from the high-order bit plane.

Fig. 5 (a) to (c) are diagrams explaining how the low-order bits of the wavelet transform coefficients are zero-substituted when no region of interest exists in the original image. Fig. 5 (a) shows wavelet transform coefficients 70 of five bit planes that consist only of non-ROI transform coefficients, because no region of interest is set in the original image. When the zero-substitution bit count S is 2, as shown in Fig. 5 (b), the quantization unit 12 generates wavelet transform coefficients 72 in which the two low-order bit planes on the LSB side, out of the five bit planes, have been replaced with zero.

As shown in Fig. 5 (c), the entropy coding portion 14 entropy-codes the three high-order bit planes of the zero-substituted wavelet transform coefficients 72 in order from the top. In this case, the two zero-substituted low-order bit planes are not coded. Instead of zero-substituting the two low-order bit planes, they may simply be discarded.
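
The case without a region of interest can be illustrated with a small bit-plane sketch. The code below only splits coefficient magnitudes into bit planes, MSB first, and keeps the planes that would be handed to the entropy coder; the entropy coding itself is not modelled, and the function names are hypothetical.

```python
def bit_planes(magnitudes, num_planes):
    """Split coefficient magnitudes into bit planes, most significant first."""
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def planes_to_code(magnitudes, num_planes, s):
    """Zero the s low-order bits of every coefficient (no ROI is set) and
    return only the upper planes, since the lower s planes are all zero."""
    zeroed = [(m >> s) << s for m in magnitudes]
    return bit_planes(zeroed, num_planes)[: num_planes - s]
```

For five bit planes and S=2, three planes remain, which matches the three high-order bit planes coded in Fig. 5 (c).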

The coded data generating unit 16 generates a header based on coding parameters such as the quantization width. It also turns the entropy-coded data into a stream and outputs it to a recording medium or a network as the coded image.

In general, when an upper limit is placed on the data size of the final coded image because of restrictions such as memory capacity or transfer rate, the entropy coding portion 14, which codes the quantized wavelet transform coefficients in order from the high-order bit plane, stops coding at an intermediate bit plane in order to observe the upper limit on data size. Alternatively, the coded data generating unit 16, which outputs the coded stream in order from the high-order bit plane, stops the stream output at an intermediate bit plane in order to observe the restriction on transfer rate.
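
The truncation at an intermediate bit plane can be pictured with the following sketch. It assumes that the coded size of each bit plane is already known, which is a simplification; the sizes used in any call are made-up inputs, and the function name is hypothetical.

```python
def planes_within_budget(plane_sizes, budget):
    """Count how many bit planes, taken MSB first, fit within the size budget.

    plane_sizes -- coded size in bytes of each bit plane, MSB first
    budget      -- upper limit on the total coded size in bytes
    Returns (number of planes kept, total bytes used).
    """
    total = 0
    kept = 0
    for size in plane_sizes:
        if total + size > budget:
            break  # coding stops at this intermediate bit plane
        total += size
        kept += 1
    return kept, total
```

When the low-order planes shrink because the non-ROI bits are zero, more planes fit under the same budget, so the ROI coefficients are less likely to be cut off.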

Thus, even when the data size of the coded image is restricted, the wavelet transform coefficients corresponding to the non-region of interest are zero-substituted in the low-order bit planes, and only the wavelet transform coefficients corresponding to the region of interest are coded as significant information. The compression efficiency of the low-order bit planes therefore improves, and the data size does not increase greatly even when coding proceeds down to the least significant bit plane.

As described above, the coding method that cuts down bit planes does not perform scale-up processing of the ROI transform coefficients, so the coding computation is efficient. In addition, because the number of bit planes to be coded does not increase, no extra storage area is needed and hardware cost can be reduced.

Furthermore, because no scale-down processing is needed at decoding, there is no need to add ROI positional information to the header of the coded data or to add the scaling amount to the coded data. Moreover, an image ROI-coded with this method is indistinguishable in format from an ordinary coded image, so it can be decoded by the same processing as an ordinary coded image, and compatibility of decoding is maintained.

(embodiment 2 of execution mode 1)

Fig. 6 is a configuration diagram of the picture coding device 200 according to embodiment 2 of execution mode 1. This picture coding device 200 is obtained by adding a picture quality configuration part 22 to the picture coding device 100 according to embodiment 1 of execution mode 1. Components identical to those of embodiment 1 of execution mode 1 are given the same reference numerals, and the components and operations that differ from embodiment 1 of execution mode 1 are described.

The user can set the initial values of the picture quality of the ROI region and the non-ROI region in the ROI configuration part 20. The picture quality configuration part 22 may also calculate or estimate the code amount allocated to each frame and determine the picture quality of the non-ROI region automatically. That is, if the code amount of the ROI region increases, the code amount of the non-ROI region is reduced; if the code amount of the ROI region decreases, the code amount of the non-ROI region is increased. Conversely, there are cases where the ROI region is to be given low picture quality, for example by applying a mosaic to it. In such cases, the ROI coding described above can be realized by reading the explanation with the ROI region and the non-ROI region interchanged.

In addition, during moving-image shooting, the picture quality configuration part 22 can calculate or estimate from the code amount shot so far, or from the remaining capacity of the recording medium on which the coded stream is recorded, and adaptively adjust the code amount of the non-ROI region. For example, if the remaining capacity of the recording medium decreases, the code amount of the non-ROI region is reduced.
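
One way to picture this adaptive adjustment is the sketch below, which raises the zero-substitution bit count S of the non-ROI region as the remaining capacity per frame falls short of a target. The linear policy, the parameter names and the function name are all assumptions made for illustration; the text above only states that the non-ROI code amount is reduced when the remaining capacity decreases.

```python
def choose_zero_bits(remaining_bytes, frames_left, target_bytes_per_frame, max_bits):
    """Pick the zero-substitution bit count S for the non-ROI region.

    Hypothetical policy: S stays 0 while the remaining capacity covers the
    target size of every remaining frame, and grows in proportion to the
    shortfall otherwise, up to max_bits.
    """
    if frames_left <= 0:
        return 0
    available = remaining_bytes // frames_left  # bytes available per frame
    if available >= target_bytes_per_frame:
        return 0
    shortfall = target_bytes_per_frame - available
    bits = (max_bits * shortfall) // target_bytes_per_frame
    return min(max_bits, max(1, bits))
```

A larger S removes more low-order bits from the non-ROI coefficients, so the non-ROI code amount falls while the ROI coefficients are unaffected.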

According to the embodiment described above, by adjusting the code amount of the non-ROI region with the picture quality configuration part 22, the picture quality of the target the user is paying attention to can be maintained at the level the user desires while the code amount of the moving image as a whole is kept within a prescribed capacity.

(embodiment 3 of execution mode 1)

Fig. 7 is a configuration diagram of the picture coding device 300 according to embodiment 3 of execution mode 1. This picture coding device 300 is obtained by adding a frame buffer 24 and a target extraction unit 26 to the picture coding device 100 according to embodiment 1 of execution mode 1. Components identical to those of embodiment 1 of execution mode 1 are given the same reference numerals, and the components and operations that differ from embodiment 1 of execution mode 1 are described. Where an identical component operates differently, that operation is described as well.

The frame buffer 24 is a large-capacity memory such as SDRAM and stores at least the current frame and an earlier frame, for example the immediately preceding frame. The target extraction unit 26 separates targets and background within the image. The target extraction unit 26 compares the image of the preceding frame with the image of the current frame. In doing so, as in MPEG (Moving Picture Experts Group), the image is divided into a plurality of blocks, a motion vector is obtained for each block, and the mean or the mode of these vectors is taken as the motion vector of the background. This exploits the fact that, when the viewpoint moves, a moving body that is a target moves irregularly whereas the background moves in one direction.
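
The choice of the background vector from the per-block vectors can be sketched as follows. The per-block motion search is taken as given here; the function name and the tuple representation of a vector are hypothetical.

```python
from collections import Counter


def background_vector(block_vectors, use_mode=True):
    """Estimate the background motion vector from per-block motion vectors.

    block_vectors -- list of (dx, dy) tuples, one per block
    use_mode      -- True to take the most frequent vector, False for the mean
    """
    if use_mode:
        # The most frequent vector is robust to a few irregularly moving targets.
        return Counter(block_vectors).most_common(1)[0][0]
    n = len(block_vectors)
    return (sum(v[0] for v in block_vectors) / n,
            sum(v[1] for v in block_vectors) / n)
```

The mode ignores the few blocks covered by moving targets, whereas the mean is pulled toward them; which of the two suits a given scene is a design choice left open by the text.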

The target extraction unit 26 shifts the image by the amount corresponding to the obtained background motion vector and generates a difference image between the preceding frame image and the current frame image. Because the backgrounds coincide or nearly coincide in this difference image, the target extraction unit 26 can remove the background accurately. The target extraction unit 26 has a reference memory. The specified target is detected from the image from which the background has been removed and is stored in the reference memory as a reference image. The reference image may be fixed to the image at the time of initial specification, or it may be updated each time. When a plurality of targets are specified, the corresponding plurality of reference images are stored in the reference memory. A plurality of reference memories may also be provided.

The activity detection portion 18 compares the reference image stored in the reference memory with the difference image of the next frame and detects the absolute motion vector of the target. This difference image may be an image from which the background has been removed or an image in which the background remains. This motion vector and the motion vector of the background are output to the quantization unit 12. After quantization, the quantization unit 12 calculates the motion of the ROI region from this motion vector and the motion vector of the background, and moves the ROI region accordingly.

In the above description, the target extraction unit 26 detects the background motion vector by comparing the image of the preceding frame with the image of the current frame, but it may instead compare the wavelet transform coefficients of the preceding frame with those of the current frame. In that case, using the LL sub-band reduces the image size. Using the HL, LH or HH sub-band additionally reduces the amount of computation needed to extract only the contours.

Fig. 8 is a diagram showing how targets and background are separated within an image. Fig. 8 (a) shows the preceding frame. Two targets, person A and person B, are present in the image. Fig. 8 (b) shows the current frame. Because the flower has moved to the right, it can be seen that the background has moved to the right, that is, the viewpoint has moved to the left. Person A has moved slightly to the upper left, and person B has moved a long way to the left. The motions of person A and person B between these two frames are relative motions. Fig. 8 (c) shows the difference image. This difference image is synthesized after shifting the preceding frame to the right. The absolute motions of person A and person B can be detected from it. The background can also be removed.
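
The relation between relative motion, background motion and absolute motion in Fig. 8 can be written out as simple vector arithmetic. The sketch assumes image coordinates with x to the right and y downward, and assumes that the ROI displacement on the image is the sum of the absolute motion and the background motion; the function names are hypothetical.

```python
def absolute_motion(relative, background):
    """Cancel the background motion from the motion observed between frames."""
    return (relative[0] - background[0], relative[1] - background[1])


def roi_shift(absolute, background):
    """Displacement of the ROI on the image: the target's own motion plus
    the apparent motion caused by the moving viewpoint (an assumption)."""
    return (absolute[0] + background[0], absolute[1] + background[1])
```

For instance, with a background that moves 5 pixels to the right and a person who appears to move 1 pixel left and 1 pixel up, the absolute motion is 6 pixels left and 1 pixel up. These pixel values are illustrative only and are not taken from the figure.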

According to the embodiment described above, even when the viewpoint changes during moving-image shooting with a digital video camera or the like, the absolute motion of the target is detected by cancelling out the motion of the background, so misrecognition of the target caused by changes in the background is reduced and the ROI region can be tracked accurately.

(embodiment 4 of execution mode 1)

Fig. 9 is a configuration diagram of the camera head 400 according to embodiment 4 of execution mode 1. Examples of the camera head 400 include digital cameras, digital video cameras and surveillance cameras.

The image pickup part 410 includes, for example, a CCD (Charge Coupled Device); it captures light from the subject, converts it into an electrical signal and outputs it to the encoding block 420. The encoding block 420 encodes the original image input from the image pickup part 410 and sends the encoded image to the output unit 440.

The encoding block 420 has the configuration of any of the picture coding devices of embodiments 1 to 3 of execution mode 1 and generates a coded image whose picture quality differs between the region of interest and the non-region of interest. The operating portion 430 has an LCD or OLED display, on which the image captured by the image pickup part 410 is shown. In this image the user can specify a region of interest or a target of interest. For example, a cursor or frame in the image can be moved with a cross key or the like, or a touch-panel display can be used and the specification made with a stylus or the like. The operating portion 430 may also be equipped with a shutter release button and various operation buttons.

The output unit 440 is a removable recording medium or a network such as a LAN. The image coded by the encoding block 420 is either recorded on this recording medium or sent out to the network.

Figure 10 is a diagram showing a 1st example of tracking processing of the region of interest in an image captured by the camera head 400 according to embodiment 4 of execution mode 1. Figure 10 (a) shows the state in which the target the user is paying attention to is specified in the image. Person A, the target of interest, is specified with a cross-shaped cursor. Figure 10 (b) shows the state in which the ROI region is set in the image. The area enclosed by the frame is the ROI region. The ROI region may be initially set by user operation, or a prescribed area including the specified target may be initially set automatically. Figure 10 (c) shows the state in which person A has moved and left the ROI region. Figure 10 (d) shows the state in which the ROI region has followed the motion of person A. The motion vector of person A is detected, and the ROI region is moved accordingly.

Figure 11 is a diagram showing a 2nd example of tracking processing of the region of interest in an image captured by the camera head 400 according to embodiment 4 of execution mode 1. Figure 11 (a), in an order different from the 1st example, shows the state in which the user sets the ROI region in the image. Of person A and person B, person A is set as the target the user is paying attention to. A plurality of ROI regions may also be set. Figure 11 (b) shows the state in which the target of interest is specified within the ROI region. The user may specify it, or it may be recognized automatically. Figure 11 (c) shows the state in which person A has moved and the ROI region has followed this motion. Because person B has not been specified as a target of interest, the movement of person B does not affect the movement of the ROI region.

Figure 12 is a diagram showing a 3rd example of tracking processing of the region of interest in an image captured by the camera head 400 according to embodiment 4 of execution mode 1. Figure 12 (a) shows the state in which the range within which the ROI region tracks is set. The large frame in the figure indicates this range. Figure 12 (b) shows the state in which the ROI region is set. This ROI region moves only within the large frame that has been set. Figure 12 (c) shows the state in which person A has moved and gone outside the large frame. Because the ROI region tracks person A only within the range of the large frame, tracking ends partway. Shooting itself may also be ended when the target of interest leaves the large frame. For example, with a surveillance camera, when persons who intrude into a certain area need to be recorded in particular, it suffices to maintain the picture quality of targets such as persons within that range. The 3rd example suits such a case and can reduce the code amount further than the 1st and 2nd examples.
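
The bounded tracking of the 3rd example can be sketched as below. The rule that tracking ends as soon as the moved ROI is no longer fully inside the large frame is an assumption made for illustration, as are the function name and the rectangle convention (x0, y0, x1, y1).

```python
def track_within_range(roi, shift, bounds):
    """Move the ROI by the detected motion, but only inside the tracking range.

    roi, bounds -- rectangles (x0, y0, x1, y1)
    shift       -- (dx, dy) motion of the target on the image
    Returns the moved ROI, or None when tracking ends because the ROI
    would leave the tracking range.
    """
    x0, y0, x1, y1 = roi
    dx, dy = shift
    moved = (x0 + dx, y0 + dy, x1 + dx, y1 + dy)
    bx0, by0, bx1, by1 = bounds
    inside = (bx0 <= moved[0] and by0 <= moved[1]
              and moved[2] <= bx1 and moved[3] <= by1)
    return moved if inside else None
```

A caller could treat the `None` result either as the end of tracking or as the trigger to end shooting itself, both of which are mentioned above.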

The camera head 400 can, of course, shoot a moving image and record it in the output unit 440 while making the ROI region track the specified target. During this time, the user can operate the operating portion 430 to cancel or set the ROI region. When the ROI region is cancelled, all areas in the image are coded at the same low bit rate. The user's operation can also pause and resume moving-image shooting. Furthermore, while the ROI region is tracking the specified target, the user can take a still image by pressing the shutter release button of the operating portion 430 or the like. This still image has high picture quality in the ROI region and low picture quality in the non-ROI region.

According to the embodiment described above, by reducing the code amount of the non-ROI region, a camera head can be provided that maintains the picture quality of the target the user is paying attention to at the level the user desires while reducing the code amount of the moving image as a whole.

Execution mode 1 has been described above on the basis of embodiments. These embodiments are examples; those skilled in the art will understand that various modifications are possible in the combination of the constituent elements and processing steps, and that such modifications also fall within the scope of the invention. Such modifications are shown below.

In the embodiments above, coding is performed continuously in the JPEG2000 format to generate a coded stream of a moving image, but the format is not limited to JPEG2000; any format that can generate a coded stream of a moving image may be used.

The method of ROI coding by cutting down bit planes only zero-substitutes the low-order bits of the non-ROI transform coefficients and does not scale up the ROI transform coefficients at all, but scaling up of the ROI transform coefficients may also be combined with zero substitution of the low-order bits of the non-ROI transform coefficients.

In the embodiments above, when the user sets a plurality of ROI regions in the ROI configuration part 20, a different picture quality can be set for each ROI region. Various levels of picture quality can be realized by adjusting the number of zero-substituted low-order bits of the non-ROI transform coefficients.

In every embodiment, the wavelet transform has been described as the spatial filtering used for image coding, but other spatial frequency transforms may be used. For example, even with the discrete cosine transform adopted in the JPEG standard, zero-substituting the low-order bits of the transform coefficients of the non-region of interest in the same way sacrifices the picture quality of the non-region of interest, improves the compression efficiency of the image as a whole, and at the same time relatively improves the picture quality of the region of interest.

The embodiments of execution mode 2 are described below.

(embodiment 1 of execution mode 2)

Figure 13 is a configuration diagram of the image decoder 1100 according to embodiment 1 of execution mode 2. In terms of hardware, the configuration of the image decoder 1100 can be realized by the CPU, memory and other LSIs of any computer; in terms of software, it can be realized by a program with a decoding function loaded into memory. What is depicted here, however, are the functional blocks realized by their cooperation. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms: by hardware only, by software only, or by a combination of the two.

In embodiment 1 of execution mode 2, the image decoder 1100 decodes, as an example, a coded image that has been compression-coded in the JPEG2000 format. The coded image input to the image decoder 1100 is an ordinary coded image that has not undergone ROI coding, that is, coding in which a region of interest (Region of Interest; ROI) of the image is coded with priority over other regions. When decoding, the image decoder 1100 specifies a region of interest (hereinafter, the ROI region) and decodes the ROI region with priority.

The coded image input to the image decoder 1100 may also be a coded frame of a moving image. A moving image can be reproduced by continuously decoding each coded frame of the moving image input as a coded stream.

The coded data extraction unit 1010 extracts coded data from the input coded image. The entropy decoding part 1012 decodes the coded data bit plane by bit plane and stores the quantized wavelet transform coefficients obtained as the decoding result in a memory, not shown.

The activity detection portion 1018 detects the position of the specified target and outputs it to the ROI configuration part 1020. The target may be specified by the user, or it may be recognized automatically by the activity detection portion 1018 from within the ROI region specified by the user. It may also be recognized automatically from the entire image. A plurality of targets may be specified.

In the case of a moving image, the position of the target is expressed by a motion vector. Concrete examples of motion vector detection methods are described below. First, the activity detection portion 1018 has a memory such as SRAM or SDRAM and, when a target is specified, stores the image of the specified target in that frame in this memory as a reference image. A block of prescribed size including the specified position may be stored as the reference image. The activity detection portion 1018 detects the motion vector by comparing this reference image with the image of the current frame. The motion vector may be calculated using the high-frequency components of the wavelet transform coefficients to identify the contour components of the target. The MSB (Most Significant Bit) bit plane of the quantized wavelet transform coefficients, or a plurality of bit planes starting from the MSB side, may also be used.
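
The first detection method, comparing a stored reference image with the current frame, can be sketched as an exhaustive block search. The sum of absolute differences used as the matching cost is an assumption, since the text does not name a cost measure, and the function names are hypothetical.

```python
def sad(frame, ref, ox, oy):
    """Sum of absolute differences between ref and the frame area at (ox, oy)."""
    return sum(abs(frame[oy + j][ox + i] - ref[j][i])
               for j in range(len(ref))
               for i in range(len(ref[0])))


def find_block(frame, ref):
    """Return the (x, y) position in the frame that best matches the reference image."""
    best = None
    for oy in range(len(frame) - len(ref) + 1):
        for ox in range(len(frame[0]) - len(ref[0]) + 1):
            cost = sad(frame, ref, ox, oy)
            if best is None or cost < best[0]:
                best = (cost, ox, oy)
    return best[1], best[2]


def motion_vector(frame, ref, prev_pos):
    """Motion vector of the target: best-match position minus previous position."""
    x, y = find_block(frame, ref)
    return (x - prev_pos[0], y - prev_pos[1])
```

The same search could be run on a high-frequency sub-band or on the upper bit planes only, as mentioned above, which shrinks the data that has to be compared.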

Second, the activity detection portion 1018 compares the current frame with an earlier frame, for example the immediately preceding frame, to detect the motion vector of the target. Third, instead of comparing frame images, it compares the wavelet transform coefficients obtained after the wavelet transform to detect the motion vector. Any of the LL, HL, LH and HH sub-bands may be used for the wavelet transform coefficients. The object compared with the current frame may be the reference image registered at the time of specification, or a reference image registered from an earlier frame, for example the immediately preceding frame.

Fourth, the activity detection portion 1018 uses a plurality of sets of wavelet transform coefficients to detect the motion vector of the target. For example, a motion vector may be detected in each of the HL, LH and HH sub-bands, and either the mean of these three motion vectors is taken or the one closest to the motion vector of the preceding frame is selected from among them. This improves the accuracy of motion detection for the target.
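
The fourth method, combining the vectors found in several sub-bands, can be sketched as follows. Squared Euclidean distance is assumed as the measure of closeness to the preceding frame's vector, and the function name is hypothetical.

```python
def combine_vectors(vectors, previous=None):
    """Combine the motion vectors detected in the HL, LH and HH sub-bands.

    vectors  -- list of (dx, dy) tuples, one per sub-band
    previous -- motion vector of the preceding frame, or None
    Without a previous vector the mean is returned; otherwise the
    candidate closest to the previous vector is selected.
    """
    if previous is None:
        n = len(vectors)
        return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)
    return min(vectors,
               key=lambda v: (v[0] - previous[0]) ** 2 + (v[1] - previous[1]) ** 2)
```

Selecting the candidate nearest the previous vector favours smooth motion, while the mean smooths out a single outlying sub-band.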

In addition, the user can specify in advance, in the activity detection portion 1018, the range within the image in which this motion vector is detected. For example, when decoding images shot by a surveillance camera in a shop such as a convenience store, the following processing is possible: targets such as persons who enter a certain range (radius) are attended to, while the motion of targets that go out of that range is not.

The ROI configuration part 1020 obtains positional information such as the motion vector of the target from the activity detection portion 1018 and moves the ROI region accordingly. Depending on the detection method of the activity detection portion 1018, it calculates the amount of movement from the initially set position of the ROI region or the amount of movement from the preceding frame, and determines the position of the ROI region in the current frame.

The user sets the position, size, picture quality and so on of the ROI region in the ROI configuration part 1020 as initial values. When the ROI region is selected as a rectangle, its positional information may be given by the pixel coordinates of the upper-left corner of the rectangular region and its numbers of pixels in the vertical and horizontal directions. When the user has specified a target, or when the activity detection portion 1018 has recognized one automatically, the ROI configuration part 1020 may automatically set a prescribed range including this target as the ROI region.

Based on the ROI setting information, the ROI configuration part 1020 generates an ROI mask for identifying the wavelet transform coefficients corresponding to the ROI region, that is, the ROI transform coefficients. The inverse quantization portion 1014 adjusts the number of low-order bits replaced with zero in the bit strings of the wavelet transform coefficients corresponding to the non-ROI region, according to the priority of the non-region of interest (hereinafter, the non-ROI region) relative to the ROI region. Then, referring to the ROI mask, it replaces with zero a prescribed number of bits, starting from the LSB (Least Significant Bit) side, of the non-ROI transform coefficients among the wavelet transform coefficients decoded by the entropy decoding part 1012.

Here, the number of bits replaced with zero is an arbitrary natural number whose upper limit is the maximum number of bits of the quantized values in the non-ROI region. By varying this number of bits, the degradation of the reproduced picture quality of the non-ROI region relative to the ROI region can be adjusted continuously. The inverse quantization portion 1014 then inversely quantizes the wavelet transform coefficients, which comprise the ROI transform coefficients and the non-ROI transform coefficients whose low-order bits have been zero-substituted. The wavelet inverse transformation portion 1016 inversely transforms the inversely quantized wavelet transform coefficients and outputs the resulting decoded image.
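
On the decoder side, zero substitution followed by inverse quantization can be sketched as below. The sketch assumes a uniform quantizer that is inverted by multiplying by the step size, without any reconstruction offset, and treats the coefficients as a flat list; the function name is hypothetical.

```python
def dequantize_with_roi(quantized, roi_mask, s, step):
    """Zero the s low-order bits of non-ROI coefficients, then dequantize.

    quantized -- list of signed integer coefficients from the entropy decoder
    roi_mask  -- list of booleans, True for ROI transform coefficients
    s         -- number of low-order bits replaced with zero
    step      -- quantization step size (assumed uniform)
    """
    output = []
    for value, in_roi in zip(quantized, roi_mask):
        if not in_roi:
            magnitude = (abs(value) >> s) << s  # zero substitution on the magnitude
            value = -magnitude if value < 0 else magnitude
        output.append(value * step)  # simple inverse quantization
    return output
```

Because the substitution happens after entropy decoding, the same ordinary coded image can be decoded with different values of `s`, or with a different mask, without re-encoding.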

Figure 14 (a) to (c) are diagrams explaining the ROI mask generated by the ROI configuration part 1020. As shown in Figure 14 (a), suppose that the ROI configuration part 1020 has selected an ROI region 1090 on the original image 1080. In each sub-band, the ROI configuration part 1020 identifies the wavelet transform coefficients required to restore the ROI region 1090 selected on the original image 1080.

Figure 14 (b) shows the 1st-layer transformed image 1082 obtained by applying the wavelet transform to the image 1080 only once. The 1st-layer transformed image 1082 consists of the four 1st-level sub-bands LL1001, HL1001, LH1001 and HH1001. In each of the 1st-level sub-bands LL1001, HL1001, LH1001 and HH1001, the ROI configuration part 1020 identifies the wavelet transform coefficients on the 1st-layer transformed image 1082 that are required to restore the ROI region 1090 of the original image 1080, namely the ROI transform coefficients 1091 to 1094.

Figure 14 (c) shows the 2nd-layer transformed image 1084 obtained by applying a further wavelet transform to the sub-band LL1001, the lowest-frequency component of the transformed image 1082 of Figure 14 (b). As shown in the figure, the 2nd-layer transformed image 1084 contains, in addition to the three 1st-level sub-bands HL1001, LH1001 and HH1001, the four 2nd-level sub-bands LL1002, HL1002, LH1002 and HH1002. In each of the 2nd-level sub-bands LL1002, HL1002, LH1002 and HH1002, the ROI configuration part 1020 identifies the wavelet transform coefficients on the 2nd-layer transformed image 1084 that are required to restore the ROI transform coefficients 1091 in the sub-band LL1001 of the 1st-layer transformed image 1082, namely the ROI transform coefficients 1095 to 1098.

Likewise, by recursively identifying, in each layer, the ROI transform coefficients corresponding to the ROI region 1090 as many times as the wavelet transform is applied, all the ROI transform coefficients required to restore the ROI region 1090 can be identified in the transformed image of the final layer. The ROI configuration part 1020 generates an ROI mask for identifying the positions of the finally identified ROI transform coefficients on the transformed image of the final layer. For example, when the wavelet transform is applied only twice, it generates an ROI mask that can identify the positions of the seven ROI transform coefficients 1092 to 1098 shown hatched in Figure 14 (c).

Figure 15 (a) to (c) show how the low-order bits of the wavelet transform coefficients obtained by decoding the coded image are zero-substituted. Figure 15 (a) shows the wavelet transform coefficients 1074 of the image after entropy decoding, which comprise five bit planes. In Figure 15 (b), the ROI transform coefficients corresponding to the ROI region specified by the ROI configuration part 1020 are shown hatched. As shown in Figure 15 (c), the inverse quantization portion 1014 generates wavelet transform coefficients 1076 in which the two low-order bits of the non-ROI transform coefficients have been replaced with zero.

And ROI configuration part 1020 replaces selects the ROI zone, also can select non-ROI zone.For example, in desire iridescence is put under the situation in zone of the personal information such as number plate of taking face that the personage is arranged or car, this zone is chosen as non-ROI zone.Under this situation, make the mask counter-rotating of specific non-ROI conversion coefficient, can generate the mask of specific ROI conversion coefficient.The mask of specific non-ROI conversion coefficient perhaps, can be provided to re-quantization portion 1014.

Under the situation of the coded frame of importing dynamic image to image decoder 1100 continuously, image decoder 1100 also can be movable as described below.Image decoder 1100 is handled load in order to reduce usually the time, suitably give up the simple and easy regeneration of regenerating behind the low level bit plane of wavelet transform coefficients.Thus, even exist under the situation of restriction at the handling property of image decoder 1100, owing to give up the low level bit plane, therefore for example also can be to carry out simple and easy regeneration 30 frame/seconds.

In simple and easy regeneration, under the situation of having selected the ROI zone on the image, image decoder 1100 at the low level position in non-ROI zone by the wavelet transform coefficients of the state of zero displacement, code interpretive reproduced picture behind the bit plane of lowest order.At this moment, raise owing to handle load, so though become sometimes with the state of 15 frame/time lapses such as second or the state of regenerating at a slow speed, can be with the high image quality ROI zone of regenerating.

Thus, when an ROI region is selected, the non-ROI region is still played back at a quality comparable to that of simplified playback, and only the ROI region is played back at higher quality. This is useful, for example, for surveillance video, where high quality is unnecessary in normal times and only the location of interest needs to be played back at high quality when an abnormality occurs. In addition, when a moving image is played back on a portable terminal, the following usage is possible from the standpoint of battery life: the moving image is played back at low quality in a power-saving mode, and only the ROI region is played back at high image quality when needed.

According to the image decoding apparatus 1100 of this embodiment, even for an ordinary coded image that has not been ROI-coded, replacing the low-order bits of the wavelet transform coefficients corresponding to the non-ROI region with zeros makes it possible to decode the image with the image quality of the ROI region relatively higher than that of the non-ROI region, so that the object of interest to the user can easily be made to stand out. In addition, because only the ROI region is decoded preferentially, the amount of processing is smaller than in ordinary decoding. Processing can therefore be sped up, and power consumption can also be reduced.

(embodiment 2 of execution mode 2)

Figure 16 is a block diagram of the image decoding apparatus 1200 according to embodiment 2 of execution mode 2. The image decoding apparatus 1200 is configured by adding an image quality setting unit 1022 to the image decoding apparatus 1100 according to embodiment 1 of execution mode 2. Components identical to those of embodiment 1 of execution mode 2 are given the same reference numerals, and only the configuration and operation that differ from embodiment 1 of execution mode 2 are described.

Through the image quality setting unit 1022, the user can set initial values of the image quality of the ROI region and the non-ROI region in the ROI setting unit 1020. Even during playback of a moving image, the image quality of at least one of the ROI region and the non-ROI region can be changed to a desired level. In accordance with this change, the inverse quantization unit 1014 adjusts the number of low-order bits to be replaced with zeros in the bit string of the wavelet transform coefficients corresponding to at least one of the ROI region and the non-ROI region. An image quality difference of the level desired by the user can thereby be produced between the ROI region and the non-ROI region.

The image quality setting unit 1022 may also lower the image quality of at least one of the ROI region and the non-ROI region according to the playback speed. For example, when the user selects double-speed playback, full decoding of the coded images may not keep up. In that case, lowering the image quality of the non-ROI region reduces the amount of processing. The moving image can thereby be played back without dropping frames, regardless of the playback speed.

Furthermore, when the image decoding apparatus 1200 is mounted in a portable device such as a mobile phone, a PDA (Personal Digital Assistant), a portable DVD (Digital Video Disk) player, or a detachable car navigation device, the image quality setting unit 1022 may lower the image quality of at least one of the ROI region and the non-ROI region according to the remaining battery level of the device. That is, when the remaining battery level becomes low, lowering the image quality of the non-ROI region reduces power consumption. The time for which images can still be played back after the battery level has become low can thereby be extended.
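One way to picture the adjustments made by unit 1022 is as a policy that raises the number of zero-substituted low-order bits of the non-ROI region when the playback speed is high or the battery is low. The sketch below is purely illustrative: the function name, the thresholds (double speed, 20 % charge), and the one-bit steps are assumptions and do not come from the specification.

```python
def non_roi_zero_bits(base_bits, playback_speed, battery_fraction, max_bits=5):
    """Number of low-order bits to zero in non-ROI coefficients.

    base_bits        -- value chosen by the user
    playback_speed   -- 1.0 for normal speed, 2.0 for double speed, ...
    battery_fraction -- remaining charge in the range 0.0 to 1.0
    max_bits         -- number of bit planes (five in the figures)
    """
    bits = base_bits
    if playback_speed >= 2.0:    # hypothetical threshold: double speed or faster
        bits += 1
    if battery_fraction < 0.2:   # hypothetical threshold: under 20 % charge
        bits += 1
    return min(bits, max_bits)   # never zero more bits than exist
```

A larger return value means lower non-ROI image quality and less decoding work; the ROI region is not affected by this policy.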

(embodiment 3 of execution mode 2)

Figure 17 is a block diagram of the image pickup apparatus 1300 according to embodiment 3 of execution mode 2. Examples of the image pickup apparatus 1300 include digital still cameras, digital video cameras, and surveillance cameras.

The image pickup unit 1310 includes, for example, a CCD (Charge Coupled Device); it captures light from a subject, converts it into an electrical signal, and outputs the signal to the coding block 1320. The coding block 1320 codes the original image input from the image pickup unit 1310 and stores the coded image in the storage unit 1330. The original image input to the coding block 1320 may be a frame of a moving image, in which case the frames may be coded successively and stored in the storage unit 1330.

The decoding block 1340 reads a coded image from the storage unit 1330, decodes it, and supplies the result to the display unit 1350. The coded image read from the storage unit 1330 may be a coded frame of a moving image. The decoding block 1340 has the configuration of the image decoding apparatus 1100 or 1200 of embodiment 1 or 2 of execution mode 2, and decodes the coded image stored in the storage unit 1330. It receives from the operation unit 1360 information on the ROI region set on the screen, decodes the ROI region preferentially, and generates a decoded image in which the image quality differs between the ROI region and the non-ROI region.

The display unit 1350 includes an LCD, an OLED display, or the like, and displays the image decoded by the decoding block 1340. Through the operation unit 1360, the user can specify an ROI region or an object of interest in the image on the display unit 1350. For example, the user moves a cursor or a frame in the image with a cross key or the like, or specifies the region with a stylus or the like on a touch-panel display. The operation unit 1360 may additionally be provided with a shutter button and various other operation buttons.

According to this embodiment, an image pickup apparatus that can easily make the object of interest to the user stand out can be provided. In addition, because only the ROI region is decoded preferentially, the amount of processing is smaller than in ordinary decoding. Processing can therefore be sped up, and power consumption can also be reduced. In digital still cameras, digital video cameras, and the like in particular, reducing power consumption extends the available shooting time.

Next, a modification of embodiment 3 of execution mode 2 is described. In this modification, the coding block 1320 codes the ROI region preferentially to generate coded images in which the image quality differs between the ROI region and the non-ROI region, and the decoding block 1340 decodes these coded images.

Figure 18 is a block diagram of the coding block 1320 according to the modification of embodiment 3 of execution mode 2. As an example, the coding block 1320 compresses and codes the input original image using the JPEG2000 scheme.

The wavelet transform unit 1030 divides the original image input from the image pickup unit 1310 into sub-bands, calculates the wavelet transform coefficients of each sub-band image, and generates hierarchized wavelet transform coefficients. Specifically, the wavelet transform unit 1030 applies a low-pass filter and a high-pass filter in each of the x and y directions of the original image and divides it into four frequency sub-bands to carry out the wavelet transform. These sub-bands are: an LL sub-band, which has low-frequency components in both the x and y directions; HL and LH sub-bands, which have low-frequency components in one of the x and y directions and high-frequency components in the other; and an HH sub-band, which has high-frequency components in both the x and y directions. Each sub-band has half as many pixels as the image before processing in both the vertical and horizontal directions, so a single filtering pass yields sub-band images whose resolution, that is, image size, is 1/4 that of the image before processing.

The wavelet transform unit 1030 filters the LL sub-band among the sub-bands thus obtained once more, dividing it further into the four sub-bands LL, HL, LH, and HH. The wavelet transform unit 1030 repeats this filtering a predetermined number of times, hierarchizes the original image into sub-band images, and outputs the wavelet transform coefficients of each sub-band. The quantization unit 1032 quantizes the wavelet transform coefficients output from the wavelet transform unit 1030 with a predetermined quantization width.
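The repeated four-way sub-band split can be sketched with the simplest possible filter pair. The example below assumes a Haar filter (pairwise average as the low-pass filter, pairwise half-difference as the high-pass filter) and image dimensions divisible by two at every level; JPEG2000 itself uses 5/3 or 9/7 filters, so this is an illustration of the sub-band structure only, and all function names are hypothetical.

```python
def haar_1d(seq):
    """Split a sequence of even length into low-pass and high-pass halves."""
    low = [(seq[i] + seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    high = [(seq[i] - seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    return low, high


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def filter_columns(matrix):
    """Apply haar_1d along the y direction of a matrix."""
    low_cols, high_cols = [], []
    for col in transpose(matrix):
        low, high = haar_1d(col)
        low_cols.append(low)
        high_cols.append(high)
    return transpose(low_cols), transpose(high_cols)


def split_subbands(image):
    """One filtering pass: return (LL, HL, LH, HH), each half the width
    and half the height of the input."""
    x_low, x_high = [], []
    for row in image:               # filtering in the x direction
        low, high = haar_1d(row)
        x_low.append(low)
        x_high.append(high)
    ll, lh = filter_columns(x_low)  # low in x; low or high in y
    hl, hh = filter_columns(x_high) # high in x; low or high in y
    return ll, hl, lh, hh


def decompose(image, levels):
    """Split the LL sub-band again at every level; return the final LL
    sub-band and the (HL, LH, HH) triple of each level."""
    details = []
    ll = image
    for _ in range(levels):
        ll, hl, lh, hh = split_subbands(ll)
        details.append((hl, lh, hh))
    return ll, details
```

For a 4 x 4 input, one pass yields four 2 x 2 sub-bands, and a second pass on LL yields four 1 x 1 sub-bands, matching the halving of the pixel count in each direction described above.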

The configuration and operation of the motion detection unit 1038 and the ROI setting unit 1040 are basically the same as those in the image decoding apparatuses 1100 and 1200 of embodiments 1 and 2 of execution mode 2. The differences are described below. The ROI setting unit 1040 outputs ROI setting information to the quantization unit 1032 and the coded data generation unit 1036, and ROI coding is then performed. One ROI coding technique is the max-shift method, in which the bit planes of the ROI transform coefficients corresponding to the ROI region of the image are scaled up by the maximum number of bits of the bit planes of the non-ROI transform coefficients. With this method, all bit planes of the ROI transform coefficients are coded before any bit plane of the non-ROI transform coefficients.

First, an example of ROI coding by the max-shift method is described. Figure 19(a) shows the quantized wavelet transform coefficients 1050, which comprise five bit planes from the MSB to the LSB.

The ROI setting unit 1040 sets the ROI region on the original image based on the position information of the ROI region and generates an ROI mask for specifying the ROI transform coefficients. The ROI transform coefficients are indicated by hatching in the wavelet transform coefficients 1050 of Figure 19(a).

Using this ROI mask, the quantization unit 1032 scales up the quantized ROI transform coefficients by S bits. That is, the values of the ROI transform coefficients are shifted left by S bits. Here, the scale-up amount S is a natural number larger than the number of bits of the maximum quantized value of the non-ROI transform coefficients. Figure 19(b) shows the wavelet transform coefficients 1052 in which the ROI transform coefficients have been scaled up by 5 bits. In the scaled-up wavelet transform coefficients 1052, the bits newly created by the scale-up are filled with zeros.
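The scale-up step can be sketched as a left shift applied only where the ROI mask is set. The function name max_shift, the data layout, and the validity check are assumptions made for illustration; the check accepts any S for which 2^S exceeds the largest non-ROI magnitude, which is the property that keeps every scaled-up ROI bit plane above every non-ROI bit plane.

```python
def max_shift(coeffs, roi_mask, s):
    """Shift ROI coefficients left by s bits; leave non-ROI coefficients as is."""
    max_non_roi = max(
        (abs(c)
         for row, mask_row in zip(coeffs, roi_mask)
         for c, in_roi in zip(row, mask_row)
         if not in_roi),
        default=0,
    )
    if (1 << s) <= max_non_roi:
        raise ValueError("s is too small to lift the ROI above the non-ROI bit planes")
    # The s low-order bits created by the shift are zeros, as in Figure 19(b).
    return [
        [(c << s) if in_roi else c for c, in_roi in zip(row, mask_row)]
        for row, mask_row in zip(coeffs, roi_mask)
    ]


# 5-bit coefficients scaled up by 5 bits; True marks an ROI coefficient.
scaled = max_shift([[3, 17], [9, 0]], [[True, False], [False, True]], 5)
```

After the shift, every nonzero ROI value is at least 2^S, so a bit-plane scan from the MSB side reaches all ROI bits before any non-ROI bit.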

As indicated by the arrows in Figure 19(c), the entropy coding unit 1034 entropy-codes the quantized values of the scaled-up wavelet transform coefficients 1052 while scanning them in order from the high-order bit plane.

The coded data generation unit 1036 obtains ROI setting information, such as the ROI position and the scale-up amount, from the ROI setting unit 1040, obtains information needed for header generation, such as the quantization width, from the entropy coding unit 1034, and generates a header. It also converts the entropy-coded data into a stream and outputs the coded image to the storage unit 1330. The coded image in the storage unit 1330 can then be output to a recording medium or a network. An SDRAM, a flash memory, a hard disk drive, or the like can be used as the recording medium.

As described above, when ROI coding is performed by the max-shift method, the bit planes of the ROI region are coded preferentially even if coding is stopped partway to reduce the code amount, so the image quality of the ROI region can be made higher than that of the non-ROI region.

Next, an example of ROI coding performed by reducing bit planes is described. The ROI setting unit 1040 generates an ROI mask by the method described with reference to Figure 14. After quantization, the quantization unit 1032 adjusts the number of low-order bits to be replaced with zeros in the bit strings of the wavelet transform coefficients corresponding to the non-ROI region, according to the priority set for image quality. Referring to the ROI mask generated by the ROI setting unit 1040, it replaces with zeros the S bits counted from the least significant bit in the bit strings of the non-ROI transform coefficients, which are not covered by the ROI mask. Here, the number of zero-substituted bits S is any natural number whose upper limit is the maximum number of bits of the quantized values in the non-ROI region. By varying S, the degree of degradation of the reproduced image quality of the non-ROI region relative to the ROI region can be adjusted continuously.

Figures 20(a) to 20(c) illustrate how the quantization unit 1032 replaces the low-order bits of the wavelet transform coefficients 1060 of the original image with zeros. Figure 20(a) shows the quantized wavelet transform coefficients 1060, which comprise five bit planes; the ROI transform coefficients are indicated by hatching.

As shown in Figure 20(b), the quantization unit 1032 replaces with zeros the S bits on the LSB side of the non-ROI transform coefficients that are not covered by the ROI mask. In this example, S = 2, and, as indicated by reference numeral 1064, wavelet transform coefficients 1062 are obtained in which the two bits on the LSB side of the non-ROI transform coefficients have been replaced with zeros. Instead of replacing the two low-order bit planes with zeros, the two low-order bit planes may simply be discarded.

As indicated by the arrows in Figure 20(c), the entropy coding unit 1034 entropy-codes the wavelet transform coefficients 1062, which comprise the ROI transform coefficients and the zero-substituted non-ROI transform coefficients, while scanning them in order from the high-order bit plane.
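The scan order of Figure 20(c) can be pictured by listing the bit planes from the MSB side. The helper below is a hypothetical illustration that only extracts the planes of coefficient magnitudes; the entropy coding itself is not modelled.

```python
def bit_planes(magnitudes, num_planes):
    """Return the bit planes of a flat list of magnitudes, ordered from the
    most significant plane to the least significant plane."""
    return [
        [(m >> p) & 1 for m in magnitudes]
        for p in range(num_planes - 1, -1, -1)
    ]


# First value: an ROI coefficient (0b10101). Second value: a non-ROI
# coefficient whose two low-order bits were already replaced with zeros (0b01100).
planes = bit_planes([21, 12], 5)
```

In the two lowest planes the non-ROI column contains only zeros, so those planes carry significant information for the ROI coefficient alone.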

The coded data generation unit 1036 generates a header based on coding parameters such as the quantization width. It also converts the entropy-coded data into a stream and outputs the stream to the storage unit 1330.

In general, when an upper limit is imposed on the data size of the final coded image because of restrictions on storage capacity or transfer rate, the entropy coding unit 1034, which codes the quantized wavelet transform coefficients in order from the high-order bit plane, sometimes stops coding at an intermediate bit plane in order to stay within the data size limit. Alternatively, the coded data generation unit 1036, which outputs the streamed coded data in order from the high-order bit plane, sometimes stops the stream output at an intermediate bit plane in order to stay within the transfer rate limit.

Even when the data size of the coded image is restricted in this way, the wavelet transform coefficients corresponding to the non-ROI region have been replaced with zeros in the low-order bit planes, so only the wavelet transform coefficients corresponding to the ROI region are coded there as significant information. The compression efficiency of the low-order bit planes is therefore high, and the data size does not become large even when coding proceeds down to the least significant bit plane.

As described above, the coding method that reduces bit planes does not scale up the ROI transform coefficients, so the coding computation is efficient. In addition, because the number of bit planes to be coded does not increase, no extra storage area needs to be provided, and hardware cost can be reduced.

Furthermore, because no scale-down processing is needed at decoding, the ROI position information need not be added to the header of the coded image, and the scale-up amount need not be added to the coded image. Moreover, an image ROI-coded by this method is no different in format from an ordinary coded image, so it can be decoded by the same processing as an ordinary coded image, and compatibility of the decoding processing is ensured.

As described above, according to the modification of embodiment 3 of execution mode 2, the code amount of the coded image can be reduced, in addition to the effects obtained when the decoding block 1340 decodes the ROI region preferentially to generate a decoded image in which the image quality differs between the ROI region and the non-ROI region.

Figure 21 shows a first example of the ROI region tracking process described herein. Figure 21(a) shows a state in which an object of interest to the user is specified in the image: person A, in whom the user is interested, is specified with a cross-shaped cursor. Figure 21(b) shows a state in which an ROI region is set in the image. The region enclosed by the frame is the ROI region. The ROI region may be initially set by a user operation, or may be initially set automatically to a predetermined region containing the specified object. Figure 21(c) shows a state in which person A has moved and is leaving the ROI region. Figure 21(d) shows a state in which the ROI region has followed the motion of person A. The motion vector of person A is detected, and the ROI region is moved accordingly.
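Moving the ROI region by the detected motion vector, as in Figure 21(d), can be sketched as follows. The Roi type, the function name track, and the clamping of the ROI region to the frame boundary are assumptions for illustration; the detection of the motion vector itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int
    height: int


def track(roi, motion, frame_width, frame_height):
    """Translate the ROI by the motion vector (dx, dy) of the tracked object.

    Assumption: the ROI keeps its size and is clamped so that it stays
    entirely inside the frame.
    """
    dx, dy = motion
    x = min(max(roi.x + dx, 0), frame_width - roi.width)
    y = min(max(roi.y + dy, 0), frame_height - roi.height)
    return Roi(x, y, roi.width, roi.height)


# Person A moves 5 pixels right and 3 pixels up between two frames.
moved = track(Roi(10, 10, 20, 20), (5, -3), 100, 100)
```

Calling track once per frame with the latest motion vector keeps the frame around the object of interest.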

Figure 22 shows a second example of the ROI region tracking process. Unlike the order in the first example, Figure 22(a) shows a state in which the user first sets an ROI region in the image. Of person A and person B, person A is set as the object of interest to the user. A plurality of ROI regions may be set. Figure 22(b) shows a state in which the object of interest to the user is specified within the ROI region. The object may be specified by the user or recognized automatically. Figure 22(c) shows a state in which person A has moved and the ROI region has followed this motion. Because person B has not been specified as an object of interest, the motion of person B does not affect the movement of the ROI region.

Figure 23 shows a third example of the ROI region tracking process. Figure 23(a) shows a state in which the range within which the ROI region performs tracking is set. The large frame in the figure indicates this range. Figure 23(b) shows a state in which the ROI region is set. The ROI region moves only within the large frame that has been set. Figure 23(c) shows a state in which person A has moved out of the large frame. Because the ROI region tracks person A only within the range of the large frame, tracking ends partway. When the object of interest leaves the large frame, processing such as ending the shooting may be performed. For example, a surveillance camera may need to record in particular any person who intrudes into a certain area, in which case it suffices to maintain the image quality of objects such as persons within that area. The third example is suited to such cases and can reduce the amount of processing further than the first and second examples.
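The bounded tracking of Figure 23 can be sketched by ending the tracking once the moved ROI region would no longer fit inside the large frame. The tuple layout (x, y, width, height), the function name track_within, and the use of None to signal the end of tracking are assumptions; the specification does not state the exact criterion for deciding that the object has left the range.

```python
def track_within(roi, motion, bounds):
    """Move roi by motion while it stays inside bounds.

    roi and bounds are (x, y, width, height) tuples. Returns the moved ROI,
    or None when the moved ROI would leave the bounds, meaning that
    tracking ends.
    """
    x, y, w, h = roi
    bx, by, bw, bh = bounds
    nx, ny = x + motion[0], y + motion[1]
    inside = (bx <= nx and by <= ny
              and nx + w <= bx + bw and ny + h <= by + bh)
    return (nx, ny, w, h) if inside else None
```

A caller could treat a None result as the point at which shooting is ended or the ROI setting is cancelled.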

The image pickup apparatus 1300 according to embodiment 3 of execution mode 2 can of course shoot a moving image and record it on a recording medium or the like while making the ROI region track the specified object. During this time, the user can cancel or set the ROI region by operating the operation unit 1360. When the ROI region is cancelled, the entire area of the image is coded at the same bit rate. Moving image shooting can also be paused and resumed by a user operation. Furthermore, while the ROI region is tracking the specified object, the user can take a still image by pressing the shutter button of the operation unit 1360 or the like. In this still image, the ROI region has high image quality and the non-ROI region has low image quality.

Execution mode 2 has been described above on the basis of embodiments. The embodiments are examples; those skilled in the art will understand that various modifications can be made to the combinations of their components and processing steps, and that such modifications are also within the scope of the invention. Such modifications are described below.

In the embodiments above, a coded stream of a moving image coded successively in the JPEG2000 scheme is decoded. However, the scheme is not limited to JPEG2000; any scheme capable of decoding a coded stream of a moving image may be used.

In the embodiments above, when the user sets a plurality of ROI regions in the ROI setting unit 1020 or 1040, a different image quality may be set for each ROI region. Various levels of image quality can be realized by adjusting the number of low-order bits of the non-ROI transform coefficients that are replaced with zeros.

In the embodiments above, the ROI region and the non-ROI region are given different image quality by replacing with zeros the low-order bits of the wavelet transform coefficients obtained by decoding the coded image. Alternatively, when each pass has been coded independently, a method of stopping the variable-length decoding partway may be adopted. In the JPEG2000 scheme, each coefficient bit in a bit plane is processed in one of three processing passes: the S pass (significance propagation pass), the R pass (magnitude refinement pass), and the C pass (cleanup pass). The S pass decodes insignificant coefficients that have significant coefficients around them, the R pass decodes significant coefficients, and the C pass decodes the remaining coefficients. The S pass, the R pass, and the C pass are ranked in this order by their degree of contribution to the image quality. The passes are executed in this order, and the context of each coefficient is determined by taking into account information on the neighboring coefficients. With this method, zero substitution is unnecessary, so the amount of processing can be reduced further.

In the embodiments above, the wavelet transform has been described as the spatial filtering used for image coding, but other spatial frequency transforms may be adopted. For example, even with the discrete cosine transform adopted in the JPEG standard, replacing the low-order bits of the transform coefficients of the non-region-of-interest with zeros in the same way sacrifices the image quality of the non-region-of-interest, improves the compression efficiency of the image as a whole, and at the same time relatively improves the image quality of the region of interest.

Claims

1. An image coding method, characterized in that

a region of interest is set in an image, the region of interest is made to track the motion of an object of interest in the image, and coding is performed with different image quality in the region of interest and in the remaining region.

2. An image coding apparatus, characterized by comprising:

a region-of-interest setting unit that sets a region of interest in an image;

a motion detection unit that detects the motion of an object of interest in the image; and

a coding unit that performs coding with different image quality in the region of interest and in the remaining region,

wherein the region-of-interest setting unit makes the region of interest track the motion of the object.

3. The image coding apparatus according to claim 2, characterized by further comprising an image quality setting unit that sets the image quality of the region other than the region of interest according to a specified code amount.

4. The image coding apparatus according to claim 2 or 3, characterized by further comprising an object extraction unit that separates the object from the background in a moving image whose viewpoint changes,

wherein the region-of-interest setting unit makes the region of interest track the motion of the object in accordance with the motion of the background.

5. An image pickup apparatus, characterized in that

it has an image pickup unit that acquires an image;

6. An image pickup apparatus, characterized by comprising:

an image pickup unit that acquires an image;

a region-of-interest setting unit that sets a region of interest in the image;

7. The image pickup apparatus according to claim 5 or 6, characterized by further comprising an image quality setting unit that sets the image quality of the region other than the region of interest according to a specified code amount.

8. The image pickup apparatus according to claim 5 or 6, characterized by further comprising an object extraction unit that separates the object from the background in a moving image whose viewpoint changes,

wherein the region-of-interest setting unit is able to make the region of interest track the motion of the object in accordance with the motion of the background.

9. An image decoding method, characterized in that a region of interest is set in an image, the region of interest is made to track the motion of an object of interest in the image, and a moving image is decoded with different image quality in the region of interest and in the remaining region.

10. An image decoding apparatus, characterized by comprising:

a region-of-interest setting unit that sets a region of interest in an image;

a motion detection unit that detects the motion of an object of interest in the image; and

a decoding unit that decodes a moving image with different image quality in the region of interest and in the remaining region;

11. The image decoding apparatus according to claim 10, characterized by further comprising an image quality setting unit that sets the image quality of at least one of the region of interest and the region other than the region of interest with reference to the state of the apparatus.

12. An image pickup apparatus, characterized in that it has an image pickup unit that acquires an image;

a region of interest is set in the image, the region of interest is made to track the motion of an object of interest in the image, and a moving image is displayed with different image quality in the region of interest and in the remaining region.

13. An image pickup apparatus, characterized by comprising:

an image pickup unit that acquires an image;

a motion detection unit that detects the motion of an object of interest in the image;

a coding unit that codes a moving image with different image quality in the region of interest and in the remaining region; and

a decoding unit that decodes the image data coded by the coding unit;

14. The image pickup apparatus according to claim 12 or 13, characterized by further comprising an image quality setting unit that sets the image quality of at least one of the region of interest and the region other than the region of interest with reference to the state of the apparatus.