JP2005341093A - Content adaptation apparatus, content adaptation system, and content adaptation method

Content adaptation apparatus, content adaptation system, and content adaptation method

Info

Publication number
JP2005341093A
JP2005341093A (application number JP2004155742A)
Authority
JP
Japan
Prior art keywords
video content
content
resolution
gaze area
conversion ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2004155742A
Other languages
Japanese (ja)
Inventor
Yuichi Izuhara
Yoshiaki Kato
Hirobumi Nishikawa
Fuminobu Ogawa
Shunichi Sekiguchi
Junichi Yokosato
Original Assignee
Mitsubishi Electric Corp
Priority date
Filing date
Publication date
Application filed by Mitsubishi Electric Corp
Priority to JP2004155742A
Publication of JP2005341093A
Application status: Pending

Abstract

PROBLEM TO BE SOLVED: In conventional video transcoding, the entire input image frame is the conversion target, so when the display screen of a playback terminal (such as a mobile terminal) is small, all subjects in the video are uniformly reduced. Spatially adaptive reproduction, such as limiting the video to a target region and displaying only that target part on the entire screen in high definition, is therefore difficult.

SOLUTION: This content adaptation apparatus extracts a gaze area in digital video content, calculates a resolution conversion ratio based on the resolution information of the input video content and the resolution designation information of the output video content, converts the input video content based on the calculated resolution conversion ratio and the extracted gaze area, and thereby converts the original video content into a form desirable for the playback terminal.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

  The present invention relates to a content adaptation apparatus and adaptation method used for transmission, storage, and reproduction techniques of digital multimedia data such as compressed moving image data.

  With the spread of the Internet and PCs in recent years and the digitization of information appliances such as DVD players and mobile phones, the formats of digital multimedia data (such as video encoding methods) are diversifying. At the same time, a wide variety of terminals that play back such multimedia data has emerged, including digital-broadcast-compatible televisions, portable video players, PCs, PDAs, and mobile phones. Many of these multimedia-compatible devices can connect to various networks, such as the Internet, mobile communication networks, and home networks, and functions for receiving and reproducing multimedia data over such networks are being realized.

  However, while the degrees of freedom on the terminal side have expanded, it remains extremely difficult to realize an environment in which a given piece of video content can be transmitted to, and played back online by, a wide range of terminals over a wide range of networks, owing to restrictions such as the processing performance of individual terminals and differences in the bandwidth of the connected networks. To improve this situation, video transcoding technology has been developed for converting video content compressed according to an international standard video coding scheme such as MPEG to a different bit rate or resolution (Japanese Patent Laid-Open No. 2001-268571). With such transcoding technology, video content can be converted to match the bandwidth of the network to which a terminal is connected, or to match the video resolution (number of pixels, frames per second) that the terminal can receive.

JP 2001-268571 A

However, in conventional video transcoding the entire input image frame is the conversion target. For example, when the display screen of the playback terminal is small (as with a mobile terminal), all subjects in the video are uniformly reduced, and spatially adaptive reproduction, in which only the area to be watched in the video is displayed in high definition on the entire screen, is difficult.
To perform such display, it is necessary either 1) to receive the video content with the entire frame reduced and enlarge a part of it on the playback terminal side, or 2) to receive the video content at its original resolution and have the terminal reduce the entire frame or cut out a part as appropriate. Neither can be called a preferable solution from the viewpoints of information transmission efficiency toward the playback terminal and processing load on the playback terminal side.
To solve these problems, it is desirable to perform conversion by adaptively determining the image area to be converted when video transcoding is executed.

  An object of the present invention is to make it possible to convert original video content into the format most desirable for the playback terminal, based on the display restrictions of the video content playback terminal and its requirements for video playback.

  In the present invention, a gaze area in the digital video content is extracted, a resolution conversion ratio is calculated based on the resolution information of the input video content and the resolution designation information of the output video content, and the input video content is converted based on the calculated resolution conversion ratio and the extracted gaze area.

  According to the present invention, only the gaze area of the video content requested by the content playback terminal can be adapted to a resolution that the playback terminal can play back, so flexible playback can be realized without unnecessary image processing on the content playback terminal side.

Embodiment 1
In the present embodiment, a content adaptation system including a content adaptation device according to the present invention is assumed as a desirable example for describing a specific implementation configuration of the present invention. The content adaptation system in the present embodiment as shown in FIG. 1 includes a network 3, a content server 1, a content playback terminal 4, and a content adaptation device 5 connected to the network 3.
The content server 1 stores a plurality of high-resolution and high-quality video contents, and distributes the video contents 2 in accordance with an external request through the network 3. The content playback terminal 4 requests the content server 1 for video content to be played back, and receives and plays back the video content distributed through the network 3. Here, it is assumed that the content reproduction terminal 4 cannot receive the video content transmitted from the content server 1 in the same video format. In particular, regarding the horizontal / vertical size (number of pixels, number of scanning lines) of each video frame, the size that can be received by the content reproduction terminal 4 is smaller than the size of the video content transmitted from the content server 1.

In such a situation, the content adaptation device 5 adapts the video content sent from the content server 1 to a format suitable for playback on the content playback terminal 4 and sends it as converted video content 6. Through the content adaptation device 5, the content playback terminal 4 can receive and play back the video content requested of the content server 1 as the converted video content 6.
Based on such a system, the configuration and operation of the content adaptation device 5 of the present embodiment will be described in detail below. The content adaptation device 5 according to the present embodiment generates the converted video content 6 while changing the image area to be converted in the input video content 2 based on the designation from the terminal side.
By using such a content adaptation device 5, the content reproduction terminal 4 can receive and reproduce only the video information it really needs, at the minimum necessary cost. In the following description, the converted video content 6 is referred to as "output video content" from the standpoint of the operation of the content adaptation device 5.

The content adaptation device 5 according to the present embodiment has a function of automatically detecting, in the input video content, a region with high activity with respect to a predetermined video feature amount and extracting a rectangular image region containing it. It also has a function of outputting, according to the designation from the user, output video content that matches the display size of the content playback terminal 4, by selecting either reducing the entire input video content or cutting out the high-activity region while maintaining the resolution of the input video content.
Various kinds of information can be defined as the video feature amount; in this embodiment, attention is paid particularly to information on motion in the input video content.
An example of the specific functions realized by the content adaptation device 5 of the present embodiment will be described with reference to FIG. 2. In FIG. 2, the area indicated by the broken-line rectangle is assumed to have been extracted as a high-activity area of the input video content (not shown in the figure, but an object with movement, such as a person, is assumed to exist in the broken-line area).

The content adaptation device 5 converts the original resolution of the input video content into the resolution of the output video content. At this time, according to the content adaptation device 5 of this embodiment, it is possible to select between a method that generates the output video content by reducing the entire frame of the input video content and a method that generates the output video content by cutting out the high-activity area indicated by the broken line at the frame size of the output video content, without down-converting the resolution.
As a result, the flexibility of content playback on the content playback terminal 4 side is improved. Whichever method is selected, the number of pixels transmitted from the content adaptation device 5 to the content reproduction terminal 4 is the same, so no useless information needs to be transmitted to the playback terminal 4.

  In addition, the content adaptation device 5 assumes the MPEG format, widely used as an international standard, for the input and output video content. In the MPEG format, each frame of video content is divided into rectangular blocks of 16 pixels × 16 scanning lines called macroblocks, and digital compression coding is performed in units of macroblocks. Therefore, in principle, the image resolution of the input video content is an integer multiple of 16 in both the horizontal and vertical directions. The content adaptation device 5 exploits the fact that both the input and the output are in the MPEG format, and reuses the compression-coded data of each macroblock of the input video content in the output video content generation processing, thereby improving the efficiency of the conversion processing.
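Since both input and output are macroblock-aligned, the macroblock grid of a frame follows directly from its resolution. The following is a minimal illustrative sketch (the function name is hypothetical, not from the patent):

```python
def macroblock_grid(width, height, mb_size=16):
    """Return the number of macroblocks per row and per column.

    MPEG frames are coded in 16 x 16 macroblocks, so the resolution is
    assumed to be an integer multiple of 16 in both directions, as the
    text notes.
    """
    if width % mb_size or height % mb_size:
        raise ValueError("resolution must be an integer multiple of 16")
    return width // mb_size, height // mb_size

# A CIF frame (352 x 288) contains 22 x 18 macroblocks.
print(macroblock_grid(352, 288))
```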

FIG. 3 shows the internal configuration of the content adaptation apparatus 5 in the present embodiment. FIG. 4 shows a processing flow of the content adaptation apparatus 5 of FIG.
First, the content adaptation apparatus 5 receives the input video content 2 and the input video content resolution information 7 from the content server 1, and the output video content resolution designation information 8 and the gaze area reproduction instruction information 9 from the content reproduction terminal 4 as inputs.
The output video content resolution designation information 8 and the gaze area reproduction instruction information 9 may be transferred to the content adaptation device 5 through the content server 1. The content adaptation apparatus 5 confirms the gaze area reproduction instruction information 9 and switches the subsequent processing based on the state of the information (step S1).

When the gaze area reproduction instruction information 9 indicates "reproduce the gaze area", the video data of the input video content 2 is analyzed and the image area to serve as the gaze area is extracted (step S2). This processing is performed by the gaze area extraction unit 10.
In this embodiment, the gaze area extraction result is the coordinate information of the macroblock position at which the gaze area starts (conversion processing start point 11). When the gaze area reproduction instruction information 9 indicates that the gaze area is not to be reproduced, this processing is not performed; equivalently, the position of the first macroblock of the input video content is set as the conversion processing start point 11. Details of the process will be described later.

  If it is determined in step S1 that the gaze area is not to be reproduced, the process proceeds to the resolution conversion ratio determination process (step S3). This process is performed by the resolution conversion ratio determination unit 12. Specifically, the resolution conversion ratio 13 between the input and output video content is obtained from the input video content resolution information 7 and the output video content resolution designation information 8, and, based on the resolution conversion ratio 13, the mapping ratio from the macroblock coding information of the input video content to the macroblock information of the output video content is obtained. When the gaze area reproduction instruction information 9 indicates "reproduce the gaze area", this processing is not performed, and the resolution conversion ratio 13 is equivalently "1". Details will be described later.

Next, the input video content is converted using the two parameters, the conversion processing start point 11 and the resolution conversion ratio 13 (step S4). This processing is performed by the video content conversion unit 14. As basic processing, each frame of the input video content is temporarily decoded, macroblock information of the output video content is generated using the macroblock information of the input video content extracted from the compression-encoded data during decoding, and the output video content is generated by re-encoding the decoded images of the input video content using that result.
In this process, the conversion processing start point 11 designates the point at which the mapping of the macroblock information of the input video content starts, and the resolution conversion ratio 13 designates how many pieces of macroblock information of the input video content are mapped together. These parameters are indispensable for the area-adaptive video content adaptation realized by the content adaptation device 5 in the present embodiment.

Hereinafter, each process will be described in detail.
Gaze area extraction process (step S2)
The gaze area extraction process in the gaze area extraction unit 10, i.e., the process of determining the conversion processing start point 11, will be described. In the present embodiment, a plurality of candidate gaze areas is determined in advance as an area group. The gaze area is defined as an image area in the input video content whose motion activity is larger than a predetermined threshold. The motion activity is defined as the regional sum ΣE_k of the motion-vector magnitudes E_k (k: the macroblock index within a frame) obtained from the macroblock-encoded data of the input video content.
Applying ΣE_k to areas starting from every macroblock in the frame would increase the computational cost, so the starting-point candidates are limited to a few positions determined in advance. A specific example is shown in FIG. 5. The largest rectangle is the frame of the input video content. FIG. 5(a) shows the case where the resolution of the output video content is exactly half the resolution of the input video content, and FIG. 5(b) the case where it is exactly one quarter. A cross indicates the starting point of a gaze area candidate. Which of (a) and (b) is selected depends on the output video content resolution designation information 8 input to the content adaptation device 5. Each gaze area candidate has a size that is an integral multiple of the macroblock size both horizontally and vertically. Of course, these are only example settings, and various other setting methods are conceivable. The gaze area extraction unit obtains the activity ΣE_k for each gaze area candidate set as shown in FIG. 5 and extracts the area with the largest activity as the gaze area.
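The candidate-selection step above can be sketched as follows; this is an illustrative reading of the ΣE_k comparison with hypothetical names, not the patent's implementation:

```python
def extract_gaze_area(motion_magnitudes, mb_cols, candidates, area_w, area_h):
    """Return the candidate start point whose gaze area has the largest
    motion activity (regional sum of motion-vector magnitudes E_k).

    motion_magnitudes: per-macroblock |motion vector| values, raster order.
    candidates: (x, y) start points in macroblock units.
    area_w, area_h: gaze-area size in macroblocks.
    """
    def activity(x0, y0):
        return sum(motion_magnitudes[(y0 + dy) * mb_cols + (x0 + dx)]
                   for dy in range(area_h) for dx in range(area_w))
    return max(candidates, key=lambda p: activity(*p))

# 4 x 4 macroblock frame, 2 x 2 gaze areas; motion concentrated bottom-right.
e = [0] * 16
for k in (10, 11, 14, 15):
    e[k] = 5
print(extract_gaze_area(e, 4, [(0, 0), (2, 0), (0, 2), (2, 2)], 2, 2))
```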

  Note that, because the gaze area candidates determined as in FIG. 5 are spatially dispersed, switching the gaze area frame by frame produces an unsightly reproduced video. To avoid such frequent changes, the gaze area is switched only on some trigger (a gaze area change trigger), for example when a gaze area change request is explicitly issued from the content server 1 or the content reproduction terminal 4, or when a different gaze area candidate keeps an extremely high activity for a predetermined time interval. Although not shown, the gaze area change trigger may be information detected while analyzing the input video content inside the gaze area extraction unit 10, or information supplied to the gaze area extraction unit 10 as an external signal.

Resolution conversion ratio determination process (step S3)
The process of determining the resolution conversion ratio 13 in the resolution conversion ratio determination unit 12 will be described. In general, in a system such as that of FIG. 1, when the input video content resolution information 7 >= the output video content resolution designation information 8 and the two do not match, it is desirable to convert the resolution of the input video content to 1/2^R (R: a positive integer) equally in the horizontal and vertical directions, in order to facilitate the mapping between the macroblock information of the input video content and that of the output video content.
For example, when the input video content resolution is CIF (352 pixels × 288 scanning lines) and the output video content resolution is QCIF (176 pixels × 144 scanning lines), R is exactly 1. In this case four macroblocks of the input video content correspond to exactly one macroblock of the output video content, and the mapping of macroblock information between input and output is easy. This R is the resolution conversion ratio 13 determined from the input video content resolution information 7 and the output video content resolution designation information 8.
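One plausible rule for deriving R, consistent with the CIF-to-QCIF example above, is the smallest R for which the input scaled by 1/2^R no longer exceeds the designated output resolution in either direction. A sketch with a hypothetical function name, not the patent's stated algorithm:

```python
def resolution_conversion_ratio(in_w, in_h, out_w, out_h):
    """Smallest R such that the input resolution scaled by 1/2^R does
    not exceed the designated output resolution in either direction."""
    r = 0
    while in_w >> r > out_w or in_h >> r > out_h:
        r += 1
    return r

# CIF (352 x 288) -> QCIF (176 x 144): one halving, so R = 1.
print(resolution_conversion_ratio(352, 288, 176, 144))
# BT.601 (704 x 480) -> QCIF: R = 2, leaving a vertical mismatch (120 vs 144).
print(resolution_conversion_ratio(704, 480, 176, 144))
```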

  In practical applications, R takes a value of only about 1 to 3 at most. When R is 4, the image size is 1/16 of the input video content resolution in each of the horizontal and vertical directions; even if the input video content is HDTV, the output is then smaller than the resolution of small devices such as mobile phones. Moreover, as R increases, the accuracy of mapping the compression-encoded data between input and output decreases, which is undesirable in terms of conversion performance.

In practice, however, there is the problem that the output video content resolution designation information 8 does not always match 1/2^R of the input video content resolution.
For example, suppose the input video content is compression-encoded 704-pixel × 480-scanning-line video in the ITU-R BT.601 format, and QCIF (176 pixels × 144 scanning lines) is specified as the output video content resolution designation information 8. Both resolutions are integer multiples of the macroblock size horizontally and vertically, and in the horizontal direction 704 / 2^2 = 176 (R = 2). In the vertical direction, however, 480 / 2^2 = 120, which does not match the QCIF designated by the output video content resolution designation information 8. Moreover, since the number of scanning lines obtained with R = 2 in the vertical direction is not a multiple of 16, it is not suitable for re-encoding in units of macroblocks.

Accordingly, in addition to the process of determining R as the resolution conversion ratio 13, the resolution conversion ratio determination process must verify whether the result of scaling the input video content resolution information 7 by 1/2^R is a multiple of 16, and the final output video content must be converted based on that result. In the present embodiment this latter processing is performed as part of the processing in the video content conversion unit 14, and will be described concretely later in the description of the video content conversion process.

Video content conversion process (step S4)
Next, video content conversion processing in the video content conversion unit 14 will be described.
As described above, the basic processing here is to temporarily decode each frame of the input video content, generate macroblock information of the output video content using the macroblock information of the input video content extracted from the compression-encoded data during decoding, and generate the output video content by re-encoding the decoded images of the input video content using that result.
In this re-encoding process, the conversion processing start point 11 from the gaze area extraction unit 10 designates the point at which the mapping of the macroblock information of the input video content starts, and the resolution conversion ratio 13 from the resolution conversion ratio determination unit 12 defines how many pieces of macroblock information of the input video content are mapped together; re-encoding is performed using these parameters to generate the output video content.

In the following, a specific procedure in the video content conversion process will be described.
When only a specific image area in each frame of the input video content is converted as a result of the gaze area extraction, special treatment is required for the macroblocks located at the edges of that image area. A motion vector contained in such a macroblock may point inside the frame in the input video content but outside the frame in the output video content. If it is used as it is, the encoding efficiency at the screen edges of the output video content may decrease.

  There are various ways to solve this problem. For example, a motion vector may be searched for again only when the result of mapping a motion vector of the input video content to the output video content points outside the screen of the output video content. Also, when resolution conversion is involved (for example, when R = 1), four macroblocks of the input video content are mapped to one macroblock of the output video content; among the motion vectors contained in those four macroblocks, a vector pointing inside the screen may be selected as the motion vector of the output video content.
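The in-frame vector selection described above might be sketched as follows (hypothetical names; a real transcoder would fall back to a motion re-search rather than the zero vector):

```python
def pick_output_motion_vector(candidates, mb_x, mb_y, frame_w, frame_h,
                              mb_size=16):
    """Choose, from the motion vectors of the four input macroblocks
    mapped to one output macroblock, a vector (dx, dy) in pixels whose
    reference block still lies inside the output frame."""
    x0, y0 = mb_x * mb_size, mb_y * mb_size
    for dx, dy in candidates:
        if (0 <= x0 + dx and x0 + dx + mb_size <= frame_w and
                0 <= y0 + dy and y0 + dy + mb_size <= frame_h):
            return dx, dy
    return 0, 0  # none stays in-frame: a re-search would be done instead

# Top-left macroblock of a QCIF frame: (-8, 0) points off-screen, (4, 4) fits.
print(pick_output_motion_vector([(-8, 0), (4, 4)], 0, 0, 176, 144))
```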

Further, as described in the resolution conversion ratio determination process (step S3) above, in practice the output video content resolution designation information 8 does not necessarily match 1/2^R of the input video content resolution. The video content conversion process in such a case is described below. The apparatus verifies whether the result of scaling the input video content resolution information 7 by 1/2^R is a multiple of 16, and performs the final output video content conversion based on that result.

First, the maximum provisional resolution is obtained that does not exceed either the result of downscaling the input video content resolution 7 by the resolution conversion ratio 13 determined by the resolution conversion ratio determination unit 12 or the output video content resolution designation information 8, and that is an integral multiple of the macroblock size. Re-encoding is then performed at this provisional resolution, and the encoded data corresponding to the difference between the provisional resolution and the resolution in the output video content resolution designation information 8 is filled with dummy encoded data.
For example, if the input video content resolution 7 indicates a QVGA resolution of 320 pixels × 240 scanning lines and the output video content resolution designation information 8 is QCIF (176 pixels × 144 scanning lines), the provisional values are 160 in the horizontal direction and 112 in the vertical direction. By adding dummy data for one macroblock (16 pixels) in the horizontal direction and two macroblocks (32 scanning lines) in the vertical direction, the resolution of the final output video content matches the output video content resolution designation information 8 (see FIG. 6).
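The provisional-resolution computation in this example can be sketched as follows (hypothetical function, following the QVGA-to-QCIF numbers above):

```python
def provisional_resolution(in_w, in_h, out_w, out_h, r, mb_size=16):
    """Largest macroblock-aligned resolution not exceeding either the
    1/2^R-downscaled input resolution or the designated output
    resolution, plus the number of dummy macroblocks padding each axis."""
    def fit(scaled, designated):
        prov = (min(scaled, designated) // mb_size) * mb_size
        return prov, (designated - prov) // mb_size
    (pw, pad_x), (ph, pad_y) = fit(in_w >> r, out_w), fit(in_h >> r, out_h)
    return (pw, ph), (pad_x, pad_y)

# QVGA (320 x 240) -> QCIF (176 x 144) with R = 1:
# provisional 160 x 112, padded by 1 macroblock horizontally, 2 vertically.
print(provisional_resolution(320, 240, 176, 144, 1))
```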

  The dummy encoded data can be realized, for example, by adding data encoding monochromatic data such as black or gray in intra-coded (intra-frame) frames, and, in inter-coded (inter-frame prediction) frames, by directly applying encoded data forcibly set to a coding mode that copies the dummy data information of the intra-coded frame. A predetermined background image or the like may be used instead of the monochromatic data, but monochromatic data is more desirable in terms of encoding efficiency because it eliminates the AC components to be encoded.

  Depending on the encoding method, macroblocks may be encoded and decoded using information from neighboring macroblocks for prediction. When inserting dummy encoded data in such a case, the encoded data must be generated while predicting the operation of the decoder, which cannot recognize that dummy data has been deliberately inserted. Doing this strictly for every macroblock increases the processing load of the content adaptation device 5 itself. In general, moving picture coding systems such as MPEG define structures such as slices and video packets that break the neighborhood dependency of encoding, in order to suppress the propagation of transmission errors through video compression data that has undergone intra-frame / inter-frame prediction or variable-length coding in the temporal and spatial directions. Therefore, by arranging the syntax so that the dummy encoded data forms a slice or video packet separate from the original video data, the content adaptation device 5 can be configured without generating dummy encoded data that must anticipate the neighborhood-dependent operation of the decoder.

  With the content adaptation device 5 described above, the content reproduction terminal 4 can not only reproduce the requested video content at its reproducible resolution, but can also reproduce video content consisting only of the gaze area at that resolution. Flexible playback is therefore realized without any image processing on the content playback terminal 4 side, and since the content playback terminal 4 always receives video content of the same resolution, adaptive playback is achieved without receiving and reproducing useless information, which is also preferable from the viewpoint of information transmission efficiency.

Embodiment 2
As a specific application example that brings out the effect of the content adaptation device 5, consider a video surveillance system in which the content server 1 accumulates videos from a plurality of surveillance cameras, generates a video composited into one screen, compression-encodes it, and transmits it as the input video content.
In this case, the input video content to the content adaptation device 5 semantically contains as many gaze areas as composited screens. For simplicity, assume input video content in which four surveillance camera videos, each at the playback resolution of the content playback terminal 4, are composited. The abnormality detection alarm of a monitoring camera is then used as the gaze area change trigger described above. That is, the abnormality detection alarm may be information detected while analyzing the input video content inside the gaze area extraction unit 10, or information supplied to the gaze area extraction unit 10 as an external signal (for example, according to a supervisor's instruction). In this case, the gaze area change trigger includes information instructing the area change and designating the area to change to.

  In such a case, the content adaptation device 5 normally transmits the video composited from the plurality of surveillance cameras after converting its resolution to half in each of the horizontal and vertical directions so as to match the resolution of the content playback terminal 4. When an abnormality is detected by a certain camera, the alarm is received, the gaze area where that camera's video is composited is identified, and only that gaze area is transmitted to the content reproduction terminal 4 in high definition at the original camera resolution. As a result, the playback terminal 4 obtains a higher-definition image of the area to be watched than when the four areas are composited, so the watched object can be monitored in detail.
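The alarm-to-gaze-area mapping in this surveillance example reduces to locating the quadrant of the composite frame in which the alarming camera's video sits. A hypothetical sketch for a 2 × 2 composite:

```python
def quadrant_start(camera_index, frame_w_mb, frame_h_mb):
    """Start point (in macroblocks) of the gaze area covering one camera
    in a 2 x 2 composite frame (cameras numbered 0..3 in raster order)."""
    half_w, half_h = frame_w_mb // 2, frame_h_mb // 2
    return (camera_index % 2) * half_w, (camera_index // 2) * half_h

# 44 x 36 macroblock composite: camera 3 occupies the bottom-right quadrant.
print(quadrant_start(3, 44, 36))
```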

Embodiment 3
As another example of a content adaptation system using the content adaptation device 5 of the present invention, the system configuration shown in FIG. 7 can be adopted. In this case, the content server 1 incorporates the content adaptation device 5, so that the content server 1 itself provides the functions of the content adaptation device 5.

Embodiment 4
In the above embodiments, when the gaze area reproduction instruction information 9 indicates "reproduce the gaze area" (Y in step S1), the resolution conversion ratio determination process (step S3) is not performed and the resolution conversion ratio 13 is equivalently "1". This cuts out the gaze area at the frame size of the output video content and generates the output video content without down-converting the resolution. In the present invention, however, even when the gaze area is reproduced, the output video content may be generated by converting the gaze area to the resolution corresponding to the output video content resolution designation information 8.

  The flowchart for this case is shown in FIG. 8. After the gaze area extraction process (step S2), the process proceeds to the resolution conversion ratio determination process (step S3), where the resolution conversion ratio of the gaze area corresponding to the playback terminal display resolution 8 is determined. In this case as well, as described in the first embodiment, the post-conversion resolution based on the resolution conversion ratio of the gaze area may not match the output video content resolution designation information 8, so in the video content conversion process (step S4) the same re-encoding processing as in the first embodiment is performed.
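  The ratio determination and the macroblock-alignment handling described here (and formalized in claim 4) can be sketched roughly as below. The function names, the use of a single uniform ratio, and the 16-pixel MPEG macroblock size are assumptions for illustration; exact rational arithmetic is used so the rounding is deterministic.

```python
from fractions import Fraction

MB = 16  # MPEG macroblock size in pixels (assumption)


def conversion_ratio(in_w, in_h, out_w, out_h):
    """Single uniform ratio that fits the gaze area into the output frame.

    Corresponds to deriving the resolution conversion ratio 13 from the
    input resolution information 7 and the designation information 8.
    """
    return min(Fraction(out_w, in_w), Fraction(out_h, in_h))


def plan_conversion(in_w, in_h, out_w, out_h):
    """Return (scaled_w, scaled_h, pad_w, pad_h).

    The scaled size is the largest macroblock multiple not exceeding either
    the ratio-converted size or the designated output size (the "provisional
    value"); the remaining difference up to the output size would be filled
    with dummy encoded data during re-encoding.
    """
    r = conversion_ratio(in_w, in_h, out_w, out_h)
    scaled_w = min(int(in_w * r), out_w) // MB * MB
    scaled_h = min(int(in_h * r), out_h) // MB * MB
    return scaled_w, scaled_h, out_w - scaled_w, out_h - scaled_h
```

  For example, fitting a 720x480 gaze area into a 176x144 (QCIF) output frame gives a 176x112 converted picture plus a 32-line band of dummy data, whereas a clean 2:1 case such as 352x288 into 176x144 needs no padding.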

  In the description of the above embodiments, the video feature quantity used as the key for gaze area extraction is the motion activity (the magnitude of motion in the input video content), but any information that can trigger gaze area extraction may be used. For example, color in the input video content may be used as the video feature quantity to detect a specific noteworthy color; in this case, a gaze area is extracted only when a region containing the specific color appears in the input video content. Furthermore, if a function of extracting a human face area from the input video content is available, an area containing a person, or a video area containing a specific person identified from facial features, can be set as the gaze area.
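  The motion-activity keying described above (formalized in claims 2 and 3, where the activity of a candidate region is the sum of the motion-vector magnitudes of its macroblocks) could be sketched as follows; the data structures and function names are illustrative assumptions, not the patented implementation.

```python
import math


def activity(region, motion_vectors):
    """Sum of motion-vector magnitudes of the macroblocks inside a region.

    region: set of macroblock (row, col) coordinates forming one
            predetermined gaze area candidate.
    motion_vectors: dict mapping (row, col) -> (mvx, mvy) from the
            MPEG-encoded input video content.
    """
    return sum(math.hypot(*motion_vectors[mb])
               for mb in region if mb in motion_vectors)


def extract_gaze_area(candidates, motion_vectors):
    """Pick the predetermined candidate region with the highest activity."""
    return max(candidates, key=lambda r: activity(r, motion_vectors))
```

  Swapping `activity` for a color-histogram or face-detection score would implement the alternative keys mentioned above without changing the surrounding selection logic.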

  The video content conversion process has been described focusing on resolution conversion, but it goes without saying that the content adaptation apparatus may also convert the data format, data bit rate, and the like of the content.

  Note that the above description of the embodiments also discloses a content adaptation method. Each device may also be realized as software operating on a computer.

FIG. 1 is a configuration diagram showing a content adaptation system in Embodiment 1.
FIG. 2 is an explanatory diagram illustrating an example of adaptation processing by the content adaptation apparatus.
FIG. 3 is a configuration diagram showing the configuration of the content adaptation device 5.
FIG. 4 is a flowchart showing the processing flow of the content adaptation apparatus.
FIG. 5 is an explanatory diagram showing a specific example of gaze area setting.
FIG. 6 is an explanatory diagram showing a specific example of video content conversion processing including resolution conversion.
FIG. 7 is a configuration diagram showing another example of the content adaptation system.
FIG. 8 is a flowchart showing the processing flow of the content adaptation apparatus.

Explanation of symbols

DESCRIPTION OF SYMBOLS
1 Content server
2 Input video content
3 Network
4 Content reproduction terminal
5 Content adaptation apparatus
6 Output video content
7 Input video content resolution information
8 Output video content resolution designation information
9 Gaze area reproduction instruction information
10 Gaze area extraction unit
11 Conversion processing start point
12 Resolution conversion ratio determination unit
13 Resolution conversion ratio
14 Video content conversion unit

Claims (13)

  1. A content adaptation apparatus comprising:
    a gaze area extraction unit that extracts a gaze area in digital video content;
    a resolution conversion ratio calculation unit that calculates a resolution conversion ratio based on resolution information of the input video content and output video content resolution designation information; and
    a video content conversion unit that performs conversion processing of the input video content based on the resolution conversion ratio calculated by the resolution conversion ratio calculation unit and the gaze area extracted by the gaze area extraction unit.
  2. The content adaptation apparatus according to claim 1, wherein the gaze area extraction unit stores a group of areas that are predetermined gaze area candidates for a frame image of the input video content data, quantifies the image feature amount included in each area of the area group as an activity, and extracts the area with the highest activity as the gaze area.
  3. The content adaptation apparatus according to claim 2, wherein both the input video content and the output video content are MPEG-encoded video data, the candidate areas for the gaze area are each composed of an integral multiple of a macroblock in both the horizontal and vertical directions, and the activity is calculated as the sum of the magnitudes of the motion vectors of the macroblocks included in a gaze area candidate region.
  4. The content adaptation apparatus according to claim 1, wherein both the input video content and the output video content are MPEG-encoded video data, and when the video content conversion unit converts the resolution of the input video content according to the input resolution conversion ratio and the converted resolution does not match an integer multiple of the macroblock in either the horizontal or vertical direction, a maximum provisional value is obtained among resolutions that are integer multiples of the macroblock and are lower than both the result of converting the resolution of the input video content based on the resolution conversion ratio and the resolution of the designated output video content, and the image data corresponding to the difference between the resolution of the designated output video content and the provisional value is filled with dummy encoded data.
  5. The content adaptation apparatus according to claim 1, wherein the control signal for controlling the driving of the gaze area extraction unit is a signal notified as a command requesting execution of extraction from the content reproduction terminal.
  6. The content adaptation apparatus according to claim 1, wherein the control signal for controlling the driving of the gaze area extraction unit is a signal indicating whether or not the activity calculated in the gaze area extraction unit exceeds a predetermined threshold value.
  7. A content adaptation system that adapts digital video content according to the playback capability of a content playback terminal, comprising:
    a content server that delivers digital video content in response to a request from a content playback terminal;
    a content adaptation device that receives the digital video content distributed by the content server, converts its image resolution, and outputs it as output video content; and
    a content playback terminal that requests digital video content from the content server and receives the output video content from the content adaptation device,
    wherein the content adaptation device includes: a gaze area extraction unit that extracts a gaze area in the digital video content; a resolution conversion ratio calculation unit that calculates a resolution conversion ratio based on resolution information of the input video content and output video content resolution designation information; and a video content conversion unit that performs conversion processing of the input video content based on the resolution conversion ratio calculated by the resolution conversion ratio calculation unit and the gaze area extracted by the gaze area extraction unit.
  8. A content adaptation method comprising:
    a gaze area extraction step of extracting a gaze area in digital video content;
    a resolution conversion ratio calculation step of calculating a resolution conversion ratio based on resolution information of the input video content and output video content resolution designation information; and
    a video content conversion step of performing conversion processing of the input video content based on the resolution conversion ratio calculated in the resolution conversion ratio calculation step and the gaze area extracted in the gaze area extraction step.
  9. The content adaptation method according to claim 8, wherein in the gaze area extraction step, the image feature amount included in each area of a group of areas that are predetermined gaze area candidates for a frame image of the input video content data is quantified as an activity, and the area with the largest activity is extracted as the gaze area.
  10. The content adaptation method according to claim 9, wherein both the input video content and the output video content are MPEG-encoded video data, the candidate areas for the gaze area are each composed of an integral multiple of a macroblock in both the horizontal and vertical directions, and the activity is calculated as the sum of the magnitudes of the motion vectors of the macroblocks included in a gaze area candidate region.
  11. The content adaptation method according to claim 8, wherein both the input video content and the output video content are MPEG-encoded video data, and when the video content conversion step converts the resolution of the input video content according to the input resolution conversion ratio and the converted resolution does not match an integer multiple of the macroblock in either the horizontal or vertical direction, a maximum provisional value is obtained among resolutions that are integer multiples of the macroblock and are lower than both the result of converting the resolution of the input video content based on the resolution conversion ratio and the resolution of the designated output video content, and the image data corresponding to the difference between the resolution of the designated output video content and the provisional value is filled with dummy encoded data.
  12. The content adaptation method according to claim 8, wherein the gaze area extraction step is controlled based on a command requesting execution of extraction from the content reproduction terminal.
  13. The content adaptation method according to claim 8, wherein the gaze area extraction step is controlled based on whether or not the activity calculated in the gaze area extraction step exceeds a predetermined threshold value.
JP2004155742A 2004-05-26 2004-05-26 Contents adaptating apparatus, contents adaptation system, and contents adaptation method Pending JP2005341093A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2004155742A JP2005341093A (en) 2004-05-26 2004-05-26 Contents adaptating apparatus, contents adaptation system, and contents adaptation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2004155742A JP2005341093A (en) 2004-05-26 2004-05-26 Contents adaptating apparatus, contents adaptation system, and contents adaptation method

Publications (1)

Publication Number Publication Date
JP2005341093A true JP2005341093A (en) 2005-12-08

Family

ID=35494165

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2004155742A Pending JP2005341093A (en) 2004-05-26 2004-05-26 Contents adaptating apparatus, contents adaptation system, and contents adaptation method

Country Status (1)

Country Link
JP (1) JP2005341093A (en)


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009247564A (en) * 2008-04-04 2009-10-29 Namco Bandai Games Inc Game animation distribution system
JP2009247562A (en) * 2008-04-04 2009-10-29 Namco Bandai Games Inc Game animation distribution system
WO2012060459A1 (en) * 2010-11-01 2012-05-10 日本電気株式会社 Dynamic image distribution system, dynamic image distribution method, and dynamic image distribution program
JP5811097B2 (en) * 2010-11-01 2015-11-11 日本電気株式会社 Moving image distribution system, moving image distribution method, and moving image distribution program
US9414065B2 (en) 2010-11-01 2016-08-09 Nec Corporation Dynamic image distribution system, dynamic image distribution method and dynamic image distribution program
JP2013115597A (en) * 2011-11-29 2013-06-10 Canon Inc Image processor
WO2015034061A1 (en) * 2013-09-06 2015-03-12 三菱電機株式会社 Video encoding device, video transcoding device, video encoding method, video transcoding method and video stream transmission system
JPWO2015034061A1 (en) * 2013-09-06 2017-03-02 三菱電機株式会社 Moving picture coding apparatus, moving picture transcoding apparatus, moving picture coding method, moving picture transcoding method, and moving picture stream transmission system
WO2017142354A1 (en) * 2016-02-19 2017-08-24 알카크루즈 인코포레이티드 Method and system for gpu based virtual reality video streaming server
US9912717B2 (en) 2016-02-19 2018-03-06 Alcacruz Inc. Systems and method for virtual reality video conversion and streaming
US10334224B2 (en) 2016-02-19 2019-06-25 Alcacruz Inc. Systems and method for GPU based virtual reality video streaming server

Similar Documents

Publication Publication Date Title
US7054366B2 (en) Systems and methods for MPEG subsample decoding
JP4637585B2 (en) Mosaic program guide generation device and mosaic program guide generation method
TWI475891B (en) Image processing apparatus and method, program and recording medium
EP1878260B1 (en) Method for scalably encoding and decoding video signal
JP2011101411A (en) Method for sub-pixel value interpolation
US7362804B2 (en) Graphical symbols for H.264 bitstream syntax elements
US9414082B1 (en) Image decoding device and method thereof using inter-coded predictive encoding code
JP4611640B2 (en) Method for encoding motion in a video sequence
JP4369090B2 (en) Method for encoding and decoding video information, motion compensated video encoder and corresponding decoder
JP5089658B2 (en) Transmitting apparatus and transmitting method
US20030095603A1 (en) Reduced-complexity video decoding using larger pixel-grid motion compensation
US6771704B1 (en) Obscuring video signals for conditional access
JP4414345B2 (en) Video streaming
JP2010525658A (en) Adaptive reference image data generation for intra prediction
JP4682410B2 (en) Image processing apparatus and image processing method
KR101336244B1 (en) System and method for introducing virtual zero motion vector candidates in areas of a video sequence involving overlays
US8218638B2 (en) Method and system for optical flow based motion vector estimation for picture rate up-conversion
EP1605706A2 (en) Advanced video coding (AVC) intra prediction scheme
JP2003309851A (en) Video data converter and video data converting method
US20110026591A1 (en) System and method of compressing video content
JP3979897B2 (en) Video coding bitstream transcoding method
JP4987322B2 (en) Moving picture decoding apparatus and moving picture decoding method
US20130089265A1 (en) Method for encoding/decoding high-resolution image and device for performing same
JP4755093B2 (en) Image encoding method and image encoding apparatus
JP5037938B2 (en) Image encoding / decoding device, encoding / decoding program, and encoding / decoding method

Legal Events

Date        Code  Description                                  Intermediate code
2007-02-27  A621  Written request for application examination  A621
2009-01-13  A977  Report on retrieval                          A971007
2009-01-27  A131  Notification of reasons for refusal          A131
2009-03-27  A521  Written amendment                            A523
2010-02-09  A02   Decision of refusal                          A02