KR20120100448A - Apparatus and method for video encoder using machine learning - Google Patents
Apparatus and method for video encoder using machine learning
- Publication number
- KR20120100448A
- Authority
- KR
- South Korea
- Prior art keywords
- video
- image
- unit
- mode
- optimal
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/96—Tree coding, e.g. quad-tree coding
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention relates to a method and apparatus for compressing and decompressing moving images. Machine learning is applied to motion prediction and macroblock mode determination, the parts that require the most resources when compressing and decompressing a video. The invention provides an image compression apparatus and method that reduce the compression time of the motion prediction part by 40% or more and the overall image compression time by 20% or more, while maintaining the same level of compression efficiency and image quality as the existing method.
Description
H.264, or AVC (Advanced Video Coding), is the latest video compression technology and offers superior compression efficiency, image quality, and bit rate compared to previous compression technologies. It has been widely commercialized through digital TV and is used in applications such as video telephony, video conferencing, DVD, games, and 3D TV, so its market is enormous. Research institutes are also accelerating the development of related technologies. The technology is standardized in stages and is developed jointly by the ITU's Video Coding Experts Group (VCEG) and ISO/IEC's Moving Picture Experts Group (MPEG), which provide reference software for verifying and comparing the developed algorithms. Compared to previous standards, however, the motion prediction method of H.264 is much more complicated: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, and 4x4 modes are used. As a result, more than 70% of the resources required for image compression are consumed by motion prediction and macroblock mode determination.
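To make the cost of mode determination more concrete, the following minimal sketch lists the seven inter-prediction partition sizes defined by H.264 and counts how many blocks of each size tile one 16x16 macroblock. The names `PARTITION_MODES` and `partitions_per_macroblock` are illustrative and do not appear in the patent.

```python
# Illustrative only: these names are not taken from the patent.
# The seven H.264 inter-prediction partition sizes (width, height) in pixels.
PARTITION_MODES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]

MACROBLOCK_SIZE = 16


def partitions_per_macroblock(width, height):
    """Number of width x height blocks needed to tile one 16x16 macroblock."""
    return (MACROBLOCK_SIZE // width) * (MACROBLOCK_SIZE // height)


for w, h in PARTITION_MODES:
    print(f"{w}x{h}: {partitions_per_macroblock(w, h)} partition(s)")
```

An encoder that evaluates every mode has to run a motion search for each of these partitions, which gives a rough sense of why this stage dominates the encoding cost.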
As described above, video compression technology is applied to many kinds of products and forms a huge market worldwide. In H.264/AVC, the latest video compression technology, motion prediction and macroblock mode determination consume more than 70% of the resources required for the complete video compression process. It is therefore necessary to make the motion prediction and macroblock mode decision part more efficient. The purpose of the present invention is to provide a solution that reduces the resources consumed by the motion prediction and macroblock mode decision part by more than 40% while maintaining the same compression efficiency, image quality, and bit rate.
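As a back-of-the-envelope check on these figures, the sketch below estimates the overall saving when one stage of the encoder becomes cheaper. It assumes that the cost of every other stage is unchanged and treats "resources" as a single additive cost; the function name `overall_saving` is hypothetical.

```python
def overall_saving(stage_share, stage_saving):
    """Fraction of the total encoding cost saved when one stage gets cheaper.

    stage_share  -- fraction of the total cost spent in the stage (0..1)
    stage_saving -- fraction of that stage's cost that is removed (0..1)
    Assumes every other stage of the encoder costs the same as before.
    """
    return stage_share * stage_saving


# Figures quoted in the text: the stage takes 70% of the cost and is cut by 40%.
saving = overall_saving(0.70, 0.40)
print(f"overall saving: {saving:.0%}")
```

Under these assumptions a 40% reduction in a stage that accounts for 70% of the cost gives an overall saving of about 28%.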
The apparatus consists of: a macroblock feature value extraction unit, which extracts feature values such as the average and variance of macroblocks of each size from an input video; a decision tree calculation unit, which uses the feature values obtained for each macroblock to search for a position in a mode decision tree determined offline; an optimal mode determination unit, which extracts the optimal mode information for the current macroblock using the decision tree value to which the feature values belong; and an image compression unit, which uses the existing video compression method but performs video compression with the macroblock mode obtained from the mode decision tree.
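The feature extraction step can be sketched as follows. This is a minimal illustration, assuming that the features are the mean and population variance of the luma samples in each block. The choice of block sizes and the names `block_features` and `macroblock_feature_vector` are assumptions, not details given in the patent.

```python
from statistics import fmean, pvariance


def block_features(macroblock, width, height):
    """Return (mean, variance) for every width x height block in the macroblock.

    macroblock is a square list of rows of luma samples (for example 16x16).
    Blocks are visited in raster order.
    """
    size = len(macroblock)
    features = []
    for top in range(0, size, height):
        for left in range(0, size, width):
            samples = [macroblock[y][x]
                       for y in range(top, top + height)
                       for x in range(left, left + width)]
            features.append((fmean(samples), pvariance(samples)))
    return features


def macroblock_feature_vector(macroblock, sizes=((16, 16), (8, 8), (4, 4))):
    """Concatenate per-block features for a few block sizes (hypothetical choice)."""
    vector = []
    for width, height in sizes:
        for mean, var in block_features(macroblock, width, height):
            vector.extend((mean, var))
    return vector


# Example: a flat macroblock has zero variance at every block size.
flat = [[128] * 16 for _ in range(16)]
print(block_features(flat, 16, 16))
```

A macroblock whose variance is high at 16x16 but low inside each 8x8 block is the kind of pattern such features could expose to a mode decision tree.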
As described above, the video compression technique proposed in the present invention has the advantage of reducing the resources required for video compression by 40% or more compared to H.264, the latest video compression technique.
Therefore, in PC-based environments where video compression is used, such as video conferencing, telemedicine, and video telephony, the time required for video compression can be minimized. The PC's computing resources are also used more efficiently.
Even when the technique is implemented as a dedicated ASIC, for example in a digital TV or a video phone, simplifying the computation-heavy video compression part makes the ASIC easier to implement and shortens development time. In addition, reducing the gate count of the ASIC allows a small, low-cost, low-power chip, which can be used effectively in portable video telephony environments that require low power.
FIG. 1 is a block diagram of a video compression device using machine learning according to an embodiment of the present invention.
FIG. 2 is an internal view of offline mode decision tree generation according to an embodiment of the present invention.
FIG. 3 is an example of determining macroblock information in an image according to an embodiment of the present invention.
FIG. 4 is an internal configuration example of a video selection input unit according to an embodiment of the present invention.
FIG. 5 is an internal configuration example of a video selection input unit according to an embodiment of the present invention.
FIG. 6 is an internal configuration example of a video selection input unit according to an embodiment of the present invention.
FIG. 7 is a motion prediction time comparison table for 14 types of test images according to an embodiment of the present invention.
FIG. 8 is a motion prediction time comparison chart for 14 types of test images according to an embodiment of the present invention.
FIG. 9 is a file size comparison chart for 14 types of test images according to an embodiment of the present invention.
FIG. 10 is a PSNR comparison chart for 14 types of test images according to an embodiment of the present invention.
FIG. 11 is an MSE comparison chart for 14 types of test images according to an embodiment of the present invention.
FIG. 12 is an SSIM comparison chart for 14 types of test images according to an embodiment of the present invention.
In accordance with an aspect of the present invention, there is provided a video compression technique and apparatus using machine learning, including: a macroblock feature value extraction unit for extracting feature values, such as an average value and a variance, of macroblocks of each size from an input video; a decision tree calculation unit for searching for a position in a mode decision tree determined offline, using the feature values obtained for each macroblock; an optimal mode determiner for extracting optimal mode information for the current macroblock using the decision tree value to which the feature values belong; and an image compressor which uses the existing video compression method but performs video compression with the macroblock mode obtained from the mode decision tree.
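The lookup performed by the decision tree calculation unit and the optimal mode determiner can be sketched as a walk from the root of a pre-built tree to a leaf. The node layout, the thresholds, and the meaning of each feature index below are made up for illustration; the patent does not specify them.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a mode decision tree (hypothetical layout)."""
    feature: Optional[int] = None   # index into the feature vector; None at a leaf
    threshold: float = 0.0          # go left when the feature value <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    mode: Optional[str] = None      # partition mode stored at a leaf


def predict_mode(node: Node, features: Sequence[float]) -> str:
    """Walk from the root to a leaf and return the mode stored there."""
    while node.mode is None:
        node = node.left if features[node.feature] <= node.threshold else node.right
    return node.mode


# Toy tree with made-up thresholds: feature 0 stands for the 16x16 variance,
# feature 1 for the largest 8x8 variance.
TREE = Node(feature=0, threshold=50.0,
            left=Node(mode="16x16"),
            right=Node(feature=1, threshold=400.0,
                       left=Node(mode="8x8"),
                       right=Node(mode="4x4")))

print(predict_mode(TREE, [12.0, 3.0]))
```

The point of the design is that each macroblock costs only a handful of comparisons at encoding time, because the expensive search for good thresholds has already been done offline.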
The advantages and features of the present invention, and the manner of achieving them, will become apparent with reference to the embodiments described in detail below together with the accompanying drawings. The present invention may, however, be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art; the invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout.
In the following description of the present invention, a detailed description of known functions and configurations will be omitted when it may obscure the subject matter of the present invention. The terms below are defined in consideration of their functions in the embodiments of the present invention and may vary depending on the intention or custom of the user or operator. Therefore, their definitions should be based on the contents of this specification as a whole.
Combinations of the blocks in the accompanying block diagrams may be performed by computer program instructions. These computer program instructions may be loaded onto a processor of a general-purpose computer, special-purpose computer, or other programmable data processing equipment, so that the instructions executed by the processor create means for performing the functions described in each block of the block diagram. The computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in that memory produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagram. The computer program instructions may also be loaded onto a computer or other programmable data processing equipment so that a series of operating steps is performed on it to create a computer-implemented process; the instructions executed on that equipment thereby provide steps for executing the functions described in each block of the block diagram.
Each block may also represent a module, segment, or portion of code that includes one or more executable instructions for executing the specified logical function(s). It should also be noted that in some alternative embodiments the functions noted in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the corresponding function.
Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 is a block diagram of a video compression apparatus using machine learning according to an embodiment of the present invention, a
Referring to FIG. 1, the
[Table 1] shows the input video formats, which range from low-resolution 176x144 video, widely used as a video transmission format for mobile devices such as mobile phones, through 1280x720 high-definition video, up to 1920x1080 Full HD video used in large-screen display devices. These image signals are output to the
The macro block
In this manner, the macro
Referring to FIG. 2, an image provided by the video
The
The final training data learned through the mode learner generates the
The mode information provided by the optimal image compression unit and the feature value information provided by the macro block feature value extractor are important in mode learning in the
That is, as in the example of FIG. 3, an arbitrary image may or may not contain all seven macroblock modes. Even if all seven modes are present, the information needed for learning may still be insufficient. Therefore, various kinds of video input are required to secure sufficient data for learning.
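One way to check whether a set of training clips covers all seven modes is to count the mode labels produced by the optimal encoder and flag those that occur too rarely. The sketch below is only an illustration: the minimum count, the label strings, and the name `underrepresented_modes` are assumptions.

```python
from collections import Counter

# The seven macroblock partition modes, written as label strings (hypothetical format).
ALL_MODES = ("16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4")


def underrepresented_modes(labels, min_count):
    """Return the modes that occur fewer than min_count times in labels."""
    counts = Counter(labels)
    return [mode for mode in ALL_MODES if counts[mode] < min_count]


# Labels as they might come out of an optimal encode of one nearly static clip.
labels = ["16x16"] * 90 + ["16x8"] * 6 + ["8x8"] * 4
print(underrepresented_modes(labels, min_count=5))
```

In this made-up example a nearly static clip yields almost only 16x16 labels, which illustrates why clips with fast motion and complex shapes are also needed.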
The video
The video input selector extracts not only the four types of video input shown in FIG. 4 but also, as shown in FIG. 5, images with very high complexity and images with very low complexity. By providing the
Referring back to FIG. 1, the
Using the mode information provided by the
The
FIGS. 6 to 12 illustrate the video compression performance using the video
FIGS. 10 to 12 compare the image quality of the conventional method with that of the method proposed in this patent; the difference cannot be perceived by the human eye.
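For reference, two of the quality metrics named in the figures, MSE and PSNR, can be computed as in the following minimal sketch for 8-bit samples. It uses plain nested lists rather than any particular image library, and SSIM is left out because it needs windowed local statistics.

```python
import math


def mse(reference, decoded):
    """Mean squared error between two equally sized 2-D sample arrays."""
    total = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            total += (r - d) ** 2
            count += 1
    return total / count


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(reference, decoded)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


ref = [[100, 100], [100, 100]]
dec = [[101, 99], [100, 102]]
print(mse(ref, dec), round(psnr(ref, dec), 2))
```

A lower MSE and a higher PSNR both indicate that the decoded frame is closer to the original.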
In addition, while preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications may be made by those skilled in the art to which the invention belongs without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or scope of the present invention.
Description of Reference Numerals
100: video input unit
200: video selection input unit
210: low speed motion image providing unit
220: high speed motion image providing unit
230: complex form image providing unit
240: simple form image providing unit
250: image selector
300: macroblock feature value extractor
400: decision tree calculator
500: optimal mode determiner
600: image compression unit
700: optimal image compression unit
800: compressed video
900: mode learning unit
1000: optimal decision tree
Claims (6)
A macroblock feature value extraction unit for extracting feature value information of a macroblock from a video provided by the video input unit;
A decision tree calculator configured to provide mode decision information by using feature value information provided by the macro block feature value extractor and an offline optimally generated decision tree;
An optimum mode determiner for providing an optimal mode information to the image compressor from the mode decision information provided by the decision tree calculator;
And an image compressor for generating a compressed video using the video provided by the video input unit and the optimal mode information provided by the optimal mode determiner.
Video compression device using machine learning.
The decision tree calculation unit using an optimal decision tree generated offline,
A video selection input unit providing an image simultaneously to an optimum image compression unit and a macroblock feature extraction unit;
A macroblock feature value extraction unit for extracting feature value information of a macroblock from a video provided by the video selection input unit;
An optimal image compression unit which performs optimal image compression from a video provided by the video selection input unit;
A mode learner configured to determine a mode of a macroblock by using feature value information provided by the macroblock feature value extractor and optimal compression mode information provided by the optimal image compressor;
An optimal decision tree for generating an optimal decision tree using mode information provided by the mode decision unit;
Video compression device using machine learning.
The video selection input unit,
A low-speed motion image providing unit having almost no motion between the images,
A high speed motion image providing unit having a very large motion between the images;
A complex image providing unit having a complicated shape in the image;
A simple form image providing unit having a simple form in an image,
And an image selecting unit for selectively outputting images provided by the various kinds of image providing units.
Video compression device using machine learning.
A macroblock feature value extraction step of extracting feature value information of a macroblock from a video provided in the video input step;
A decision tree calculation step of providing mode decision information using the feature value information provided in the macro block feature value extraction step and an optimal decision tree generated offline;
An optimal mode determination step of providing optimal mode information to the image compression step from the mode determination information provided in the decision tree calculating step;
And a video compression step of generating a compressed video using the video provided in the video input step and the optimal mode information provided in the optimal mode determination step.
Video compression method using machine learning.
The decision tree calculation step using the optimal decision tree generated offline,
A video selection input step of providing an image at the same time as an optimal video compression step and a macroblock feature value extraction step;
A macroblock feature value extraction step of extracting feature value information of a macroblock from a video provided in the video selection input step;
An optimal image compression step of performing optimal image compression from a video provided in the video selection input step;
A mode learning step of determining a mode of a macroblock by using the feature value information provided in the macroblock feature value extracting step and the optimum compression mode information provided in the optimum image compression step;
An optimal decision tree for generating an optimal decision tree using the mode information provided in the mode decision step;
Video compression method using machine learning.
The video selection input step,
Providing a low-speed motion image with little motion between the images,
Providing a high-speed motion image having a large motion between the images;
Providing a complex shape image having a complicated shape in the image,
Providing a simple form image having a simple form in the image,
And an image selecting step of selectively outputting images provided by the various kinds of image providing units.
Video compression method using machine learning.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020110019350A KR20120100448A (en) | 2011-03-04 | 2011-03-04 | Apparatus and method for video encoder using machine learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020110019350A KR20120100448A (en) | 2011-03-04 | 2011-03-04 | Apparatus and method for video encoder using machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20120100448A true KR20120100448A (en) | 2012-09-12 |
Family
ID=47110181
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020110019350A KR20120100448A (en) | 2011-03-04 | 2011-03-04 | Apparatus and method for video encoder using machine learning |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20120100448A (en) |
2011
- 2011-03-04 KR KR1020110019350A patent/KR20120100448A/en not active (Application Discontinuation)
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11115673B2 (en) | 2017-10-19 | 2021-09-07 | Samsung Electronics Co., Ltd. | Image encoder using machine learning and data processing method of the image encoder |
US11694125B2 (en) | 2017-10-19 | 2023-07-04 | Samsung Electronics Co., Ltd. | Image encoder using machine learning and data processing method of the image encoder |
KR101998839B1 (en) | 2018-01-03 | 2019-07-10 | 중앙대학교 산학협력단 | Data compression system and method for image-based embedded wireless sensor network |
KR20200073078A (en) * | 2018-12-13 | 2020-06-23 | 주식회사 픽스트리 | Image processing device of learning parameter based on machine Learning and method of the same |
KR20200073079A (en) * | 2018-12-13 | 2020-06-23 | 주식회사 픽스트리 | Image processing device of learning parameter based on machine Learning and method of the same |
CN113489997A (en) * | 2021-05-27 | 2021-10-08 | 杭州博雅鸿图视频技术有限公司 | Motion vector prediction method, motion vector prediction device, storage medium and terminal |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111066326B (en) | Machine learning video processing system and method | |
CN105900420B (en) | Select motion vector accuracy | |
TWI569629B (en) | Techniques for inclusion of region of interest indications in compressed video data | |
CN109565587B (en) | Method and system for video encoding with context decoding and reconstruction bypass | |
US10440366B2 (en) | Method and system of video coding using content based metadata | |
CN106165418B (en) | Equipment, method and the computer-readable media of coded video data | |
CN107005697B (en) | Method and system for entropy coding using look-up table based probability updating for video coding | |
US10051271B2 (en) | Coding structure | |
Shen et al. | Ultra fast H. 264/AVC to HEVC transcoder | |
WO2021042957A1 (en) | Image processing method and device | |
CN108495135A (en) | A kind of fast encoding method of screen content Video coding | |
KR20120100448A (en) | Apparatus and method for video encoder using machine learning | |
CN115866356A (en) | Video watermark adding method, device, equipment and storage medium | |
CN106165420A (en) | For showing the system and method for the Pingdu detection of stream compression (DSC) | |
CN104754362A (en) | Image compression method using fine division block matching | |
CN110495178A (en) | The device and method of 3D Video coding | |
WO2021147463A1 (en) | Video processing method and device, and electronic apparatus | |
CN111225214B (en) | Video processing method and device and electronic equipment | |
CN105100799A (en) | Method for reducing intraframe coding time delay in HEVC encoder | |
JP2012147290A (en) | Image coding apparatus, image coding method, program, image decoding apparatus, image decoding method, and program | |
WO2023164020A2 (en) | Systems, methods and bitstream structure for video coding and decoding for machines with adaptive inference | |
WO2022179600A1 (en) | Video coding method and apparatus, video decoding method and apparatus, and electronic device | |
CN106878754A (en) | A kind of 3D video depths image method for choosing frame inner forecast mode | |
US20210321093A1 (en) | Method and system of video coding with efficient intra block copying | |
CN107483936B (en) | A kind of light field video inter-prediction method based on macro pixel |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E601 | Decision to refuse application | ||
E601 | Decision to refuse application | ||
E601 | Decision to refuse application |