CN107666612A - Block content classification method based on convolutional neural networks - Google Patents

Block content classification method based on convolutional neural networks

Info

Publication number
CN107666612A
Authority
CN
China
Prior art keywords
block
neural networks
convolutional neural
content
content type
Prior art date
Application number
CN201711049706.7A
Other languages
Chinese (zh)
Inventor
Chen Zhibo (陈志波)
Ye Shurui (叶淑睿)
Original Assignee
University of Science and Technology of China (中国科学技术大学)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China
Priority to CN201711049706.7A
Publication of CN107666612A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals

Abstract

The invention discloses a block content classification method based on convolutional neural networks, comprising: building a data set and using the content type as the label of each training sample; building a convolutional neural network, converting each training sample into a grayscale image, representing each pixel of the grayscale image with an eight-bit binary number, extracting the last bit (least significant bit) of each pixel as the input of the convolutional neural network, and obtaining a last-bit convolutional neural network model through training; when predicting an input N×N coding block, using the last-bit convolutional neural network model to predict the content type of the current coding block: if the output is a camera-captured block, the classification result is obtained; if the output is a computer-generated block, prediction is continued with the last-bit convolutional neural network model to obtain the classification result of a computer-generated text block or a computer-generated non-text block. The method can improve the accuracy and computational efficiency of content type prediction, thereby reducing redundant computation and improving compression quality.

Description

Block content classification method based on convolutional neural networks

Technical field

The present invention relates to the technical field of video coding, and more particularly to a block content classification method based on convolutional neural networks.

Background art

Convolutional neural networks, as a class of deep learning algorithms, have been widely used in the fields of image classification and pattern recognition. At the same time, the Screen Content Coding (SCC) extension of High Efficiency Video Coding (HEVC) adopts the palette mode (Palette) and the intra block copy mode (IBC) to improve coding efficiency, which inevitably brings very high encoder complexity.

Predicting the content type of each coding unit is a crucial step. Although some existing work uses low-level features, such as gradient, variance, entropy and number of colors, for the classification of coding blocks, the accuracy of content type prediction for coding blocks in the related art still needs to be improved.
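
For context only, the kind of low-level block features mentioned above can be computed as in the following sketch; this is an illustration rather than part of the patent, and the NumPy-based implementation and function name are assumptions:

```python
import numpy as np

def low_level_block_features(block: np.ndarray) -> dict:
    """Illustrative low-level features for an 8-bit grayscale block (H x W)."""
    b = block.astype(np.float64)
    # Mean gradient magnitude from horizontal and vertical differences
    gradient = np.abs(np.diff(b, axis=1)).mean() + np.abs(np.diff(b, axis=0)).mean()
    # Entropy of the 256-bin intensity histogram
    hist, _ = np.histogram(block, bins=256, range=(0, 256))
    p = hist / hist.sum()
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return {
        "gradient": gradient,                      # texture strength
        "variance": b.var(),                       # intensity spread
        "entropy": entropy,                        # histogram entropy
        "num_colors": int(len(np.unique(block))),  # distinct intensity values
    }
```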

Summary of the invention

An object of the present invention is to provide a block content classification method based on convolutional neural networks, which can improve the accuracy and computational efficiency of content type prediction.

The purpose of the present invention is achieved through the following technical solutions:

A block content classification method based on convolutional neural networks, comprising:

building a data set, and using the content type as the label of each training sample;

building a convolutional neural network; converting each training sample into a grayscale image, representing each pixel of the grayscale image with an eight-bit binary number, and extracting the last bit (least significant bit) of each pixel as the input of the convolutional neural network; obtaining a last-bit convolutional neural network model through training;

when predicting an input N×N coding block, first using the last-bit convolutional neural network model to predict the content type of the current coding block: if the output is a camera-captured block, the classification result is obtained; if the output is a computer-generated block, prediction is continued with the last-bit convolutional neural network model to obtain the classification result of a computer-generated text block or a computer-generated non-text block.

As can be seen from the above technical solution provided by the present invention, the content type of each coding unit is predicted according to a pre-trained convolutional neural network model, and the prediction result has high accuracy. In addition, by using the prediction result as a pre-processing method, the solution can easily be combined with fast mode decision and rate control modules to guide coding mode selection and rate control, thereby reducing redundant computation and improving compression quality.

Brief description of the drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

Fig. 1 is a schematic diagram of converting the grayscale image of an original image into the input of the convolutional neural network according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the block content classification method based on convolutional neural networks according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of content type prediction for coding blocks of different sizes according to an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, rather than all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

Several new coding tools, including the palette mode (Palette) and intra block copy (Intra Block Copy, IBC), have been added to the existing screen content coding standard. These tools significantly improve compression quality, but at the same time considerably increase encoder complexity. Finding an effective method that both maintains compression quality and saves encoding time is therefore extremely important and promising. The block content classification method based on convolutional neural networks proposed by the present invention, used as a pre-processing method, can easily be combined with fast mode decision and rate control modules to solve this problem.

The block content classification method based on convolutional neural networks provided by an embodiment of the present invention mainly includes the following steps:

Step 1: build a data set, and use the content type as the label of each training sample.

Step 2: build a convolutional neural network; convert each training sample into a grayscale image, represent each pixel of the grayscale image with an eight-bit binary number, and extract the last bit (least significant bit) of each pixel as the input of the convolutional neural network; obtain the last-bit convolutional neural network model through training.

As shown in Fig. 1, which is a schematic diagram of converting the grayscale image of a training sample (i.e., the original image) into the input of the convolutional neural network, Fig. 1(a) is the grayscale image converted from the original RGB image, and Fig. 1(b) is the last-bit map after conversion.

It can be seen from Fig. 1(b) that the last-bit distributions of camera-captured content and computer-generated content are highly distinguishable: the camera-captured region presents disordered speckle noise, while the computer-generated region largely reflects the texture of the original image. Therefore, distinguishing camera-captured content from computer-generated content through the last-bit map reduces the classification difficulty and improves the classification accuracy.
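
As a concrete illustration of Step 2 (a sketch only, not code from the patent; the grayscale conversion weights, function names and input tensor shape are assumptions), the last-bit map can be obtained by converting the RGB image to an 8-bit grayscale image and keeping the least significant bit of each pixel:

```python
import numpy as np

def rgb_to_gray(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 uint8 RGB image to an 8-bit grayscale image (BT.601 weights assumed)."""
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return gray.astype(np.uint8)

def last_bit_map(gray: np.ndarray) -> np.ndarray:
    """Each pixel is an eight-bit binary number; keep only its last (least significant) bit."""
    return (gray & 1).astype(np.float32)  # values in {0, 1}, used as the CNN input

# Example: build the CNN input for one training sample (hypothetical array)
# sample_rgb = ...                                        # H x W x 3 uint8
# x = last_bit_map(rgb_to_gray(sample_rgb))[None, None]   # shape (1, 1, H, W)
```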

Those skilled in the art will understand that Fig. 1(a) and Fig. 1(b) are mainly intended to show the difference between the grayscale image and the last-bit map, and the difference between camera-captured content and computer-generated content in the last-bit map; the blurred text, figures and other images in the drawings are only examples and are not limiting, and the blurring does not affect the completeness of the present invention.

Step 3: when an input N×N coding block is to be predicted, first use the last-bit convolutional neural network model to predict the content type of the current coding block; if the output is a camera-captured block, the classification result is obtained; if the output is a computer-generated block, continue to use the last-bit convolutional neural network model for prediction to obtain the classification result of a computer-generated text block or a computer-generated non-text block.

As shown in Fig. 2, the classification prediction is divided into two steps. In the first step, camera-captured blocks and computer-generated blocks are distinguished through the last-bit map; in the second step, a computer-generated block is further subdivided into a computer-generated text block or a computer-generated non-text block through its grayscale image. The details are as follows, and an illustrative sketch of the two-step procedure is given after the list:

The current coding block is converted into a grayscale image; each pixel of the grayscale image is then represented by an eight-bit binary number, and the last bit of each pixel is extracted to obtain the corresponding last-bit map; the last-bit convolutional neural network model is then used to predict the content type from the last-bit map;

If the output is a camera-captured block, the classification result is obtained and the process terminates;

If the output is a computer-generated block, the corresponding grayscale block is extracted from the grayscale image according to the position information of the computer-generated block, and the last-bit convolutional neural network model is then used to predict the content type of the grayscale block; the output classification result is a computer-generated text block or a computer-generated non-text block.
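
The two-step procedure can be summarized in the following sketch (not code from the patent); the label strings, the .predict() interface and the separate stage1_model and stage2_model objects are assumptions that stand in for the trained last-bit convolutional neural network model used in both stages:

```python
import numpy as np

# Hypothetical content-type labels used only for this sketch
CAMERA, CG_TEXT, CG_NON_TEXT = "camera", "cg_text", "cg_non_text"

def classify_block(gray_block: np.ndarray, stage1_model, stage2_model) -> str:
    """Two-step content classification of one N x N grayscale coding block (uint8)."""
    # Step 1: camera-captured vs. computer-generated, predicted from the last-bit map
    lsb = (gray_block & 1).astype(np.float32)
    if stage1_model.predict(lsb[None, None]) == CAMERA:
        return CAMERA
    # Step 2: for a computer-generated block, text vs. non-text, predicted from the grayscale block
    gray = gray_block.astype(np.float32) / 255.0
    return stage2_model.predict(gray[None, None])  # CG_TEXT or CG_NON_TEXT
```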

Those skilled in the art will understand that Fig. 2 only schematically shows the classification prediction process; the illustrations involved in the figure are only examples and are not limiting, and their representation does not affect the completeness of the present invention.

In addition, for an input coding block that is not N×N, there are the following two cases:

If the size is larger than N×N, the prediction is made according to the content types of all the N×N coding blocks contained inside it. If the content types of all the N×N coding blocks are the same, the corresponding content type is taken as the content type of the input non-N×N coding block; otherwise, the content type of the input non-N×N coding block is marked as a mixed content block;

If the size is smaller than N×N, its content type is considered to be the same as the content type of the N×N coding block where it is located.

For example, N may be set to 32. As shown in Fig. 3, the 64×64 coding block on the left of Fig. 3 is denoted as a; the content types of its four 32×32 coding blocks are not all the same, so coding block a is marked as a mixed content block. The 32×32 coding block in the lower-left corner on the right of Fig. 3 is denoted as b and is a camera-captured block; the 32×32 coding block in the upper-right corner on the right of Fig. 3 is a computer-generated text block, so its internal 16×16 coding block c and its internal 8×8 coding block d are also computer-generated text blocks.
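
A minimal sketch of how a content type is assigned to blocks larger or smaller than N×N from the N×N predictions follows; N = 32 as in the example above, and the label strings are assumptions:

```python
from typing import List

N = 32  # value of N assumed from the example above

def type_of_larger_block(sub_block_types: List[str]) -> str:
    """Content type of a block larger than N x N, from its N x N sub-block predictions."""
    if len(set(sub_block_types)) == 1:
        return sub_block_types[0]   # all N x N predictions agree
    return "mixed"                  # otherwise mark as a mixed content block

def type_of_smaller_block(enclosing_type: str) -> str:
    """A block smaller than N x N inherits the type of its enclosing N x N block."""
    return enclosing_type

# Example matching Fig. 3: a 64 x 64 block whose four 32 x 32 sub-blocks disagree
print(type_of_larger_block(["camera", "cg_text", "cg_text", "cg_text"]))  # -> "mixed"
```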

After content classification is performed on the input coding blocks, coding mode selection and rate control can also be carried out, as follows:

1. Coding mode selection.

In the embodiment of the present invention, coding mode selection is performed according to the size of the coding block and its content type.

As shown in Table 1, a 2N×2N camera-captured block uses the Skip mode; an N×N, N/2×N/2 or N/4×N/4 camera-captured block uses the Intra mode (conventional intra prediction mode);

a 2N×2N computer-generated text block uses the Skip mode; an N×N, N/2×N/2 or N/4×N/4 computer-generated text block uses the Palette mode (palette mode);

a 2N×2N computer-generated non-text block uses the Intra mode; an N×N computer-generated non-text block uses the Intra mode or the Palette mode; an N/2×N/2 or N/4×N/4 computer-generated non-text block uses the Intra mode, the Palette mode or the IBC mode (intra block copy mode);

a 2N×2N mixed content block uses the Skip mode.

Table 1. Correspondence of coding block size and content type with coding mode
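
The correspondence described above can be restated as a small lookup; in the sketch below the function name and label strings are assumptions, block_size is the block's side length in pixels, and 2N×2N corresponds to 64×64 when N = 32:

```python
from typing import List

def candidate_modes(block_size: int, content_type: str, N: int = 32) -> List[str]:
    """Candidate coding modes for a block, following the rules summarized in Table 1."""
    if block_size == 2 * N:                         # 2N x 2N blocks
        if content_type in ("camera", "cg_text", "mixed"):
            return ["Skip"]
        return ["Intra"]                            # computer-generated non-text
    if content_type == "camera":
        return ["Intra"]                            # N x N, N/2 x N/2, N/4 x N/4
    if content_type == "cg_text":
        return ["Palette"]
    # computer-generated non-text blocks
    if block_size == N:
        return ["Intra", "Palette"]
    return ["Intra", "Palette", "IBC"]              # N/2 x N/2 or N/4 x N/4
```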

In addition, the performance of the above scheme was tested; the experimental results are shown in Table 2.

Table 2. Performance comparison between the coding mode selection scheme of the present invention and SCM-6.0

In Table 2, ΔBD-rate is the average BD-rate increase over the Y/U/V channels. As can be seen from Table 2, under the standard test environment (Common Test Conditions), the coding mode selection scheme of the present invention reduces the encoding time by 40.1% compared with SCM-6.0, while incurring only a 1.3% compression performance loss.

2. Rate control.

In the embodiment of the present invention, rate control is achieved according to the content type of the coding block, combined with the compression quality requirements for different content in different applications, by adjusting the quantization parameter of the coding block and skipping time-consuming modes, which can improve compression efficiency while reducing encoder complexity.
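
As an illustration of such content-adaptive rate control (the patent does not give concrete parameter values), the sketch below adjusts the quantization parameter and skips modes per content type; the QP offsets and the skipped-mode list are placeholder assumptions that an application would set according to its own quality requirements:

```python
def rate_control_decision(base_qp: int, content_type: str) -> dict:
    """Content-adaptive QP adjustment and mode skipping (placeholder values)."""
    qp_offset = {
        "cg_text": -2,       # e.g. protect text quality with a lower QP
        "cg_non_text": 0,
        "camera": 2,         # e.g. tolerate more distortion in natural content
        "mixed": 0,
    }.get(content_type, 0)
    # e.g. skip screen-content tools that rarely help camera-captured blocks
    skip_modes = ["Palette", "IBC"] if content_type == "camera" else []
    return {"qp": base_qp + qp_offset, "skip_modes": skip_modes}
```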

In the above scheme of the embodiment of the present invention, the content type of each coding unit is predicted according to a pre-trained convolutional neural network model, and the prediction result has high accuracy. In addition, by using the prediction result as a pre-processing method, the scheme can easily be combined with fast mode decision and rate control modules to guide coding mode selection and rate control, thereby reducing redundant computation and improving compression quality.

Through the above description of the embodiments, those skilled in the art can clearly understand that the above embodiments can be implemented by software, or by software plus a necessary general hardware platform. Based on this understanding, the technical solution of the above embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the method described in each embodiment of the present invention.

The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that can readily occur to those skilled in the art within the technical scope disclosed by the present invention shall be included within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be defined by the protection scope of the claims.

Claims (6)

  1. A block content classification method based on convolutional neural networks, characterized by comprising:
    building a data set, and using the content type as the label of each training sample;
    building a convolutional neural network; converting each training sample into a grayscale image, representing each pixel of the grayscale image with an eight-bit binary number, and extracting the last bit of each pixel as the input of the convolutional neural network; obtaining a last-bit convolutional neural network model through training;
    when predicting an input N×N coding block, first using the last-bit convolutional neural network model to predict the content type of the current coding block; if the output is a camera-captured block, obtaining the classification result; if the output is a computer-generated block, continuing to use the last-bit convolutional neural network model for prediction to obtain the classification result of a computer-generated text block or a computer-generated non-text block.
  2. The block content classification method based on convolutional neural networks according to claim 1, characterized in that using the last-bit convolutional neural network model to predict the content type of the current coding block comprises:
    converting the current coding block into a grayscale image, representing each pixel of the grayscale image with an eight-bit binary number, and extracting the last bit of each pixel to obtain the corresponding last-bit map; then using the last-bit convolutional neural network model to predict the content type from the last-bit map;
    if the output is a camera-captured block, obtaining the classification result and terminating the process;
    if the output is a computer-generated block, extracting the corresponding grayscale block from the grayscale image according to the position information of the computer-generated block, and then using the last-bit convolutional neural network model to predict the content type of the grayscale block; the output classification result is a computer-generated text block or a computer-generated non-text block.
  3. The block content classification method based on convolutional neural networks according to claim 1, characterized in that, for an input coding block that is not N×N:
    if the size is larger than N×N, the prediction is made according to the content types of all the N×N coding blocks contained inside it; if the content types of all the N×N coding blocks are the same, the corresponding content type is taken as the content type of the input non-N×N coding block; otherwise, the content type of the input non-N×N coding block is marked as a mixed content block;
    if the size is smaller than N×N, its content type is considered to be the same as the content type of the N×N coding block where it is located.
  4. The block content classification method based on convolutional neural networks according to claim 1, characterized by further comprising, after performing content classification on the input coding block: performing coding mode selection according to the size of the coding block and its content type.
  5. The block content classification method based on convolutional neural networks according to claim 4, characterized in that:
    a 2N×2N camera-captured block uses the Skip mode; an N×N, N/2×N/2 or N/4×N/4 camera-captured block uses the Intra mode;
    a 2N×2N computer-generated text block uses the Skip mode; an N×N, N/2×N/2 or N/4×N/4 computer-generated text block uses the Palette mode;
    a 2N×2N computer-generated non-text block uses the Intra mode; an N×N computer-generated non-text block uses the Intra mode or the Palette mode; an N/2×N/2 or N/4×N/4 computer-generated non-text block uses the Intra mode, the Palette mode or the IBC mode;
    a 2N×2N mixed content block uses the Skip mode.
  6. The block content classification method based on convolutional neural networks according to claim 1, characterized by further comprising, after performing content classification on the input coding block: according to the content type of the coding block and the compression quality requirements for different content in different applications, achieving rate control by adjusting the quantization parameter of the coding block and skipping time-consuming modes.
CN201711049706.7A 2017-10-31 2017-10-31 Block content categorizing method based on convolutional neural networks CN107666612A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711049706.7A CN107666612A (en) 2017-10-31 2017-10-31 Block content categorizing method based on convolutional neural networks

Publications (1)

Publication Number Publication Date
CN107666612A true CN107666612A (en) 2018-02-06

Family

ID=61143708

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711049706.7A CN107666612A (en) 2017-10-31 2017-10-31 Block content categorizing method based on convolutional neural networks

Country Status (1)

Country Link
CN (1) CN107666612A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228142A (en) * 2016-07-29 2016-12-14 Xidian University (西安电子科技大学) Face verification method based on convolutional neural networks and Bayesian decision
US10032070B1 (en) * 2016-09-19 2018-07-24 King Fahd University Of Petroleum And Minerals Method for identifying an individual by walking style
CN107016405A (en) * 2017-02-24 2017-08-04 Hefei Institutes of Physical Science, Chinese Academy of Sciences (中国科学院合肥物质科学研究院) Insect image classification method based on classification prediction convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ye Shurui (叶淑睿): "Research on Encoding and Decoding Technology Optimization of HEVC SCC" (HEVC SCC的编解码技术优化研究), Wanfang Data Knowledge Service Platform (万方数据知识服务平台) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination