CN111259826B - Method, system and storage medium for fast dividing image characteristic information frame - Google Patents


Info

Publication number
CN111259826B
Authority
CN
China
Prior art keywords: classifier, matrix, vertical, acquiring, training
Prior art date
Legal status
Active
Application number
CN202010061114.2A
Other languages
Chinese (zh)
Other versions
CN111259826A
Inventor
梁凡
李正仁
Current Assignee
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Application filed by Sun Yat Sen University
Priority to CN202010061114.2A
Publication of CN111259826A
Application granted
Publication of CN111259826B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/40 Analysis of texture
    • G06T7/41 Analysis of texture based on statistical description of texture
    • G06T7/45 Analysis of texture based on statistical description of texture using co-occurrence matrix computation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/11 Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding


Abstract

The invention discloses a method, a system and a storage medium for fast intra-frame partitioning based on image feature information, wherein the method comprises the following steps: acquiring a training video sequence, and extracting features from the video sequence to obtain training data; training a support vector machine with the training data to obtain classifiers; filtering the mode list with the trained classifiers to obtain a filtered mode list; and skipping the trial process of the corresponding coding modes according to the filtered mode list while performing the coding process. After the image feature information is extracted, classifiers are obtained through training and used to judge the image to be processed, skipping the more time-consuming coding-mode trials and thereby shortening the time spent on video encoding. The invention can be widely applied in the technical fields of machine vision and pattern recognition.

Description

Method, system and storage medium for fast dividing image characteristic information frame
Technical Field
The invention relates to the technical field of machine vision and pattern recognition, in particular to a method, a system and a storage medium for fast dividing image characteristic information frames.
Background
The Versatile Video Coding standard (VVC) is a new-generation coding standard proposed in 2018 by the Joint Video Experts Team (JVET) working group, following the previous-generation standard High Efficiency Video Coding (HEVC). Like its predecessor, VVC adopts a hybrid coding framework and introduces technologies such as QTMT, ATMVP and PDPC, greatly improving compression capability over HEVC. Similar to HEVC, each image is divided into a number of Coding Tree Units (CTUs), which are then recursively divided into Coding Units (CUs) for coding, so as to adapt to the different texture features of different images and achieve more effective compression. Unlike HEVC, where each CTU is recursively divided into CUs only by quadtree partitioning, VVC employs quadtree-plus-multi-type-tree partitioning: each CTU is first quadtree-partitioned, and the leaf nodes of the quadtree are then further partitioned by binary trees, ternary trees or quadtrees.
For each image, VVC attempts different modes and partitions, trying the available modes one by one to determine the best; the whole process is therefore accompanied by a large number of unnecessary attempts, which makes it very time-consuming and has a large impact on encoding time. This great increase in coding time hinders the application and popularization of the technology, so a technical method capable of shortening the coding time is needed.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method, a system and a storage medium for fast dividing an image feature information frame.
A method for fast dividing image characteristic information frames comprises the following steps:
acquiring a training video sequence, and extracting characteristics from the video sequence to acquire training data;
training a support vector machine through the training data to obtain a classifier;
filtering the mode list through the trained classifier to obtain a skip mode list;
and according to the skip mode list, skipping the trial process of the coding mode and carrying out the coding process.
Further, the step of acquiring the training video sequence and extracting features from the video sequence to obtain training data includes the following steps:
acquiring the gray-level co-occurrence matrix of the feature data;
acquiring the gradient of the feature data and the sum of its absolute values through the Sobel operator;
and training a support vector machine with the gray-level co-occurrence matrix and the gradient absolute-value sums to obtain classifiers.
Further, the classifier comprises a quad tree structure classifier, a vertical structure classifier, a binary tree structure classifier and an intra-frame mode classifier.
Further, the step of training a support vector machine to obtain a classifier through the training data further includes the following steps:
performing convolution calculation processing on the pixel matrix through a Sobel operator template to obtain a first-order horizontal gradient matrix and a first-order vertical gradient matrix of the image;
acquiring vertical matrix energy values and horizontal matrix energy values corresponding to vertical and horizontal gray level co-occurrence matrixes, vertical matrix contrast and horizontal matrix contrast, vertical matrix entropy and horizontal matrix entropy, and vertical matrix inverse difference and horizontal matrix inverse difference;
obtaining the quadtree partition depth of the left, upper and upper left coded blocks of the current partition block to be coded;
acquiring the first pixel mean-value differences, namely the difference between the pixel means of the left and right regions of the current block to be coded and the difference between the pixel means of its upper and lower regions;
and acquiring second pixel mean value differences of different divided areas of the current divided block to be coded under the condition of the ternary tree division.
And training the support vector machine to obtain the classifier by taking the vertical matrix energy value and the horizontal matrix energy value corresponding to the vertical and horizontal gray level co-occurrence matrix, the vertical matrix contrast and the horizontal matrix contrast, the vertical matrix entropy and the horizontal matrix entropy, the vertical matrix inverse difference and the horizontal matrix inverse difference, the quadtree division depth, the first pixel average value difference and the second pixel average value difference as input data.
Further, the step of filtering the pattern list through the trained classifier to obtain the skip pattern list includes the following steps:
obtaining a first probability value divided into a quad-tree structure through a quad-tree structure classifier, and if the first probability value is greater than a first preset probability threshold of the quad-tree, directly dividing the quad-tree; if the first probability value is lower than a second preset probability threshold of the quadtree, forbidding the attempt process of the quadtree division;
acquiring a second probability value divided into vertical structures through a vertical structure classifier, and if the second probability value is greater than a first preset probability threshold value of the vertical structure, forbidding an attempt process of horizontal division; if the second probability value is smaller than a second preset probability threshold value of the vertical structure, the attempt process of vertical division is forbidden;
and acquiring a third probability value of binary-tree partitioning through the binary tree structure classifier; if the third probability value is greater than the first preset probability threshold of the binary tree, prohibiting the trial of ternary-tree partitioning; otherwise, if the third probability value is less than the second preset probability threshold of the binary tree, prohibiting the trial of binary-tree partitioning.
Further, the step of skipping the trial process of the coding modes according to the skip mode list and performing the coding process further includes the steps of:
acquiring a mode list, and entering a division trial process of list options;
skipping the prohibited partition attempting procedure according to the skip mode list;
and entering an intra-frame decision process, acquiring an intra-frame mode and coding.
Further, the step of entering an intra-frame decision process to acquire an intra-frame mode and perform encoding further includes the steps of:
starting an intra-frame decision process, loading an intra-frame mode classifier, and performing a first round of intra-frame mode selection;
acquiring, through the intra-frame mode classifier, a fourth probability value of skipping the current mode; modes whose fourth probability value is smaller than the preset third probability threshold enter a second round of intra-frame mode selection;
and coding according to the selected intra-frame mode and a preset coding flow.
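The two-round selection just described can be sketched as follows. The function name `intra_mode_selection`, the `skip_prob` and `cost` callables, the threshold default and the fallback behaviour are all illustrative assumptions, not part of the patent or any reference software.

```python
def intra_mode_selection(candidate_modes, skip_prob, cost, threshold=0.88):
    """Two-round intra mode selection sketch (illustrative).

    Round 1: keep only modes whose classifier skip-probability is
    below the preset threshold.
    Round 2: pick the cheapest surviving mode by its cost value.

    skip_prob(m) -- assumed callable returning the classifier's
                    probability that mode m can be skipped
    cost(m)      -- assumed callable returning the coding cost of m
    """
    survivors = [m for m in candidate_modes if skip_prob(m) < threshold]
    if not survivors:  # assumed fallback: never leave zero candidates
        survivors = list(candidate_modes)
    return min(survivors, key=cost)
```

A usage example: with three candidate modes where mode 0 is very likely skippable, only modes 1 and 2 reach the second round, and the cheaper one wins.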
The invention also provides a system for quickly dividing the image characteristic information frame, which comprises the following steps:
at least one processor;
at least one memory for storing at least one program;
when executed by the at least one processor, the at least one program causes the at least one processor to implement a method for fast intra frame partitioning of image feature information as described above.
The invention also provides a system for fast dividing the image characteristic information frame, which comprises:
the preprocessing module is used for acquiring a training video sequence and extracting characteristics of the video sequence to acquire training data;
the classifier training module is used for training a support vector machine through the training data to obtain a classifier;
the classifier dividing module is used for filtering the mode list through the trained classifier to obtain a skip mode list;
and the coding module is used for skipping the trial process of the coding mode according to the skipping mode list and carrying out a coding process.
The present invention also proposes a storage medium having stored therein processor-executable instructions for performing a method for fast partitioning within an image feature information frame as described above when executed by a processor.
One or more of the above-described embodiments of the present invention have the following advantages:
the invention adopts a method for quickly dividing the image characteristic information frame, the method obtains a classifier by training after extracting the image characteristic information, judges the image to be processed by using the classifier, and skips the trial process of a coding mode which takes longer time, thereby shortening the time spent in the process of video coding. The method has a remarkable promoting effect on the application and popularization of the video coding standard, particularly the VVC coding standard, which conforms to the technology of the method.
Drawings
FIG. 1 is a flow chart of a method for fast intra-frame division of image feature information according to the present invention;
FIG. 2 is a schematic diagram of the block to be partitioned and its peripheral blocks for the quadtree structure classifier of the present invention;
FIG. 3 is a schematic diagram of the block to be partitioned and its peripheral blocks for the vertical structure classifier;
FIG. 4 is a schematic diagram of the block to be partitioned and its peripheral blocks for the binary tree structure classifier of the present invention;
FIG. 5 is a flowchart detailing the steps of filtering the pattern list by the trained classifier to obtain a filtered pattern list according to the present invention;
FIG. 6 is a flowchart detailing the steps of skipping the trial process of the coding modes according to the skip mode list and performing the coding process according to the present invention;
fig. 7 is a flowchart illustrating the detailed steps of the present invention for entering the intra decision process by dividing according to the current option.
Detailed Description
The technical scheme of the invention is described in detail in the following with reference to the accompanying drawings.
As shown in fig. 1, the method for fast dividing image characteristic information frames of the present invention includes the following steps:
S1: acquiring a training video sequence, and extracting features from the video sequence to obtain training data;
S2: training a support vector machine with the training data to obtain classifiers;
S3: filtering the mode list with the trained classifiers to obtain a skip mode list;
S4: skipping the trial process of the coding modes according to the skip mode list and performing the coding process.
In step S1, a training video sequence is obtained, and features are extracted from the video sequence to obtain training data. The training video sequence comprises video sequences for training the model and video sequences for testing the model results. In one embodiment of the invention, feature acquisition is carried out with the VVC reference software VTM, the standard reference implementation of VVC.
In step S2, a support vector machine is trained with the training data to obtain classifiers. The extracted feature data used for training need to be labeled according to the block-partition results, to facilitate subsequent training.
In the invention, a gray-level co-occurrence matrix is adopted to describe texture. The gray-level co-occurrence matrix describes texture by studying the spatial correlation of gray levels: it is obtained by counting, for pairs of pixels at a set offset in the image, which gray levels the two pixels take. For example, take an arbitrary point (x, y) in an N x N image and another point (x + a, y + b) offset from it by a and b, forming the point pair (g1, g2) of their gray levels. Moving (x, y) over the whole picture yields various combinations of (g1, g2) values; the number of occurrences of each combination is counted and arranged into a square matrix whose size is the number of gray levels, and the count of each gray pair is normalized into its probability of occurrence P(g1, g2). The resulting matrix is the gray-level co-occurrence matrix.
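The counting-and-normalizing construction above can be sketched in plain Python; the function name `glcm` and its argument layout are illustrative assumptions, not taken from the patent or the VTM software.

```python
def glcm(image, dx, dy, levels):
    """Build a normalized gray-level co-occurrence matrix.

    image  -- 2-D list of integer gray levels in [0, levels)
    dx, dy -- offset (a, b) between the two pixels of each pair
    """
    h, w = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    # Normalize each occurrence count into the probability P(g1, g2).
    return [[c / total for c in row] for row in counts]
```

With offset (1, 0) on a tiny two-level image, only the horizontally adjacent pairs are counted and then normalized.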
When the offset takes the values (1, 0) and (0, 1), the resulting gray-level co-occurrence matrices reflect the texture distribution characteristics of the image in the horizontal and vertical directions, respectively. The gray-level co-occurrence matrix is summarized by scalar descriptors such as energy, contrast, entropy and inverse difference moment.
Among the descriptors of the gray-level co-occurrence matrix, the energy (Angular Second Moment) is expressed with the following formula:
ASM = \sum_{i=1}^{k} \sum_{j=1}^{k} G(i,j)^{2}
where ASM denotes energy, k denotes the number of row elements or column elements of the gray level co-occurrence matrix (i.e., the number of gray levels of the image to be processed), and G (i, j) denotes the normalized frequency of occurrence of each gray level pair. The energy may reflect the uniformity of the image gray scale distribution and the texture coarseness.
In the description scalar of the gray level co-occurrence matrix, the Contrast (Contrast) is expressed using the following formula:
CON = \sum_{i=1}^{k} \sum_{j=1}^{k} (i-j)^{2} \, G(i,j)
where CON represents contrast, and k and G(i, j) have the same meanings as in the energy expression. The contrast reflects the contrast between the brightness of a pixel value and that of the surrounding pixel values.
In the description scalar of the gray level co-occurrence matrix, entropy (Entropy) is expressed using the following formula:
ENT = -\sum_{i=1}^{k} \sum_{j=1}^{k} G(i,j) \log G(i,j)
where ENT denotes entropy, and k and G(i, j) have the same meanings as in the energy expression. Entropy reflects the degree of non-uniformity or complexity of the texture in the image.
In the descriptors of the gray-level co-occurrence matrix, the inverse difference moment (Inverse Difference Moment) is expressed using the following formula:
IDM = \sum_{i=1}^{k} \sum_{j=1}^{k} \frac{G(i,j)}{1+(i-j)^{2}}
where IDM represents the inverse difference moment, and k and G(i, j) have the same meanings as in the energy expression. The inverse difference moment reflects the homogeneity of the image texture.
Using the above formulas with the offset values (1, 0) and (0, 1), the energies ASM_v and ASM_h, contrasts CON_v and CON_h, entropies ENT_v and ENT_h, and inverse difference moments IDM_v and IDM_h of the gray-level co-occurrence matrices in the vertical and horizontal directions are obtained.
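Given a normalized co-occurrence matrix, the four descriptors can be computed as in the following sketch; the function name and return layout are illustrative, but each term follows the formulas above.

```python
import math

def glcm_features(G):
    """Energy (ASM), contrast, entropy and inverse difference moment
    of a normalized gray-level co-occurrence matrix G (k x k)."""
    k = len(G)
    asm = sum(G[i][j] ** 2 for i in range(k) for j in range(k))
    con = sum((i - j) ** 2 * G[i][j] for i in range(k) for j in range(k))
    # Zero entries contribute nothing to the entropy sum.
    ent = -sum(G[i][j] * math.log(G[i][j])
               for i in range(k) for j in range(k) if G[i][j] > 0)
    idm = sum(G[i][j] / (1 + (i - j) ** 2)
              for i in range(k) for j in range(k))
    return asm, con, ent, idm
```

For a diagonal matrix the contrast vanishes (all mass on i = j) and the inverse difference moment is maximal.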
In this step, the feature data of the image are obtained using the gray-level co-occurrence matrix construction and the descriptors described above. Meanwhile, a first-order horizontal gradient matrix G_x and a first-order vertical gradient matrix G_y of the image are obtained with the Sobel operator. The Sobel operator is a common method for acquiring the first-order gradient of a digital image; its calculation formula is:
\Delta_x f(x,y) = [f(x-1,y+1)+2f(x,y+1)+f(x+1,y+1)] - [f(x-1,y-1)+2f(x,y-1)+f(x+1,y-1)]
\Delta_y f(x,y) = [f(x-1,y-1)+2f(x-1,y)+f(x-1,y+1)] - [f(x+1,y-1)+2f(x+1,y)+f(x+1,y+1)]
where \Delta_x f(x,y) denotes the horizontal gradient and \Delta_y f(x,y) the vertical gradient at the point of the image whose pixel value at position (x, y) is f(x, y).
In actual calculation, a specific gradient matrix is obtained by adopting the convolution of a Sobel operator template and a pixel matrix as follows:
G_x = \begin{pmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{pmatrix} * P
G_y = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{pmatrix} * P
wherein G_x denotes the first-order horizontal gradient matrix of the image, G_y denotes the first-order vertical gradient matrix, and P is the image to be processed. Before the subsequent model training, the sum of the absolute values of the entries of each of the two matrices is obtained, with the formulas:
|G_x| = \sum_{i=1}^{w} \sum_{j=1}^{h} |G_x(i,j)|
|G_y| = \sum_{i=1}^{w} \sum_{j=1}^{h} |G_y(i,j)|
where |G_x| is the sum of the absolute values of the entries of G_x, |G_y| is the sum of the absolute values of the entries of G_y, w is the width of the image, and h is the height of the image.
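A minimal sketch of computing |G_x| and |G_y| with the standard Sobel templates follows; skipping the border pixels is a simplifying assumption not stated in the patent, and the function name is illustrative.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_abs_sums(P):
    """Return (|G_x|, |G_y|): the sums of absolute first-order
    gradients over the interior of pixel matrix P (borders skipped)."""
    h, w = len(P), len(P[0])
    gx_sum = gy_sum = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Convolve the 3x3 templates with the neighbourhood of (x, y).
            gx = sum(SOBEL_X[j][i] * P[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * P[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gx_sum += abs(gx)
            gy_sum += abs(gy)
    return gx_sum, gy_sum
```

On an image with a single vertical edge, the horizontal-gradient sum is nonzero while the vertical one vanishes, which is exactly the directional cue the classifiers exploit.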
After the required |G_x|, |G_y| and the gray-level co-occurrence matrices of the image are obtained, the classifiers are trained. Each classifier is realized with a support vector machine. The support vector machine is a supervised learning model mainly aimed at linearly separable cases; for linearly inseparable samples in a low-dimensional space, it uses a nonlinear mapping to convert them into linearly separable samples in a higher-dimensional feature space for analysis. In short, it repeatedly fits two classes of data and finds a separating hyperplane, called the decision surface, to classify them; the learning result of the support vector machine is a global optimum.
The classifier comprises a quadtree structure classifier, a vertical structure classifier, a binary tree structure classifier and an intra-frame mode classifier.
Among the four classifiers, the quadtree structure classifier identifies whether an attempt at quadtree partitioning is necessary for the current block. Its classification features are: |G_x|, |G_y| and their sum |G_x| + |G_y|, previously calculated by the Sobel operator; the energies ASM_v and ASM_h, contrasts CON_v and CON_h, entropies ENT_v and ENT_h and inverse difference moments IDM_v and IDM_h of the gray-level co-occurrence matrices calculated with offsets (1, 0) and (0, 1); and the quadtree partition depths of the already-coded blocks to the left, above and above-left of the current block to be partitioned. As shown in FIG. 2, the quadtree partition depths of the three already-coded regions A, B and C beside the current block to be coded are obtained.
Among the four classifiers, the vertical structure classifier identifies whether the partition direction is vertical when performing binary-tree and ternary-tree partitioning. Its classification features are similar to those of the quadtree structure classifier, using |G_x| and |G_y| calculated by the Sobel operator and their difference |G_x| - |G_y|, together with the differences of energies ASM_v - ASM_h, contrasts CON_v - CON_h, entropies ENT_v - ENT_h and inverse difference moments IDM_v - IDM_h. The difference is that, as shown in FIG. 3, the vertical structure classifier additionally needs the difference GD_x between the pixel means of the left and right regions A and B of the current block to be coded, and the difference GD_y between the pixel means of the upper and lower regions C and D, obtained as follows:
GD_x = \overline{P}_A - \overline{P}_B
GD_y = \overline{P}_C - \overline{P}_D
where \overline{P}_X denotes the mean pixel value over region X of the current block to be coded.
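The two mean differences can be sketched as below, assuming the block is simply split into equal left/right and upper/lower halves; the exact region geometry comes from FIG. 3, so this layout and the function name are illustrative assumptions.

```python
def region_mean_diffs(P):
    """GD_x: mean of the left half minus mean of the right half;
    GD_y: mean of the top half minus mean of the bottom half.
    Assumes P is a rectangular pixel matrix with even dimensions."""
    h, w = len(P), len(P[0])

    def mean(rows, cols):
        return sum(P[y][x] for y in rows for x in cols) / (len(rows) * len(cols))

    gd_x = mean(range(h), range(w // 2)) - mean(range(h), range(w // 2, w))
    gd_y = mean(range(h // 2), range(w)) - mean(range(h // 2, h), range(w))
    return gd_x, gd_y
```

A block whose right half is uniformly brighter gives a large |GD_x| and zero GD_y, hinting at a vertical partition.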
Among the four classifiers, the binary tree classifier identifies, on the basis of the vertical structure classifier, whether binary-tree partitioning is performed. Its classification features are: the quadtree partition depths of the coded blocks to the left, above and above-left of the current block to be partitioned, as shown in FIG. 2, i.e., the quadtree partition depths of the three coded regions A, B and C beside the current block to be coded; and, under ternary-tree partitioning of the current block to be coded, as shown in FIG. 4, the pixel mean difference GD_AB of regions A and B, the pixel mean difference GD_BC of regions B and C, the pixel mean difference GD_DE of regions D and E, and the pixel mean difference GD_EF of regions E and F, calculated as follows:
GD_{AB} = \overline{P}_A - \overline{P}_B
GD_{BC} = \overline{P}_B - \overline{P}_C
GD_{DE} = \overline{P}_D - \overline{P}_E
GD_{EF} = \overline{P}_E - \overline{P}_F
where \overline{P}_X denotes the mean pixel value over region X, the regions being those shown in FIG. 4.
Among the four classifiers, the intra mode classifier is used in the second filtering pass of the intra decision process to judge whether a mode can be skipped. Its classification features are similar to those of the quadtree structure classifier, using |G_x|, |G_y| and their sum |G_x| + |G_y| calculated by the Sobel operator, together with the sums of energies ASM_v + ASM_h, contrasts CON_v + CON_h, entropies ENT_v + ENT_h and inverse difference moments IDM_v + IDM_h.
Corresponding training data are extracted according to the classification features adopted by each classifier, and all data are randomly divided into disjoint training and test sets. The training data need to be preprocessed to discard part of the samples so that the numbers of samples of the two classification outcomes stay equal; for example, for the samples of the quadtree structure classifier, the number of samples that undergo quadtree partitioning is kept equal to the number that do not.
After the training data are preprocessed, the training data corresponding to each classifier are fed into a training program written according to the support-vector-machine principle. For linearly inseparable problems, a kernel function is used to map the data into a high-dimensional space, converting the task into a linearly separable one in that space, where classification and training are then carried out. Based on practical experience, a Gaussian kernel function is adopted for this dimension-raising operation.
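The Gaussian (RBF) kernel used for the dimension-raising step can be written as below; in practice an SVM library with an RBF kernel would be trained on the feature vectors, and the function name and `gamma` default here are illustrative assumptions.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel k(a, b) = exp(-gamma * ||a - b||^2).

    It implicitly lifts the feature vectors into a high-dimensional
    space where the two classes can become linearly separable."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)
```

The kernel of a vector with itself is always 1, and it decays toward 0 as two feature vectors move apart; `gamma` controls how fast.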
After training is finished, the classification performance of the trained model is tested with the previously partitioned test data, and the training parameters are adjusted according to the test results until the training results meet the preset requirements.
After the support vector machine is trained with the above method and feature data, the classifiers are obtained, and the trained classifiers are stored.
In step S3, the trained classifiers filter the mode list to obtain the skip mode list. Referring to FIG. 5, the filtering process includes the following steps:
S301: initialize and load the trained classifiers, including the quadtree structure classifier, the vertical structure classifier and the binary tree structure classifier.
S302: use the quadtree structure classifier to judge quadtree partitioning and obtain the quadtree partition probability. If this probability is greater than the first preset probability threshold of the quadtree, go directly to step S305; if it is smaller than the second preset probability threshold of the quadtree, prohibit the trial of quadtree partitioning.
S303: use the vertical structure classifier to judge vertical partitioning and obtain the vertical partition probability. If this probability is greater than the first preset probability threshold of the vertical structure, prohibit horizontal partitioning; if it is smaller than the second preset probability threshold of the vertical structure, prohibit vertical partitioning.
S304: use the binary tree structure classifier to judge binary-tree partitioning and obtain the binary-tree partition probability. If this probability is greater than the first preset probability threshold of the binary tree structure, prohibit ternary-tree partitioning; if it is smaller than the second preset probability threshold of the binary tree structure, prohibit binary-tree partitioning.
S305: add the judgment results of the classifiers to the skip list and enter the partitioning stage.
In the filtering process, the outputs of the quadtree structure classifier, the vertical structure classifier and the binary tree structure classifier are all probability values P that the image should be partitioned into a given class. To keep the partition classification as accurate as possible, the classification probability thresholds were determined through repeated empirical and statistical tests; for example, the first preset quadtree probability threshold is 0.88 and the second preset quadtree probability threshold is 0.12. These two quadtree thresholds also serve as reference values for the other preset probability thresholds.
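The threshold logic of steps S302 to S305 can be sketched as follows; the function name and the stand-in probability inputs are hypothetical, and the 0.88/0.12 thresholds are the example values given above.

```python
# Minimal sketch of the filtering logic in steps S302-S305. The classifier
# probability arguments stand in for the trained SVM outputs.
def build_skip_list(p_quad, p_vert, p_btree, hi=0.88, lo=0.12):
    """Return (skip_modes, force_quadtree) per the S302-S304 rules."""
    skip = set()
    force_quadtree = False
    if p_quad > hi:            # S302: quadtree very likely -> straight to S305
        force_quadtree = True
    elif p_quad < lo:          # quadtree very unlikely -> prohibit its attempt
        skip.add("quadtree")
    if p_vert > hi:            # S303: vertical very likely -> skip horizontal
        skip.add("horizontal")
    elif p_vert < lo:          # vertical very unlikely -> skip vertical
        skip.add("vertical")
    if p_btree > hi:           # S304: binary tree very likely -> skip ternary
        skip.add("ternary")
    elif p_btree < lo:         # binary tree very unlikely -> skip binary
        skip.add("binary")
    return skip, force_quadtree

skip, force = build_skip_list(0.95, 0.05, 0.50)
print(sorted(skip), force)  # ['vertical'] True
```

Probabilities between the two thresholds leave the corresponding partition modes untouched, so the encoder still tries them normally.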
In step S4, the trial processes of the coding modes in the skip mode list are skipped and the coding process is carried out. Referring to fig. 6 and 7, this step mainly includes the following steps:
S401: judge whether all partition options in the current mode list have been tried; if so, proceed to step S404, otherwise proceed to step S402.
S402: judge whether the current partition option is in the skip list, i.e. a prohibited partition; if so, return to step S401, otherwise proceed to step S403.
S403: partition according to the current option, enter the intra decision process, and recursively return to step S401.
S403-1: load the intra mode classifier;
S403-2: perform the first round of intra mode selection;
S403-3: judge the acquired image features with the intra mode classifier; when the probability of skipping the second round of screening is greater than the preset fourth probability value, skip the second-round screening process and proceed to step S403-5; otherwise proceed to step S403-4;
S403-4: perform the second round of intra mode selection;
S403-5: end the intra decision process.
S404: end the partition and intra decision process.
The fourth probability value, determined through repeated statistics and empirical calculation, is 0.88.
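Steps S401 to S404 can be illustrated with the following simplified loop; all names are hypothetical, and the actual partitioning and rate-distortion machinery of the encoder is reduced to placeholders.

```python
# Illustrative walk through steps S401-S404: iterate the mode list, skip
# prohibited options from the skip list, and optionally skip the second
# round of intra mode selection when the classifier is confident enough.
FOURTH_PROB = 0.88  # threshold value stated in the text

def try_partitions(mode_list, skip_list, skip_second_round_prob):
    tried = []
    for option in mode_list:                 # S401: loop until all options tried
        if option in skip_list:              # S402: prohibited partition
            continue
        tried.append(option)                 # S403: partition + intra decision
        if skip_second_round_prob > FOURTH_PROB:
            pass                             # S403-3: skip second-round screening
        else:
            pass                             # S403-4: second round of selection
    return tried                             # S404: process ends

result = try_partitions(
    ["quadtree", "vertical", "horizontal", "binary", "ternary"],
    {"horizontal", "ternary"}, 0.90)
print(result)  # ['quadtree', 'vertical', 'binary']
```

Each surviving option would recursively re-enter the same loop on its sub-blocks in a real encoder; that recursion is omitted here.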
In order to implement the above method, the invention also provides a system for fast intra partitioning of image characteristic information, which comprises:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method for fast intra partitioning of image characteristic information described above.
In order to implement the above method, the invention further provides another system for fast intra partitioning of image characteristic information, which comprises:
the preprocessing module is used for acquiring a training video sequence and extracting characteristics of the video sequence to acquire training data;
the classifier training module is used for training a support vector machine through the training data to obtain a classifier;
the classifier dividing module is used for filtering the mode list through the trained classifier to obtain a skip mode list;
and the coding module is used for skipping the trial process of the coding mode according to the skipping mode list and carrying out a coding process.
In order to implement the above method, the invention further proposes a storage medium storing processor-executable instructions which, when executed by a processor, perform the method for fast intra partitioning of image characteristic information described above.
In summary, compared with the prior art, the invention has the following advantages:
(1) After the image characteristic information is extracted, a classifier is obtained through training and used to judge the image to be processed, skipping the more time-consuming coding mode trials and thereby shortening the time spent on video coding.
(2) The invention records indexes from three aspects, namely the gray-level co-occurrence matrix and its scalar features, the gradient obtained by the Sobel operator, and the partition depths of the blocks surrounding the current block to be partitioned, to train the support vector machine and obtain the partition result, improving the reliability of partitioning.
(3) In the intra mode selection process, the invention adopts a higher threshold to judge the selection result, reducing partition errors and partition time losses caused by misjudgment.
(4) The invention obtains the classifier by training a support vector machine, so the learning result can reach the global optimum, and the expectation over the whole sample space satisfies a certain upper bound with a certain probability.
(5) The invention adds classifiers to the partitioning process and the intra decision process to judge the partition mode and whether to skip the current partition mode, improving the fault tolerance and reliability of the process.
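As an illustration of the feature extraction summarized in advantage (2), the following NumPy-only sketch computes a gray-level co-occurrence matrix with the four scalar features named in the claims (energy, contrast, entropy, inverse difference) and Sobel gradient sums for one block; the function names and parameters are illustrative, not the patent's implementation.

```python
# Rough sketch of the GLCM + Sobel feature extraction described above.
import numpy as np

def glcm_features(block, levels=8, dx=1, dy=0):
    """GLCM for one offset, plus the four scalar features used in the text."""
    q = (block.astype(np.float64) / 256 * levels).astype(int)  # quantize levels
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    for i in range(h - dy):
        for j in range(w - dx):
            glcm[q[i, j], q[i + dy, j + dx]] += 1
    p = glcm / glcm.sum()                      # normalized co-occurrence matrix
    idx = np.arange(levels)
    d = idx[:, None] - idx[None, :]            # gray-level differences
    energy = np.sum(p ** 2)
    contrast = np.sum(p * d ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    inv_diff = np.sum(p / (1.0 + np.abs(d)))   # inverse difference (homogeneity)
    return energy, contrast, entropy, inv_diff

def sobel_gradients(block):
    """First-order horizontal/vertical gradient magnitudes via Sobel templates."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(block.astype(float), 1, mode="edge")
    gx = np.zeros_like(block, dtype=float)
    gy = np.zeros_like(block, dtype=float)
    for i in range(block.shape[0]):
        for j in range(block.shape[1]):
            win = pad[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(win * kx)        # convolution with the templates
            gy[i, j] = np.sum(win * ky)
    return np.abs(gx).sum(), np.abs(gy).sum()

block = (np.arange(64).reshape(8, 8) * 4) % 256  # toy 8x8 luma block
print(glcm_features(block))
print(sobel_gradients(block))
```

A horizontal GLCM is shown; the vertical one follows by setting `dx=0, dy=1`, giving the paired vertical/horizontal matrix features listed in the claims.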
The step numbers in the above method embodiments are set for convenience of illustration only, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. A method for fast dividing image characteristic information frame is characterized by comprising the following steps:
acquiring a training video sequence, and extracting characteristics from the video sequence to acquire training data;
training a support vector machine through the training data to obtain a classifier;
filtering the mode list through the trained classifier to obtain a skip mode list;
according to the skipping mode list, skipping the trial process of the coding mode and carrying out the coding process;
the step of acquiring the training video sequence and extracting the characteristics of the video sequence to acquire the training data comprises the following steps:
acquiring a gray level co-occurrence matrix of the characteristic data;
acquiring the gradient and the absolute value of the characteristic data through a Sobel operator;
training a support vector machine by using the gray level co-occurrence matrix and the gradient and absolute value to obtain a classifier;
the classifier comprises a quadtree structure classifier, a vertical structure classifier, a binary tree structure classifier and an intra-frame mode classifier;
the step of training a support vector machine to obtain a classifier through the training data further comprises the following steps:
performing convolution calculation processing on the pixel matrix through a Sobel operator template to obtain a first-order horizontal gradient matrix and a first-order vertical gradient matrix of the image;
acquiring vertical matrix energy values and horizontal matrix energy values corresponding to vertical and horizontal gray level co-occurrence matrixes, vertical matrix contrast and horizontal matrix contrast, vertical matrix entropy and horizontal matrix entropy, and vertical matrix inverse difference and horizontal matrix inverse difference;
acquiring the quadtree partition depth of the left, upper and upper left coded blocks of the current to-be-coded partition block;
acquiring the difference of the pixel mean values of the left area and the right area of the current to-be-coded partition block and the first pixel mean value difference of the upper area and the lower area;
acquiring second pixel mean value differences of different division areas of a current division block to be coded under the condition of the ternary tree division;
taking a vertical matrix energy value and a horizontal matrix energy value corresponding to the vertical and horizontal gray level co-occurrence matrix, a vertical matrix contrast and a horizontal matrix contrast, a vertical matrix entropy and a horizontal matrix entropy, a vertical matrix inverse difference and a horizontal matrix inverse difference, a quadtree division depth, a first pixel mean value difference and a second pixel mean value difference as input data, and training the support vector machine to obtain a classifier;
the step of filtering the pattern list through the trained classifier to obtain the skip pattern list comprises the following steps:
obtaining a first probability value divided into a quad-tree structure through a quad-tree structure classifier, and if the first probability value is greater than a first preset probability threshold of the quad-tree, directly dividing the quad-tree; if the first probability value is lower than a second preset probability threshold of the quadtree, forbidding the attempt process of the quadtree division;
acquiring a second probability value divided into vertical structures through a vertical structure classifier, and if the second probability value is larger than a first preset probability threshold value of the vertical structure, forbidding an attempt process of horizontal division; if the second probability value is smaller than a second preset probability threshold value of the vertical structure, the attempt process of vertical division is forbidden;
and acquiring a third probability value of binary tree partitioning through the binary tree structure classifier; if the third probability value is greater than a first preset probability threshold of the binary tree, forbidding the attempt process of ternary tree partitioning; otherwise, if the third probability value is less than a second preset probability threshold of the binary tree, forbidding the attempt process of binary tree partitioning.
2. The method of claim 1, characterized in that: the step of skipping the attempt process of the coding mode according to the skip mode list and performing the coding process further comprises the following steps:
acquiring a mode list, and entering a division trial process of list options;
skipping the prohibited partitioning attempt procedure according to the skip mode list;
and entering an intra-frame decision process, acquiring an intra-frame mode and coding.
3. The method of claim 2, characterized in that: the step of entering the intra-frame decision process, acquiring the intra-frame mode and coding further comprises the following steps:
starting an intra-frame decision process, loading an intra-frame mode classifier, and performing a first round of intra-frame mode selection;
acquiring a fourth probability value of skipping the current mode through an intra-frame mode classifier, and if the fourth probability value is smaller than a preset third probability threshold, performing a second round of intra-frame mode selection;
and coding according to the selected intra-frame mode and a preset coding flow.
4. A system for fast dividing image characteristic information frames, characterized in that the system comprises:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement a method for fast intra image feature information partitioning as claimed in any one of claims 1 to 3.
5. A system for fast dividing image characteristic information frames, characterized in that the system comprises:
the preprocessing module is used for acquiring a training video sequence and extracting characteristics of the video sequence to acquire training data;
the classifier training module is used for training a support vector machine through the training data to obtain a classifier;
the classifier dividing module is used for filtering the mode list through the trained classifier to obtain a skip mode list;
the encoding module is used for skipping the trial process of the encoding mode according to the skipping mode list and carrying out an encoding process;
the step of acquiring the training video sequence and extracting the characteristics of the video sequence to acquire the training data comprises the following steps:
acquiring a gray level co-occurrence matrix of the characteristic data;
acquiring the gradient and the absolute value of the characteristic data through a Sobel operator;
training a support vector machine by using the gray level co-occurrence matrix and the gradient and absolute value to obtain a classifier;
the classifier comprises a quadtree structure classifier, a vertical structure classifier, a binary tree structure classifier and an intra-frame mode classifier;
the step of training a support vector machine to obtain a classifier through the training data further comprises the following steps:
performing convolution calculation processing on the pixel matrix through a Sobel operator template to obtain a first-order horizontal gradient matrix and a first-order vertical gradient matrix of the image;
acquiring vertical matrix energy values and horizontal matrix energy values corresponding to vertical and horizontal gray level co-occurrence matrixes, vertical matrix contrast and horizontal matrix contrast, vertical matrix entropy and horizontal matrix entropy, and vertical matrix inverse difference and horizontal matrix inverse difference;
obtaining the quadtree partition depth of the left, upper and upper left coded blocks of the current partition block to be coded;
acquiring the difference of the pixel mean values of the left area and the right area of the current to-be-coded partition block and the first pixel mean value difference of the upper area and the lower area;
acquiring a second pixel mean value difference of different divided areas of the current block to be coded under the condition of the ternary tree division;
taking a vertical matrix energy value and a horizontal matrix energy value corresponding to the vertical and horizontal gray level co-occurrence matrix, a vertical matrix contrast and a horizontal matrix contrast, a vertical matrix entropy and a horizontal matrix entropy, a vertical matrix inverse difference and a horizontal matrix inverse difference, a quadtree division depth, a first pixel mean value difference and a second pixel mean value difference as input data, and training the support vector machine to obtain a classifier;
the step of filtering the pattern list through the trained classifier to obtain the skip pattern list comprises the following steps:
obtaining a first probability value divided into a quad-tree structure through a quad-tree structure classifier, and if the first probability value is greater than a first preset probability threshold of the quad-tree, directly dividing the quad-tree; if the first probability value is lower than a second preset probability threshold of the quadtree, forbidding an attempt process of quadtree division;
acquiring a second probability value divided into vertical structures through a vertical structure classifier, and if the second probability value is larger than a first preset probability threshold value of the vertical structure, forbidding an attempt process of horizontal division; if the second probability value is smaller than a second preset probability threshold value of the vertical structure, forbidding the trial process of vertical division;
and acquiring a third probability value of binary tree partitioning through the binary tree structure classifier; if the third probability value is greater than a first preset probability threshold of the binary tree, forbidding the attempt process of ternary tree partitioning; otherwise, if the third probability value is less than a second preset probability threshold of the binary tree, forbidding the attempt process of binary tree partitioning.
6. A storage medium having stored therein processor-executable instructions, which when executed by a processor, are configured to perform a method of fast partitioning within an image feature information frame as claimed in any one of claims 1 to 3.
CN202010061114.2A 2020-01-19 2020-01-19 Method, system and storage medium for fast dividing image characteristic information frame Active CN111259826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010061114.2A CN111259826B (en) 2020-01-19 2020-01-19 Method, system and storage medium for fast dividing image characteristic information frame


Publications (2)

Publication Number Publication Date
CN111259826A CN111259826A (en) 2020-06-09
CN111259826B true CN111259826B (en) 2023-03-14

Family

ID=70949031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010061114.2A Active CN111259826B (en) 2020-01-19 2020-01-19 Method, system and storage medium for fast dividing image characteristic information frame

Country Status (1)

Country Link
CN (1) CN111259826B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116193147B (en) * 2022-10-19 2023-07-18 宁波康达凯能医疗科技有限公司 Inter-frame image coding method based on decision tree support vector machine

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106412589A (en) * 2016-09-23 2017-02-15 四川长虹电器股份有限公司 HEVC intraframe coding method based on support vector radix
CN108737819A (en) * 2018-05-20 2018-11-02 北京工业大学 A kind of flexible coding unit division methods based on quaternary tree binary tree structure
CN110650338A (en) * 2019-09-20 2020-01-03 中山大学 Method, system and storage medium for dividing multifunctional video coding frame
CN110691254A (en) * 2019-09-20 2020-01-14 中山大学 Quick judgment method, system and storage medium for multifunctional video coding


Also Published As

Publication number Publication date
CN111259826A (en) 2020-06-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant