CN116385369A - Depth image quality evaluation method and device, electronic equipment and storage medium - Google Patents
Depth image quality evaluation method and device, electronic equipment and storage medium
- Publication number
- CN116385369A (application number CN202310225423.2A)
- Authority
- CN
- China
- Prior art keywords
- depth image
- block
- feature vectors
- hypergraph
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The application relates to a depth image quality evaluation method and device, an electronic device, and a storage medium. The method comprises the following steps: obtaining a depth image to be evaluated and a corresponding reference RGB image, and partitioning both images in the same preset manner to obtain a plurality of non-overlapping depth tiles and RGB tiles; extracting feature vectors of the depth tiles and the RGB tiles respectively; constructing a hypergraph based on the mapped feature vectors; calculating an incidence matrix of the hypergraph; and generating an overall quality score of the depth image based on the local quality score of each tile. This addresses several shortcomings of the related art: incomplete work on depth image quality evaluation, the inability to compute an accurate quality score for a distorted depth image when no reference depth image is available, reliance on only simple features, lack of generality, low computational efficiency, and neglect of the high-order correlations inside the depth image.
Description
Technical Field
The present invention relates to the field of quality evaluation technologies, and in particular, to a depth image quality evaluation method and apparatus, an electronic device, and a storage medium.
Background
Today, the rapid development of hardware and computing technology has enabled many applications of depth images, such as 3D video. As a complement to RGB data, depth images also help to address many challenging tasks, such as salient object detection in a scene. Generally, a depth image may be obtained in two ways: with a depth camera (active) or by stereo matching (passive). However, existing stereo matching methods are often affected by factors such as occlusion or lack of texture and therefore produce inaccurate depth images; and although a depth image can be captured with a depth camera, its quality is still affected by inherent sensor noise. Distortion in the depth image in turn degrades the performance of all related downstream tasks. Therefore, depth image quality assessment is necessary and important.
In the related art, an attention-based feature fusion method may process the images of a data set whose quality is to be predicted through a low-level texture feature extraction network, a low-level contour feature extraction network, and a high-level global semantic feature extraction network; the extracted multi-level features are fused, a loss function suited to the overall model is designed, and an optimal quality evaluation prediction score is obtained.
However, work on depth image quality evaluation in the related art is insufficient: a reasonably accurate quality score cannot be calculated for a distorted depth image when no reference depth image is available, only simple features are considered, generality is lacking, computational efficiency is low, and the high-order correlations inside the depth image are ignored. These problems need to be solved.
Disclosure of Invention
The application provides a depth image quality evaluation method and device, an electronic device, and a storage medium, which address the following problems in the related art: the lack of work on depth image quality evaluation, the inability to compute an accurate quality score for a distorted depth image when no reference depth image is available, reliance on only simple features, lack of generality, low computational efficiency, and neglect of the high-order correlations inside the depth image.
An embodiment of a first aspect of the present application provides a depth image quality evaluation method, including the following steps: obtaining a depth image to be evaluated and a corresponding reference RGB image, and partitioning the depth image and the reference RGB image in the same preset manner to obtain a plurality of non-overlapping depth tiles and RGB tiles; extracting feature vectors of the depth tiles and the RGB tiles respectively; mapping the feature vectors of the depth tiles and the RGB tiles to the same Euclidean space, and constructing a hypergraph based on the mapped feature vectors; and calculating an incidence matrix of the hypergraph, performing hypergraph convolution on all the tile vectors by using the incidence matrix to obtain a local quality score of each tile, and generating an overall quality score of the depth image based on the local quality score of each tile.
Optionally, in an embodiment of the present application, mapping the feature vectors of the depth tiles and the RGB tiles to the same Euclidean space includes: mapping the feature vectors of the depth tiles and the RGB tiles through a linear transformation to obtain the mapped feature vectors.
Optionally, in an embodiment of the present application, constructing a hypergraph based on the mapped feature vectors includes: scaling all the mapped feature vectors to a preset interval with a preset Sigmoid function, so as to represent the probability that each tile falls in a hyperedge, and generating the hypergraph based on these probabilities.
Optionally, in an embodiment of the present application, generating the overall quality score of the depth image based on the local quality score of each tile includes: performing hypergraph convolution on all feature vectors with the incidence matrix, by means of a quality predictor based on preset hypergraph convolutions, to obtain a one-dimensional vector after multiple rounds of propagation; and obtaining the local quality score of the corresponding tile from the one-dimensional vector.
Optionally, in an embodiment of the present application, generating the overall quality score of the depth image based on the local quality score of each tile further includes: averaging the local quality scores of all the tiles to obtain an average value; and obtaining the overall quality score of the depth image from the average value.
An embodiment of a second aspect of the present application provides a depth image quality evaluation apparatus, including: an acquisition module, configured to acquire a depth image to be evaluated and a corresponding reference RGB image, and partition the depth image and the reference RGB image in the same preset manner to obtain a plurality of non-overlapping depth tiles and RGB tiles; an extraction module, configured to extract the feature vectors of the depth tiles and the RGB tiles respectively; a construction module, configured to map the feature vectors of the depth tiles and the RGB tiles to the same Euclidean space and construct a hypergraph based on the mapped feature vectors; and a generation module, configured to calculate the incidence matrix of the hypergraph, perform hypergraph convolution on all the tile vectors with the incidence matrix to obtain the local quality score of each tile, and generate the overall quality score of the depth image based on the local quality score of each tile.
Optionally, in one embodiment of the present application, the construction module includes: a first obtaining unit, configured to map the feature vectors of the depth tiles and the RGB tiles through a linear transformation to obtain the mapped feature vectors.
Optionally, in one embodiment of the present application, the construction module includes: a characterization unit, configured to scale all the mapped feature vectors to a preset interval with a preset Sigmoid function, so as to represent the probability that each tile falls in a hyperedge, and generate the hypergraph based on these probabilities.
Optionally, in one embodiment of the present application, the generation module includes: a convolution unit, configured to perform hypergraph convolution on all feature vectors with the incidence matrix, by means of a quality predictor based on preset hypergraph convolutions, obtaining a one-dimensional vector after multiple rounds of propagation; and a second obtaining unit, configured to obtain the local quality score of the corresponding tile from the one-dimensional vector.
Optionally, in one embodiment of the present application, the generation module further includes: a third obtaining unit, configured to average the local quality scores of all the tiles to obtain an average value; and a fourth obtaining unit, configured to obtain the overall quality score of the depth image from the average value.
An embodiment of a third aspect of the present application provides an electronic device, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor executes the program to implement the depth image quality evaluation method of the above embodiments.
The fourth aspect of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the depth image quality evaluation method as above.
According to the embodiments of the application, more comprehensive shape and semantic features of the distorted depth image can be extracted, and the hypergraph structure is used to mine the high-order information associations inside the depth image and between the depth image and the corresponding reference RGB image. Thus, accurate local and overall quality scores of the depth image can be calculated even when no reference depth image is available, the generality of the evaluation method and the computational efficiency are improved, and prior assistance is provided for downstream applications involving depth images. This addresses the problems in the related art of incomplete work on depth image quality evaluation, the inability to compute an accurate quality score for a distorted depth image when no reference depth image is available, reliance on only simple features, lack of generality, low computational efficiency, and neglect of the high-order correlations inside the depth image.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
fig. 1 is a flowchart of a depth image quality evaluation method according to an embodiment of the present application;
fig. 2 is a schematic diagram of a depth image quality evaluation method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a depth image quality evaluation device according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present application and are not to be construed as limiting the present application.
The depth image quality evaluation method and device, the electronic device, and the storage medium according to the embodiments of the present application are described below with reference to the accompanying drawings. In view of the shortcomings of the related art discussed in the background (the lack of work on depth image quality evaluation, the inability to compute an accurate quality score for a distorted depth image when no reference depth image is available, reliance on only simple features, lack of generality, low computational efficiency, and neglect of the high-order correlations inside the depth image), the present application provides a depth image quality evaluation method that solves these problems.
Specifically, fig. 1 is a schematic flow chart of a depth image quality evaluation method according to an embodiment of the present application.
As shown in fig. 1, the depth image quality evaluation method includes the steps of:
in step S101, a depth image to be evaluated and a corresponding reference RGB image are obtained, and the depth image and the reference RGB image are subjected to block processing in the same preset block manner, so as to obtain a plurality of non-overlapping depth tiles and RGB tiles.
It can be appreciated that the embodiment of the present application may first acquire a depth image to be evaluated and a corresponding reference RGB image, for example representing the depth image as X_dm and the RGB image as X_rgb, and divide the two images into non-overlapping tiles: X_dm = {x_dm1, x_dm2, …, x_dmN} and X_rgb = {x_rgb1, x_rgb2, …, x_rgbN}. The depth image and the reference RGB image may be divided according to the same preset partition, where N denotes the number of tiles. For convenience of calculation, the numbers of divisions in the transverse and longitudinal directions may be set to the same value, so that N is a square number. The larger the number of tiles, the finer the quality evaluation result, but the computation time increases accordingly, so an optimal value may be selected.
For example, N may be set to 7×7 = 49; this value preserves computational efficiency while yielding a reasonably accurate result. Depending on the practical application scenario, the value of N may be adjusted accordingly, and a specific value can be chosen adaptively by a person skilled in the art, which is not limited herein. In this embodiment, the original size of both the depth image and the RGB image is 224×224, so each tile is 32×32.
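The partition step above can be sketched as follows; this is a minimal NumPy illustration, and the function and variable names are not from the patent:

```python
import numpy as np

def partition(image, n_side):
    """Split an H x W (x C) image into n_side * n_side non-overlapping tiles."""
    h, w = image.shape[0], image.shape[1]
    th, tw = h // n_side, w // n_side  # tile size: 32 for a 224-pixel side and n_side = 7
    return [image[i * th:(i + 1) * th, j * tw:(j + 1) * tw]
            for i in range(n_side) for j in range(n_side)]

depth = np.zeros((224, 224))       # depth image to be evaluated, 1 channel
rgb = np.zeros((224, 224, 3))      # corresponding reference RGB image, 3 channels
depth_tiles = partition(depth, 7)  # the same preset partition is applied to both images
rgb_tiles = partition(rgb, 7)
```

Both lists then hold N = 49 tiles in the same spatial order, so the i-th depth tile and the i-th RGB tile cover the same region.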
By partitioning the depth image and the reference RGB image in the same preset manner, a plurality of non-overlapping depth tiles and RGB tiles are obtained. This facilitates the subsequent extraction of feature vectors from the depth tiles and RGB tiles respectively, so that more comprehensive shape and semantic features of the distorted depth image are extracted and the generality of the evaluation method is improved.
In step S102, feature vectors of the depth tile and the RGB tile are extracted, respectively.
It can be appreciated that the embodiment of the present application may extract the feature vectors of the depth tiles and the RGB tiles respectively, for example by using a dual-stream neural network model whose two branches extract the feature vectors of the depth tiles and the RGB tiles respectively.
As a possible implementation, the embodiment of the present application may introduce a dual-stream neural network model composed of two branches, a depth image stream branch and an RGB image stream branch. Each branch is a feature embedding module used to extract the feature vectors of the depth tiles or RGB tiles; the two branches differ only in the number of input channels, which is 1 for the depth image stream and 3 for the RGB image stream.
Specifically, as shown in fig. 2, each branch contains 10 main layers: 5 convolutional layers (conv7-32, conv5-64, conv3-128, conv3-256 and conv1-512) alternating with 5 max-pooling layers with 2×2 kernels. Each convolutional layer is followed by a batch normalization layer and a ReLU activation layer. The last layer outputs a one-dimensional vector of size 512, which is the feature vector of the input tile. The extraction process may be expressed as f_ti = E_t(x_ti; θ_t), where t ∈ {rgb, dm} corresponds to the data type and θ_t denotes the parameters of the corresponding branch. Since the dual-stream network extracts multi-modal feature vectors, the embodiment of the application can extract comprehensive feature vectors of the depth tiles and RGB tiles.
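The layer arithmetic above can be checked with a short sketch. It assumes the convolutions are 'same'-padded (the patent does not state the padding), so that only the 2×2 max-pooling layers halve the spatial size, taking a 32×32 tile down to a 512-channel 1×1 output, i.e. a 512-dimensional feature vector:

```python
# Layer spec from the description: 5 conv layers alternating with 5 max-pool layers.
convs = [("conv7", 32), ("conv5", 64), ("conv3", 128), ("conv3", 256), ("conv1", 512)]

def trace_shapes(size):
    """Trace (layer name, channels, spatial size) through one branch."""
    shapes = []
    for name, channels in convs:
        shapes.append((name, channels, size))  # 'same'-padded conv keeps size (assumption)
        size //= 2                             # 2x2 max pooling halves each dimension
        shapes.append(("maxpool2", channels, size))
    return shapes

shapes = trace_shapes(32)  # a 32x32 input tile
final = shapes[-1]         # 512 channels at spatial size 1: a 512-d vector per tile
```

With a 7×7 partition of a 224×224 image, this confirms that five pooling stages exactly exhaust the 32×32 tile, which is likely why the patent chooses N = 7×7.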
By extracting the feature vectors of the depth tiles and the feature vectors of the RGB tiles separately, the subsequent mining of high-order information associations inside the depth image and between the depth image and the corresponding reference RGB image is facilitated, further improving the generality of the evaluation method.
In step S103, feature vectors of the depth tiles and the RGB tiles are mapped to the same Euclidean space, and a hypergraph is constructed based on the mapped feature vectors.
According to the embodiment of the application, the feature vectors of the depth tiles and the RGB tiles can be mapped to the same Euclidean space, the mapped features of the depth tiles and the RGB tiles can be concatenated, and the hypergraph can be constructed more accurately from the mapped feature vectors, further improving the generality of the evaluation method and the computational efficiency and providing prior assistance for downstream applications involving depth images.
Optionally, in one embodiment of the present application, mapping the feature vectors of the depth tiles and the RGB tiles to the same Euclidean space includes: mapping the feature vectors of the depth tiles and the RGB tiles through a linear transformation to obtain the mapped feature vectors.
It can be appreciated that the embodiment of the present application may map the feature vectors of the depth tiles and the RGB tiles to the same Euclidean space through a linear transformation to obtain the mapped feature vectors. Let F_dm denote the feature vector set of the depth tiles and F_rgb the feature vector set of the RGB tiles; the linear transformation may be expressed as:
F′_t = F_t × W_t + b_t
where t ∈ {rgb, dm} corresponds to the data type, and W_t and b_t are the parameters of the linear transformation.
Through this linear transformation, the feature vectors of the two types of data are projected into the same Euclidean space; the projected results are F′_dm and F′_rgb.
Further, the two sets of mapped features are concatenated to obtain F′ = {F′_dm, F′_rgb}.
Mapping the feature vectors of the depth tiles and the RGB tiles through a linear transformation to obtain the mapped feature vectors ensures that the hypergraph structure can mine the high-order information associations inside the depth image and between the depth image and the corresponding reference RGB image, so that accurate local and overall depth image quality scores can be calculated without a reference depth image, and computational efficiency is improved.
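A minimal sketch of the mapping and concatenation step; the mapped dimension (256) and the random placeholder values are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, d_map = 49, 512, 256            # tiles per image, feature dim; d_map is assumed

F_dm = rng.standard_normal((N, d))    # depth-tile feature vectors from the depth branch
F_rgb = rng.standard_normal((N, d))   # RGB-tile feature vectors from the RGB branch

# One learnable linear map per modality: F'_t = F_t x W_t + b_t
W = {t: rng.standard_normal((d, d_map)) * 0.01 for t in ("dm", "rgb")}
b = {t: np.zeros(d_map) for t in ("dm", "rgb")}

F_dm_p = F_dm @ W["dm"] + b["dm"]     # depth features projected into the shared space
F_rgb_p = F_rgb @ W["rgb"] + b["rgb"] # RGB features projected into the shared space
F_p = np.concatenate([F_dm_p, F_rgb_p], axis=0)  # F' = {F'_dm, F'_rgb}: 2N vectors
```

Concatenating along the tile axis leaves 2N vectors that live in one Euclidean space, which is what the hypergraph construction below operates on.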
Optionally, in an embodiment of the present application, constructing the hypergraph based on the mapped feature vectors includes: scaling all the mapped feature vectors to a preset interval with a preset Sigmoid function, so as to represent the probability that each tile falls in a hyperedge, and generating the hypergraph based on these probabilities.
In the actual execution process, the embodiment of the application may calculate the inner products among the mapped feature vectors and scale them to the range 0–1 with the Sigmoid function, so as to represent the probability that a tile falls in a hyperedge, and generate the hypergraph from these probabilities. The inner product of the mapped features F′ with their transpose F′ᵀ is taken, and the Sigmoid function constrains the values to the range 0–1:
H = σ(F′ × F′ᵀ)
where σ is the Sigmoid function and H is the indication matrix of the hypergraph; the values in H represent the probability that a point, i.e. a tile, falls in a hyperedge.
Compared with a conventional indication matrix containing only the two values 0 and 1, the soft matrix H constructed here represents the hypergraph structure more accurately.
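The construction of H can be illustrated with a small sketch; the feature shapes and values are placeholders standing in for the mapped tile features:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
F_p = rng.standard_normal((98, 256)) * 0.05  # 2N mapped tile features (placeholder)

# Soft indication matrix: H[i, j] is the probability that tile i falls in the
# hyperedge centred on tile j, rather than a hard 0/1 membership.
H = sigmoid(F_p @ F_p.T)
```

Because each entry is a Sigmoid of an inner product, every value lies strictly between 0 and 1, giving the soft membership the text describes.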
In step S104, an incidence matrix of the hypergraph is calculated, and the hypergraph convolution is performed on all the tile vectors by using the incidence matrix, so as to obtain a local quality score of each tile, and an overall quality score of the depth image is generated based on the local quality score of each tile.
As a possible implementation, the embodiment of the present application may calculate the incidence matrix of the hypergraph as follows:
L = D_v^(−1/2) × H × W × D_e^(−1) × Hᵀ × D_v^(−1/2)
where the matrix L represents the connection relations between the points in the hypergraph, H is the indication matrix of the hypergraph (whose values represent the probability that a point falls in a hyperedge), W is the weight matrix of the hyperedges, and D_v and D_e are the diagonal matrices of the point degrees and the hyperedge degrees, respectively. The degree of a point is defined as
d(v) = Σ_e w(e) × H(v, e)
and the degree of a hyperedge is defined as
δ(e) = Σ_v H(v, e).
From the calculated incidence matrix of the hypergraph, hypergraph convolution can be performed on the feature vectors of all tiles to obtain the local quality score of each tile, and the overall quality score of the depth image to be evaluated is generated from the local quality scores. Further, by extracting more comprehensive shape and semantic features of the distorted depth image and using the hypergraph structure to mine the high-order information associations between the depth image and the interior of the RGB image, the embodiment of the application can still calculate accurate local and overall quality scores of the depth image without a reference depth image, improving the generality of the evaluation method and the accuracy of the quality scores and providing prior assistance for downstream applications involving depth images.
Optionally, in one embodiment of the present application, generating the overall quality score of the depth image based on the local quality score of each tile includes: performing hypergraph convolution on all feature vectors with the incidence matrix, by means of a quality predictor based on preset hypergraph convolutions, to obtain a one-dimensional vector after multiple rounds of propagation; and obtaining the local quality scores of the corresponding tiles from the one-dimensional vector.
It can be appreciated that the input data of the quality predictor based on preset hypergraph convolutions may be the incidence matrix L of the hypergraph and the feature vectors F extracted from the depth tiles and RGB tiles. The quality predictor contains 3 hypergraph convolution layers in total, with a ReLU activation and a Dropout operation performed after each of the first two hypergraph convolutions.
For example, embodiments of the present application may define the layer-wise propagation rule of the quality predictor as:
X^(t+1) = α(M^(t) ⊙ (L × X^(t) × Θ^(t)))
where M^(t) is a random mask vector (the Dropout mask), α is the ReLU activation function, X^(t+1) is the output of layer t+1, X^(0) = F, and Θ^(t) is the learnable parameter of the hypergraph convolution at layer t.
After three layers of propagation, the quality predictor yields a one-dimensional vector X^(3), the local quality evaluation scores corresponding to the tiles:
Q = X^(3)
according to the embodiment of the application, the accuracy of the mass fraction is improved by carrying out local mass fraction calculation on different image blocks, and the embodiment of the application can be applied to multiple complex conditions, so that the universality is improved.
Optionally, in one embodiment of the present application, generating the overall quality score of the depth image based on the local quality score of each tile further includes: averaging the local quality scores of all tiles to obtain an average value; and taking the average value as the overall quality score of the depth image.
In actual execution, the embodiment of the application may average the evaluation scores Q of all 2N tiles to obtain the overall quality score of the depth image:

q = ΣQ / 2N
According to the embodiment of the application, the overall quality score is obtained by summing and averaging, so that the score better fits the actual situation and has higher accuracy.
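A minimal sketch of this summing-and-averaging step (Python with NumPy; the local scores below are made-up values for illustration):

```python
import numpy as np

# Local quality scores Q of 2N = 4 tiles (assumed values, not from the patent)
Q = np.array([0.82, 0.64, 0.71, 0.91])
q = Q.sum() / Q.size   # q = sum(Q) / 2N, the overall quality score
assert abs(q - 0.77) < 1e-9
```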
According to the depth image quality evaluation method provided by the embodiment of the application, by extracting more comprehensive shape and semantic features of the distorted depth image and using the hypergraph structure to mine higher-order information associations within the depth image and between the depth image and the corresponding reference RGB image, accurate local and overall quality scores of the depth image can be computed even when no reference depth image is available. This improves the universality of the evaluation method and the calculation efficiency, and provides prior support for downstream applications that rely on depth images. It thereby addresses the problems in the related art that depth image quality evaluation remains underexplored, that an accurate quality score cannot be computed for a distorted depth image when no reference depth image is available, and that existing methods consider only simple features, lack universality, have low calculation efficiency, and ignore the higher-order associations within depth images.
Next, a depth image quality evaluation apparatus according to an embodiment of the present application will be described with reference to the accompanying drawings.
Fig. 3 is a schematic structural diagram of a depth image quality evaluation apparatus according to an embodiment of the present application.
As shown in fig. 3, the depth image quality evaluation apparatus 10 includes: the device comprises an acquisition module 100, an extraction module 200, a construction module 300 and a generation module 400.
Specifically, the obtaining module 100 is configured to obtain a depth image to be evaluated and a corresponding reference RGB image, and to partition both images with the same preset blocking scheme, obtaining a plurality of non-overlapping depth tiles and RGB tiles.
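The blocking performed by the obtaining module 100 can be sketched as follows (Python with NumPy; the 64×64 image sizes and the 32-pixel tile size are assumptions for illustration — the embodiment does not fix these values):

```python
import numpy as np

def split_into_tiles(image, tile):
    """Split an H x W (x C) image into non-overlapping tile x tile blocks,
    discarding any remainder at the borders."""
    h, w = image.shape[:2]
    return [image[y:y + tile, x:x + tile]
            for y in range(0, h - h % tile, tile)
            for x in range(0, w - w % tile, tile)]

# The same preset blocking is applied to both images so that the N depth
# tiles and N RGB tiles correspond one-to-one.
depth = np.zeros((64, 64), dtype=np.float32)
rgb = np.zeros((64, 64, 3), dtype=np.uint8)
depth_tiles = split_into_tiles(depth, 32)
rgb_tiles = split_into_tiles(rgb, 32)
assert len(depth_tiles) == len(rgb_tiles) == 4
assert depth_tiles[0].shape == (32, 32)
```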
The extracting module 200 is configured to extract feature vectors of the depth tile and the RGB tile, respectively.
The construction module 300 is configured to map the feature vectors of the depth tiles and the RGB tiles into the same Euclidean space and to construct a hypergraph based on the mapped feature vectors.
The generating module 400 is configured to calculate the incidence matrix of the hypergraph, perform hypergraph convolution on the feature vectors of all tiles using the incidence matrix to obtain the local quality score of each tile, and generate the overall quality score of the depth image based on the local quality scores.
Optionally, in one embodiment of the present application, the building module 300 includes: a first acquisition unit.
The first obtaining unit is configured to map the feature vectors of the depth tiles and the RGB tiles through a linear transformation to obtain the mapped feature vectors.
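As an illustration of this linear mapping (Python with NumPy; the feature dimensions and the projection matrices Wd and Wr are hypothetical — the embodiment only states that a linear transformation maps both feature sets into a common space):

```python
import numpy as np

rng = np.random.default_rng(0)

f_depth = rng.normal(size=(4, 32))   # N = 4 depth-tile features (assumed 32-dim)
f_rgb = rng.normal(size=(4, 48))     # N = 4 RGB-tile features of a different dimension
Wd = rng.normal(size=(32, 16))       # hypothetical linear maps taking both feature
Wr = rng.normal(size=(48, 16))       # sets into a shared 16-dim Euclidean space

# Stack the mapped features into one matrix F of 2N rows, as used downstream.
F = np.vstack([f_depth @ Wd, f_rgb @ Wr])
assert F.shape == (8, 16)
```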
Optionally, in one embodiment of the present application, the construction module 300 includes: a characterization unit.

The characterization unit is configured to scale all mapped feature vectors into a preset interval using a preset Sigmoid function, so as to characterize the probability that each tile falls within a hyperedge, and to generate the hypergraph based on these probabilities.
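A sketch of this characterization step (Python with NumPy; the projection matrix W_edge and all dimensions are hypothetical, since the embodiment specifies only that a Sigmoid scales the mapped features into a preset interval to represent hyperedge membership probabilities):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

F_mapped = rng.normal(size=(8, 16))  # 2N = 8 mapped tile feature vectors (assumed)
W_edge = rng.normal(size=(16, 4))    # hypothetical projection onto E = 4 hyperedges

# Soft incidence matrix: each entry lies in (0, 1) and characterizes the
# probability that a tile falls within a hyperedge.
H = sigmoid(F_mapped @ W_edge)
assert H.shape == (8, 4)
assert ((H > 0.0) & (H < 1.0)).all()
```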
Optionally, in one embodiment of the present application, the generating module 400 includes: a convolution unit and a second acquisition unit.
The convolution unit is configured to perform hypergraph convolution on all feature vectors using the incidence matrix, based on the preset hypergraph-convolution-based quality predictor, and to obtain a one-dimensional vector after multiple rounds of propagation.

The second obtaining unit is configured to obtain the local quality score of each corresponding tile based on the one-dimensional vector.
Optionally, in an embodiment of the present application, the generating module 400 further includes: a third acquisition unit and a fourth acquisition unit.
The third obtaining unit is configured to average the local quality scores of all tiles to obtain an average value.

The fourth obtaining unit is configured to take the average value as the overall quality score of the depth image.
It should be noted that the foregoing explanation of the embodiment of the depth image quality evaluation method is also applicable to the depth image quality evaluation device of this embodiment, and will not be repeated here.
According to the depth image quality evaluation apparatus provided by the embodiment of the application, by extracting more comprehensive shape and semantic features of the distorted depth image and using the hypergraph structure to mine higher-order information associations within the depth image and between the depth image and the corresponding reference RGB image, accurate local and overall quality scores of the depth image can be computed even when no reference depth image is available. This improves the universality of the evaluation method and the calculation efficiency, and provides prior support for downstream applications that rely on depth images. It thereby addresses the problems in the related art that depth image quality evaluation remains underexplored, that an accurate quality score cannot be computed for a distorted depth image when no reference depth image is available, and that existing methods consider only simple features, lack universality, have low calculation efficiency, and ignore the higher-order associations within depth images.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may include:
a memory 401 for storing a computer program executable on the processor 402;

a processor 402, which implements the depth image quality evaluation method provided in the above embodiments when executing the program; and

a communication interface 403 for communication between the memory 401 and the processor 402.
If the memory 401, the processor 402, and the communication interface 403 are implemented independently, they may be connected to one another by a bus and communicate with one another. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, among others. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in Fig. 4, but this does not mean that there is only one bus or only one type of bus.
Alternatively, in a specific implementation, if the memory 401, the processor 402, and the communication interface 403 are integrated on a chip, the memory 401, the processor 402, and the communication interface 403 may complete communication with each other through internal interfaces.
The processor 402 may be a central processing unit (Central Processing Unit, abbreviated as CPU), or an application specific integrated circuit (Application Specific Integrated Circuit, abbreviated as ASIC), or one or more integrated circuits configured to implement embodiments of the present application.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the depth image quality evaluation method as above.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features thereof, provided they do not contradict each other.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "N" is at least two, such as two, three, etc., unless explicitly defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes additional implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in the reverse order, depending on the functionality involved, as would be understood by those skilled in the art of the embodiments of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection (electronic device) having one or N wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program can be captured electronically, for example by optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one of, or a combination of, the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a Programmable Gate Array (PGA), a Field-Programmable Gate Array (FPGA), and the like.
Those of ordinary skill in the art will appreciate that all or part of the steps carried out by the method of the above-described embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application; variations, modifications, substitutions, and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the application.
Claims (10)
1. A depth image quality evaluation method, comprising the steps of:
obtaining a depth image to be evaluated and a corresponding reference RGB image, and performing block processing on the depth image and the reference RGB image in the same preset block mode to obtain a plurality of non-overlapping depth image blocks and RGB image blocks;
extracting feature vectors of the depth image block and the RGB image block respectively;
mapping the feature vectors of the depth image block and the RGB image block to the same Euclidean space, and constructing a hypergraph based on the mapped feature vectors; and
and calculating an incidence matrix of the hypergraph, performing hypergraph convolution on the feature vectors of all the blocks by utilizing the incidence matrix to obtain the local quality score of each block, and generating the overall quality score of the depth image based on the local quality score of each block.
2. The method of claim 1, wherein the mapping of the feature vectors of the depth image block and the RGB image block to the same Euclidean space comprises:
and mapping the feature vectors of the depth image block and the RGB image block through linear transformation to obtain the mapping feature vector.
3. The method of claim 1, wherein the constructing a hypergraph based on the mapped feature vectors comprises:
and scaling all the mapping feature vectors into a preset interval by using a preset Sigmoid function so as to represent the probability that the image block falls within a hyperedge, and generating the hypergraph based on the probability.
4. The method of claim 1, wherein the generating the overall quality score of the depth image based on the local quality score for each tile comprises:
performing, by a quality predictor based on the preset hypergraph convolution, hypergraph convolution on all feature vectors by using the incidence matrix, and obtaining a one-dimensional vector after multiple rounds of propagation;
and obtaining the local quality score of the corresponding block based on the one-dimensional vector.
5. The method of claim 4, wherein the generating the overall quality score of the depth image based on the local quality score for each tile further comprises:
averaging the local quality scores of all the image blocks to obtain an average value;

and taking the average value as the overall quality score of the depth image.
6. A depth image quality evaluation apparatus, comprising:
the acquisition module is used for acquiring a depth image to be evaluated and a corresponding reference RGB image, and carrying out blocking processing on the depth image and the reference RGB image in the same preset blocking mode to obtain a plurality of non-overlapping depth image blocks and RGB image blocks;
the extraction module is used for respectively extracting the feature vectors of the depth image block and the RGB image block;
the construction module is used for mapping the feature vectors of the depth image block and the RGB image block to the same Euclidean space and constructing a hypergraph based on the mapped feature vectors; and
the generation module is used for calculating the incidence matrix of the hypergraph, performing hypergraph convolution on the feature vectors of all the blocks by utilizing the incidence matrix to obtain the local quality score of each block, and generating the overall quality score of the depth image based on the local quality score of each block.
7. The apparatus of claim 6, wherein the build module comprises:
and the obtaining unit is used for mapping the feature vectors of the depth image block and the RGB image block through linear transformation to obtain the mapping feature vector.
8. The apparatus of claim 6, wherein the build module comprises:
and the characterization unit is used for scaling all the mapping feature vectors into a preset interval by utilizing a preset Sigmoid function so as to characterize the probability that the image block falls within a hyperedge, and generating the hypergraph based on the probability.
9. An electronic device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the depth image quality assessment method according to any one of claims 1 to 5.
10. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the depth image quality evaluation method according to any one of claims 1 to 5.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310115001X | 2023-02-07 | ||
CN202310115001 | 2023-02-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116385369A true CN116385369A (en) | 2023-07-04 |
Family
ID=86960602
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310225423.2A Pending CN116385369A (en) | 2023-02-07 | 2023-03-01 | Depth image quality evaluation method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116385369A (en) |
Cited By (2)

Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117152083A (en) * | 2023-08-31 | 2023-12-01 | 哈尔滨工业大学 | Ground penetrating radar road disease image prediction visualization method based on category activation mapping |
CN117152083B (en) * | 2023-08-31 | 2024-04-09 | 哈尔滨工业大学 | Ground penetrating radar road disease image prediction visualization method based on category activation mapping |

- 2023-03-01 CN CN202310225423.2A patent/CN116385369A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111160375B (en) | Three-dimensional key point prediction and deep learning model training method, device and equipment | |
CN111179419B (en) | Three-dimensional key point prediction and deep learning model training method, device and equipment | |
CN106855952B (en) | Neural network-based computing method and device | |
CN108961180B (en) | Infrared image enhancement method and system | |
US11354238B2 (en) | Method and device for determining memory size | |
EP2293243A2 (en) | Image processing apparatus, image capture apparatus, image processing method, and program | |
WO2022242122A1 (en) | Video optimization method and apparatus, terminal device, and storage medium | |
CN115147598A (en) | Target detection segmentation method and device, intelligent terminal and storage medium | |
CN116385369A (en) | Depth image quality evaluation method and device, electronic equipment and storage medium | |
JP2019008421A (en) | Processing method, program, information processing apparatus, and image processing apparatus | |
CN113643311B (en) | Image segmentation method and device with robust boundary errors | |
CN108986210B (en) | Method and device for reconstructing three-dimensional scene | |
CN109190757B (en) | Task processing method, device, equipment and computer readable storage medium | |
CN116266387A (en) | YOLOV4 image recognition algorithm and system based on re-parameterized residual error structure and coordinate attention mechanism | |
CN114820755B (en) | Depth map estimation method and system | |
CN117274514A (en) | Remote sensing image generation method and device based on ground-air visual angle geometric transformation | |
CN116977671A (en) | Target tracking method, device, equipment and storage medium based on image space positioning | |
CN114549429A (en) | Depth data quality evaluation method and device based on hypergraph structure | |
CN115294361A (en) | Feature extraction method and device | |
CN113158970B (en) | Action identification method and system based on fast and slow dual-flow graph convolutional neural network | |
CN113205579B (en) | Three-dimensional reconstruction method, device, equipment and storage medium | |
CN111325343B (en) | Neural network determination, target detection and intelligent driving control method and device | |
CN111669501B (en) | Shooting method and device based on unmanned aerial vehicle, computer equipment and medium | |
CN113642627B (en) | Deep learning-based image and decision multi-source heterogeneous information fusion identification method and device | |
CN114529514A (en) | Depth data quality evaluation method and device based on graph structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||