CN113810720A

CN113810720A - Image processing method, device, equipment and medium

Info

Publication number: CN113810720A
Application number: CN202110908617.3A
Authority: CN
Inventors: 文湘鄂; 徐辉; 王世超; 东健慧; 张磊; 宋磊; 贾惠柱
Original assignee: Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Current assignee: Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Priority date: 2021-08-09
Filing date: 2021-08-09
Publication date: 2021-12-17

Abstract

The present disclosure relates to an image processing method, apparatus, device, and medium. The method comprises: performing sensitive target detection on an original image to obtain the position of the sensitive target image in the original image; smoothing the area of the original image outside the sensitive target image, whose position has been determined, to obtain a processed image; and inputting the processed image into an encoder for image encoding. Because the present disclosure applies similar smoothing to all regions other than the target of interest, the encoder does not allocate too many bits to those regions (their high-frequency components have been removed), and the residual between the reconstructed image and the new original image is smaller, that is, the bit rate of the new frame is lower. The advantage of the method is that the bit rate can be allocated according to target importance without a complicated rate control method.

Description

Image processing method, device, equipment and medium

Technical Field

The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method, apparatus, device, and medium.

Background

The essence of video coding is to produce video with the highest possible objective or subjective quality under the constraint of a limited number of bits (bit rate); the industry has studied this problem for many years under the name of rate-distortion optimization. However, the objects in a video differ in importance. For example, in traditional video communication, encoding the human face clearly matters more than encoding other objects, and in surveillance video the most important objects are people, motor vehicles, and non-motor vehicles rather than anything else. The conventional idea is to delimit the object of interest by some technical means (inside the encoder or in a pre-processing module), such as object recognition, and to pass this information (such as the coordinates and size of the object) to the encoder, which then tilts the bit rate toward the object of interest by manipulating the QP (quantization parameter) of the coding units. This approach has two disadvantages. First, the coding control algorithm becomes very complex. Second, if regions outside the target of interest that are originally sharp are given a low bit rate, those regions become blurred in the reconstructed frame; since the reconstructed frame actually serves as the reference for subsequent original frames, the residual between a subsequent original frame and the reconstructed frame is rather large, so the coding bit rate becomes high.
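
For illustration only, the conventional QP-based approach described above can be sketched as a per-block QP offset map. The block size, the offset values, the (x, y, w, h) box format, and the helper name qp_offset_map are assumptions made for this sketch; they are not taken from the disclosure or from any particular encoder.

```python
import numpy as np


def qp_offset_map(height, width, boxes, block=16, roi_offset=-4, bg_offset=4):
    """Build a per-block QP offset map (hypothetical helper).

    Blocks that overlap a target box get a negative offset (finer
    quantization, more bits); all other blocks get a positive offset.
    """
    rows = -(-height // block)  # ceiling division
    cols = -(-width // block)
    offsets = np.full((rows, cols), bg_offset, dtype=int)
    for x, y, w, h in boxes:
        r0, r1 = y // block, (y + h - 1) // block
        c0, c1 = x // block, (x + w - 1) // block
        offsets[r0:r1 + 1, c0:c1 + 1] = roi_offset
    return offsets


# One assumed target box at (x=20, y=20) of size 10x10 in a 64x64 frame.
qp_map = qp_offset_map(64, 64, [(20, 20, 10, 10)])
```

An encoder driven in this way needs a rate control loop that is aware of the map, which is the kind of complexity the disclosure seeks to avoid.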
The idea of the present invention is to take the opposite approach: the original encoding mechanism of the encoder is not changed and the QP is not manipulated; instead, the image to be encoded is modified before it is sent to the encoder. The target of interest is extracted by a target recognition method and its pixels are left unprocessed, while the region outside the target of interest is smoothed (the degree of smoothing can be adjusted according to the bit rate and quality requirements). The processed image is then sent to the encoder. After smoothing, only low-frequency information remains in the region outside the target, so the encoder, without needing any special parameters, naturally allocates fewer bits to these regions. Moreover, the residual between the reconstructed image and the new original image seen by the encoder is not very large (because both have undergone similar smoothing in advance). The target is therefore encoded clearly while the bit rate is reduced.

Disclosure of Invention

The method aims to solve the technical problem that the prior art cannot meet the image processing requirements of users.

To achieve the above technical object, the present disclosure provides an image processing method, including:

carrying out sensitive target detection processing on the original image to obtain the position of the sensitive target image in the original image;

smoothing the area except the sensitive target image in the original image with the determined position of the sensitive target image to obtain a processed image;

and inputting the processed image into an encoder for image encoding.

Further, the step of performing the sensitive target detection processing on the original image to obtain the position of the sensitive target image in the original image specifically includes:

and carrying out sensitive target detection processing on the original image through a neural network or other computer vision algorithms to obtain the position of the sensitive target image in the original image.

Further, the neural network includes:

a recursive neural network, a convolutional neural network, and/or a recurrent neural network.

Further, the smoothing process specifically includes:

median filtering, mean filtering, gaussian filtering, and/or bilateral filtering.

To achieve the above technical object, the present disclosure can also provide an image processing apparatus comprising:

the sensitive target detection processing module is used for carrying out sensitive target detection processing on the original image to obtain the position of the sensitive target image in the original image;

the image processing module is used for smoothing the areas except the sensitive target image in the original image with the position of the sensitive target image determined to obtain a processed image;

and the image coding module is used for inputting the processed image into an encoder to carry out image coding.

Further, the sensitive target detection processing module is specifically configured to:

and carrying out sensitive target detection processing on the original image through a neural network to obtain the position of the sensitive target image in the original image.

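Further, the neural network includes:

a recursive neural network, a convolutional neural network, and/or a recurrent neural network.

Further, the smoothing process specifically includes:

median filtering, mean filtering, gaussian filtering, and/or bilateral filtering.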

To achieve the above technical object, the present disclosure can also provide a computer storage medium having stored thereon a computer program for implementing the steps of the image processing method described above when the computer program is executed by a processor.

To achieve the above technical objective, the present disclosure further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the image processing method when executing the computer program.

The beneficial effects of the present disclosure are as follows:

The advantage of the method is that the bit rate can be allocated according to target importance without a complicated rate control method. The traditional approach of manipulating the QP (using a large QP for unimportant targets) requires a complex rate control algorithm and performs poorly in practice. The root cause of the poor result is that, if an unimportant region is originally sharp (i.e., richly textured) and is forcibly coded with a large QP, bits are indeed saved in the current frame (especially in an I frame or I block), but a considerable residual then exists between the subsequent original frame and the current reconstructed frame, so the bit rate of the subsequent frames increases. The present disclosure applies similar smoothing to all regions other than the target of interest, so the encoder does not allocate too many bits to these regions (because their high-frequency components have been removed), and the residual between the reconstructed image and the new original image is smaller, that is, the bit rate of the new frame is lower.
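
The residual argument above can be illustrated with a toy measurement. The sketch below compares the mean squared difference between two consecutive frames with and without pre-smoothing. No encoder is involved: the plain frame difference is only a rough proxy for a prediction residual, and the random texture, the one-pixel shift, and the box_blur helper are assumptions of this example.

```python
import numpy as np


def box_blur(image, ksize=5):
    """Mean filter with replicated borders (stand-in smoothing step)."""
    pad = ksize // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(ksize):
        for dx in range(ksize):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (ksize * ksize)


rng = np.random.default_rng(0)
# Richly textured background (random texture is an assumption of this toy).
frame_a = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
# The next frame: the same texture moved one pixel to the right.
frame_b = np.roll(frame_a, 1, axis=1)

# Mean squared frame difference without and with pre-smoothing.
residual_raw = np.mean((frame_b - frame_a) ** 2)
residual_smoothed = np.mean((box_blur(frame_b) - box_blur(frame_a)) ** 2)
```

In this toy setting residual_smoothed comes out smaller than residual_raw, which is consistent with the reasoning above; how much a real encoder gains depends on its prediction and rate control.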

Drawings

Fig. 1 shows a flow diagram of the method of embodiment 1 of the present disclosure;

Fig. 2 shows a schematic structural diagram of the apparatus of embodiment 2 of the present disclosure;

Fig. 3 shows a schematic structural diagram of the electronic device of embodiment 4 of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.

Various structural schematics according to embodiments of the present disclosure are shown in the figures. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers, and relative sizes and positional relationships therebetween shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, as actually required.

Example one:

As shown in Fig. 1, the present disclosure provides an image processing method, including:

S101: carrying out sensitive target detection processing on the original image to obtain the position of the sensitive target image in the original image;

S102: smoothing the area except the sensitive target image in the original image with the determined position of the sensitive target image to obtain a processed image;

S103: inputting the processed image into an encoder for image encoding.
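
Steps S101 to S103 can be sketched as follows. This is a minimal illustration rather than the implementation of the disclosure: detect and encode are hypothetical callables supplied by the caller, targets are assumed to be axis-aligned boxes in (x, y, w, h) form on a single-channel image, and a simple box blur stands in for whichever smoothing method is chosen.

```python
import numpy as np


def box_blur(image, ksize=5):
    """Mean filter with replicated borders (stand-in smoothing step)."""
    pad = ksize // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(ksize):
        for dx in range(ksize):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (ksize * ksize)


def smooth_outside_targets(image, boxes, ksize=5):
    """S102: keep pixels inside the target boxes, smooth everything else."""
    keep = np.zeros(image.shape, dtype=bool)
    for x, y, w, h in boxes:
        keep[y:y + h, x:x + w] = True
    return np.where(keep, image.astype(np.float64), box_blur(image, ksize))


def process_frame(image, detect, encode, ksize=5):
    boxes = detect(image)                                    # S101
    processed = smooth_outside_targets(image, boxes, ksize)  # S102
    return encode(processed)                                 # S103


def fake_detect(image):
    """Hypothetical detector: always reports one box (x, y, w, h)."""
    return [(8, 8, 10, 12)]


def passthrough_encode(image):
    """Placeholder for a real encoder: returns its input unchanged."""
    return image


rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
result = process_frame(frame, fake_detect, passthrough_encode)
```

Because the encoder is passed in as an ordinary callable, nothing about its internal rate control has to change, which mirrors the point that no special encoder parameters are required.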

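Further, the neural network includes:

a recursive neural network, a convolutional neural network, and/or a recurrent neural network.

Further, the smoothing process specifically includes:

median filtering, mean filtering, gaussian filtering, and/or bilateral filtering.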

Mean value filtering (Simple Blurring)

Mean filtering is a typical linear filtering algorithm. A template is applied to the target pixel of the image; the template consists of the pixels neighboring the target pixel (the 8 pixels surrounding the target pixel form the filtering template, i.e., the target pixel itself is excluded), and the original pixel value is replaced with the average of all pixels in the template.

Mean filtering is very sensitive to noise in the image, especially to large isolated points: even a very small number of points that differ greatly from their surroundings can cause significant fluctuations in the average.
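
A minimal NumPy sketch of mean filtering on a single-channel image is given below. It uses the common variant that averages the full ksize x ksize window including the center pixel, whereas the description above speaks of a template made of the 8 neighbors; the function name, the replicated-border handling, and the test image are assumptions of this example.

```python
import numpy as np


def mean_filter(image, ksize=3):
    """Replace each pixel with the mean of its ksize x ksize neighborhood."""
    pad = ksize // 2
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the ksize * ksize shifted copies of the image, then divide.
    for dy in range(ksize):
        for dx in range(ksize):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (ksize * ksize)


# A flat image with one large isolated point.
img = np.zeros((5, 5))
img[2, 2] = 90.0
smoothed = mean_filter(img)
```

The isolated point is not removed; it is spread over its 3 x 3 neighborhood, which illustrates the sensitivity to isolated points mentioned above.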

Median filtering (Median Blurring)

Median filtering is a non-linear smoothing technique that sets the gray value of each pixel to the median of the gray values of all pixels in a neighborhood window around that pixel, i.e., the value of the central pixel is replaced by the median (not the average) of all pixel values in the window.

By selecting the median, median filtering avoids the influence of isolated noise points in the image. It filters impulse noise well and, in particular, can protect signal edges from being blurred while filtering out the noise. Linear filtering methods do not have these properties. In addition, the median filtering algorithm is simple and easy to implement in hardware. For these reasons, median filtering has been applied in digital signal processing ever since it was proposed.
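
Median filtering can be sketched in NumPy as follows, assuming a single-channel image and replicated borders; the function name and the two test images are illustrative only.

```python
import numpy as np


def median_filter(image, ksize=3):
    """Replace each pixel with the median of its ksize x ksize neighborhood."""
    pad = ksize // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    # Stack every shifted copy of the image, then take the per-pixel median.
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w]
         for dy in range(ksize) for dx in range(ksize)],
        axis=0,
    )
    return np.median(windows, axis=0)


# Impulse noise: one isolated outlier on a flat background.
noisy = np.full((5, 5), 10.0)
noisy[2, 2] = 255.0
denoised = median_filter(noisy)

# A vertical step edge.
step = np.zeros((5, 6))
step[:, 3:] = 100.0
filtered_step = median_filter(step)
```

In this example the isolated outlier is removed entirely and the step edge passes through unchanged, matching the two properties described above.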

Gaussian filtering (Gaussian Blurring)

Gaussian filtering is a linear smoothing filter that is suitable for eliminating Gaussian noise and is widely used for noise reduction in image processing. Put simply, Gaussian filtering performs a weighted average over the whole image: the value of each pixel is obtained as a weighted average of its own value and the values of the other pixels in its neighborhood. The specific operation is as follows: each pixel in the image is scanned with a template (also called a convolution kernel or mask), and the value of the pixel at the center of the template is replaced with the weighted average gray value of the pixels in the neighborhood determined by the template.

The rationale for Gaussian filtering is that pixel values in a real image vary slowly in space, so neighboring points do not differ much, whereas two random (noise) points may differ greatly. On this basis, Gaussian filtering reduces noise while preserving the signal. Unfortunately, this assumption fails near edges, so Gaussian filtering blurs edges. Nevertheless, the Gaussian smoothing filter is still very effective at suppressing noise that follows a normal distribution.
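
The template-based weighted average can be sketched as below. The kernel size, the sigma value, the replicated borders, and the function names are assumptions of this example, and the image is assumed to be single-channel.

```python
import numpy as np


def gaussian_kernel(ksize=5, sigma=1.0):
    """Normalized 2-D Gaussian template (weights sum to 1)."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    k1d = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k2d = np.outer(k1d, k1d)
    return k2d / k2d.sum()


def gaussian_filter(image, ksize=5, sigma=1.0):
    """Replace each pixel with the Gaussian-weighted mean of its neighborhood."""
    kernel = gaussian_kernel(ksize, sigma)
    pad = ksize // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate each shifted copy of the image scaled by its template weight.
    for dy in range(ksize):
        for dx in range(ksize):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


flat = np.full((6, 6), 42.0)
flat_out = gaussian_filter(flat)

# A vertical step edge: the filter spreads the transition over several pixels.
step = np.zeros((5, 8))
step[:, 4:] = 100.0
blurred = gaussian_filter(step)
```

A constant image is left unchanged because the weights sum to 1, while the pixels on either side of the step take intermediate values, which is the edge blurring noted above.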

Bilateral filtering (Bilateral Blurring)

Bilateral filtering (bilateral filter) is a non-linear filtering method that strikes a compromise between the spatial proximity and the pixel-value similarity of an image: it considers spatial information and gray-level similarity at the same time in order to remove noise while preserving edges. It is simple, non-iterative, and local. Bilateral filtering provides a way to smooth without smoothing out edges, at the cost of more processing time.

Similar to Gaussian filtering, bilateral filtering constructs a weighted average from each pixel and its neighborhood. The weight has two parts. The first part is the same as the weight in Gaussian smoothing. The second part is also a Gaussian weight, but it is based on the brightness difference between the other pixels and the central pixel rather than on the spatial distance between them. Bilateral filtering can thus be regarded as Gaussian smoothing in which similar pixels are given higher weights and dissimilar pixels lower weights; it can also be used for image segmentation.

The advantage of the bilateral filter is that it preserves edges. Wiener filtering or Gaussian filtering, which were generally used for denoising in the past, blur edges noticeably and do little to protect high-frequency details. As its name suggests, the bilateral filter has one more Gaussian variance than the Gaussian filter. Its sigma-d term is a Gaussian filter function based on spatial distribution, so near an edge, pixels far from the edge do not have much influence on the pixel values on the edge, which ensures that pixel values near the edge are preserved. However, because too much high-frequency information is preserved, the bilateral filter cannot cleanly filter out high-frequency noise in a color image and can only filter low-frequency information well.
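
The two-part weighting can be sketched as follows for a single-channel image. The parameter name sigma_d follows the text (spatial weight); sigma_r for the brightness-difference weight, the default values, and the function name are assumptions of this example.

```python
import numpy as np


def bilateral_filter(image, ksize=5, sigma_d=2.0, sigma_r=10.0):
    """Weighted mean with a spatial weight and a brightness-difference weight."""
    pad = ksize // 2
    img = image.astype(np.float64)
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    num = np.zeros((h, w), dtype=np.float64)
    den = np.zeros((h, w), dtype=np.float64)
    for dy in range(ksize):
        for dx in range(ksize):
            neighbor = padded[dy:dy + h, dx:dx + w]
            # Part 1: Gaussian weight on the spatial distance to the center.
            dist2 = (dy - pad) ** 2 + (dx - pad) ** 2
            spatial_w = np.exp(-dist2 / (2.0 * sigma_d ** 2))
            # Part 2: Gaussian weight on the brightness difference.
            range_w = np.exp(-((neighbor - img) ** 2) / (2.0 * sigma_r ** 2))
            weight = spatial_w * range_w
            num += weight * neighbor
            den += weight
    # The center pixel always contributes weight 1, so den is never zero.
    return num / den


# A strong vertical step edge.
step = np.zeros((5, 8))
step[:, 4:] = 200.0
preserved = bilateral_filter(step)
```

With a brightness difference of 200 and sigma_r of 10, pixels across the edge receive a negligible weight, so the step is reproduced almost exactly instead of being blurred. The explicit double loop over the window also shows why this filter costs more processing time than a plain Gaussian filter.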

Example two:

As shown in Fig. 2, the present disclosure can also provide an image processing apparatus including:

a sensitive target detection processing module 201, configured to perform sensitive target detection processing on the original image to obtain a position of the sensitive target image in the original image;

the image processing module 202 is configured to perform smoothing processing on an area, except for the sensitive target image, in the original image where the position of the sensitive target image is determined to obtain a processed image;

and the image coding module 203 is configured to input the processed image into an encoder for image coding.

The sensitive target detection processing module 201 is connected in sequence to the image processing module 202 and the image coding module 203.

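Further, the sensitive target detection processing module 201 is specifically configured to:

carry out sensitive target detection processing on the original image through a neural network to obtain the position of the sensitive target image in the original image.

Further, the neural network includes:

a recursive neural network, a convolutional neural network, and/or a recurrent neural network.

Further, the smoothing process specifically includes:

median filtering, mean filtering, gaussian filtering, and/or bilateral filtering.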

Example three:

the present disclosure can also provide a computer storage medium having stored thereon a computer program for implementing the steps of the image processing method described above when executed by a processor.

The computer storage medium of the present disclosure may be implemented with a semiconductor memory, a magnetic core memory, a magnetic drum memory, or a magnetic disk memory.

Semiconductor memories are mainly used as the semiconductor memory elements of computers, and there are two types: MOS and bipolar memory elements. MOS elements have high integration and a simple process but are slow. Bipolar elements have a complex process, high power consumption, and low integration but are fast. The introduction of NMOS and CMOS made MOS memory dominant among semiconductor memories. NMOS is fast; for example, the access time of a 1K-bit SRAM from Intel is 45 ns. CMOS has low power consumption; the access time of a 4K-bit CMOS static memory is 300 ns. The semiconductor memories described above are all random access memories (RAMs), i.e., new contents can be read and written randomly during operation. A semiconductor read-only memory (ROM), by contrast, can be read randomly but cannot be written during operation, and is used to store fixed programs and data. ROMs are classified into non-rewritable fuse-type ROMs (PROMs) and rewritable EPROMs.

Magnetic core memory is low in cost and high in reliability and has more than 20 years of practical use. It was widely used as main memory before the mid-1970s. The storage capacity can reach more than 10 bits, and the fastest access time is 300 ns. A typical international magnetic core memory has a capacity of 4 MS-8 MB and an access cycle of 1.0-1.5 μs. After semiconductor memory developed rapidly and replaced magnetic core memory as main memory, magnetic core memory could still be used as large-capacity expansion memory.

Drum memory is an external memory based on magnetic recording. Because of its fast information access and stable, reliable operation, it is still used as external memory for real-time process control computers and for medium and large computers, although it is being replaced by disk memory. To meet the needs of small and micro computers, subminiature magnetic drums have emerged, which are small, lightweight, highly reliable, and convenient to use.

Magnetic disk memory is an external memory based on magnetic recording. It combines the advantages of drum and tape storage: its storage capacity is larger than that of a drum, its access speed is faster than that of tape, and it can be stored off-line. Magnetic disks are therefore widely used as large-capacity external storage in various computer systems. Magnetic disk memories are generally divided into two main categories, hard disk and floppy disk memories.

Hard disk memories come in a wide variety. By structure they are divided into replaceable and fixed types: the disk of a replaceable type can be exchanged, while the disk of a fixed type cannot. Both types come in multi-disk and single-disk configurations and are further divided into fixed-head and movable-head types. Fixed-head disks have small capacity, low recording density, high access speed, and high cost. Movable-head disks have a high recording density (up to 1000 to 6250 bits/inch) and therefore large capacity, but lower access speed than fixed-head disks. A magnetic disk product can reach a storage capacity of several hundred megabytes with a bit density of 6250 bits per inch and a track density of 475 tracks per inch. Because the disk pack of a multi-disk replaceable disk memory can be exchanged, it offers large off-line capacity as well as large capacity and high speed, can store large volumes of information, and is widely used in online information retrieval systems and database management systems.

Example four:

the present disclosure also provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the image processing method are implemented.

Fig. 3 is a schematic diagram of the internal structure of the electronic device in one embodiment. As shown in Fig. 3, the electronic device includes a processor, a storage medium, a memory, and a network interface connected through a system bus. The storage medium of the electronic device stores an operating system, a database, and computer-readable instructions; the database can store control information sequences, and the computer-readable instructions, when executed by the processor, cause the processor to implement an image processing method. The processor of the electronic device provides computing and control capabilities and supports the operation of the entire device. The memory of the electronic device may store computer-readable instructions that, when executed by the processor, cause the processor to perform an image processing method. The network interface of the electronic device is used to connect to and communicate with a terminal. Those skilled in the art will appreciate that the structure shown in Fig. 3 is merely a block diagram of part of the structure related to the disclosed solution and does not limit the devices to which the disclosed solution applies; a particular device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.

The electronic device includes, but is not limited to, a smart phone, a computer, a tablet, a wearable smart device, an artificial intelligence device, a mobile power source, and the like.

The processor may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor is a Control Unit of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device by running or executing programs or modules (for example, executing remote data reading and writing programs, etc.) stored in the memory and calling data stored in the memory.

The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connected communication between the memory and at least one processor or the like.

Fig. 3 shows only an electronic device having certain components; those skilled in the art will appreciate that the structure shown in Fig. 3 does not limit the electronic device, which may include fewer or more components than shown, combine some components, or arrange the components differently.

For example, although not shown, the electronic device may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.

Further, the electronic device may further include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is generally used to establish a communication connection between the electronic device and other electronic devices.

Optionally, the electronic device may further comprise a user interface, which may be a display or an input unit (such as a keyboard), and optionally a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the electronic device and for displaying a visualized user interface.

Further, the computer usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the electronic device, and the like.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims

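1. An image processing method, comprising:

carrying out sensitive target detection processing on the original image to obtain the position of the sensitive target image in the original image;

smoothing the area except the sensitive target image in the original image with the determined position of the sensitive target image to obtain a processed image;

and inputting the processed image into an encoder for image encoding.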

2. The method according to claim 1, wherein the obtaining of the position of the sensitive target image in the original image by performing the sensitive target detection processing on the original image specifically comprises:

and carrying out sensitive target detection processing on the original image through a neural network or other computer vision algorithms to obtain the position of the sensitive target image in the original image.

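3. The method of claim 2, wherein the neural network comprises:

a recursive neural network, a convolutional neural network, and/or a recurrent neural network.

4. The method according to claim 1, wherein the smoothing process specifically comprises:

median filtering, mean filtering, gaussian filtering, and/or bilateral filtering.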

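5. An image processing apparatus characterized by comprising:

the sensitive target detection processing module is used for carrying out sensitive target detection processing on the original image to obtain the position of the sensitive target image in the original image;

the image processing module is used for smoothing the areas except the sensitive target image in the original image with the position of the sensitive target image determined to obtain a processed image;

and the image coding module is used for inputting the processed image into an encoder to carry out image coding.

6. The apparatus according to claim 5, wherein the sensitive target detection processing module is specifically configured to:

carry out sensitive target detection processing on the original image through a neural network to obtain the position of the sensitive target image in the original image.

7. The apparatus of claim 6, wherein the neural network comprises:

a recursive neural network, a convolutional neural network, and/or a recurrent neural network.

8. The apparatus according to claim 5, wherein the smoothing process specifically comprises:

median filtering, mean filtering, gaussian filtering, and/or bilateral filtering.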

9. An electronic device comprising a memory, a processor and a computer program stored in the memory and operable on the processor, wherein the processor implements the steps corresponding to the image processing method according to any one of claims 1 to 4 when executing the computer program.

10. A computer storage medium having computer program instructions stored thereon, wherein the program instructions, when executed by a processor, are adapted to carry out the steps corresponding to the image processing method as claimed in any one of claims 1 to 4.