CN106303366A

CN106303366A - Method and device for video coding based on region classification

Info

Publication number: CN106303366A
Application number: CN201610685073.8A
Authority: CN
Inventors: 程国艮; 王语
Original assignee: Mandarin Technology (beijing) Co Ltd
Current assignee: Mandarin Technology (beijing) Co Ltd
Priority date: 2016-08-18
Filing date: 2016-08-18
Publication date: 2017-01-04
Anticipated expiration: 2036-08-18
Also published as: CN106303366B

Abstract

The invention discloses a method and device for video coding based on region classification, and relates to the technical field of video transmission. It addresses the technical problem of how to transmit video more effectively at a constant bit rate. The technical solution includes: step one, identifying each content region in the video picture; step two, preprocessing each region to reduce image noise.

Description

A method and device for video coding based on region classification

Technical field

The present invention relates to the technical field of video transmission, and in particular to a method and device for video coding based on region classification.

Background art

Under normal circumstances, a video conference host is connected to a high-definition camera that captures the meeting room picture, as shown in Fig. 1, and the picture is then video-coded and transmitted. However, owing to lighting, camera sampling and similar factors, the slide region captured by the camera has more noise than the original slide image, and its colors also change. For example, a solid-color area on a slide is no longer a pure color once it has been captured by the camera. This causes information distortion and lowers the compression ratio of the coded video. How to transmit more effectively at a constant bit rate has therefore become a technical problem that urgently needs to be solved.

Summary of the invention

The technical problem to be solved by the present invention is how to transmit video more effectively at a constant bit rate.

To solve the above problem, the present invention provides a method for video coding based on region classification, comprising:

Step one: identifying each content region in the video picture;

Step two: preprocessing each region separately to reduce image noise.

The present invention also provides a device for video coding based on region classification, comprising:

A recognition unit, which identifies each content region in the video picture;

A preprocessing unit, which preprocesses each region separately to reduce image noise.

The technical solution of the present invention provides a method and device for video coding based on region classification. Different image regions are preprocessed in different ways, which reduces image noise, highlights the content the user is interested in, and improves the user's perceived quality.

Brief description of the drawings

Fig. 1 is a schematic diagram of the connection between a camera and a video conference host in the prior art;

Fig. 2 is a schematic diagram of the connection between the camera and the video conference host in the present invention;

Fig. 3 is a schematic diagram of a method for video coding based on region classification;

Fig. 4 is a schematic flow chart of a method for video coding based on region classification;

Fig. 5 is a schematic diagram of the preprocessing method for reducing spatial resolution;

Fig. 6 is a schematic diagram of a device for video coding based on region classification.

Detailed description of the invention

The technical solution of the present invention is described in detail below with reference to the drawings and embodiments.

It should be noted that, provided they do not conflict, the embodiments of the present invention and the features in those embodiments may be combined with one another, and all such combinations fall within the protection scope of the present invention. In addition, although a logical order is shown in the flow charts, in some cases the steps shown or described may be performed in an order different from the one given here.

Embodiment one: a method for video coding based on region classification, as shown in Fig. 3, comprising:

Step one: identifying each content region in the video picture;

Embodiment two: a method for video coding based on region classification, as shown in Fig. 4, which, on the basis of embodiment one, includes the following:

Further, in step one, the content regions are classified as a combination of one or more of the following: a face region, a computer display region, an active region and an inactive region.

The human eye perceptually focuses on different things in the computer display region, the face region, the active region and the inactive region. The face region receives the most attention. In an active region, the eye attends mainly to motion, whereas in an inactive region it attends more to detail. The four kinds of region are therefore treated differently in the preprocessing stage.

By prior marking or by image analysis, the face region, computer display region, active region and inactive region in the video picture are identified. Before the conventional coding flow, the different image regions are preprocessed in different ways, which reduces image noise, highlights the content the user is interested in, and improves the user's perceived quality.

Further, in step two, each region is preprocessed, except that the face region is not preprocessed.

Face detection is used to detect the face region in the picture, and this region is labeled A. Because the face region receives the most attention, it is not preprocessed.

Further, in step two, each region is preprocessed; for the computer display region, the computer picture is marked on the picture captured by the camera, and then, by affine transformation, the picture captured from the computer replaces the marked computer display region in the picture shot by the camera, as shown in Fig. 2.

With the structure of Fig. 2, the video conference host is connected to both the camera and the presenter's computer, and the original desktop image is captured directly on the presenter's computer through an API. The four corner points of the computer picture are marked on the picture captured by the camera; then, by affine transformation, the picture captured from the computer replaces the marked computer display region in the camera picture. This effectively improves the display quality of the computer display region in the final video conference picture and also effectively improves the compression ratio.

Because the camera in a video conference is usually fixed, the four corner points of computer display region B can be marked in advance. For region B, the real-time picture obtained from the presenter's computer is affine-transformed and overlaid on the frame image. The video conference host is connected directly to the camera and the computer; by obtaining the computer picture in real time and using affine transformation to replace the corresponding content of the camera picture, the picture is enhanced.
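As an illustration only, the replacement of region B might be sketched as follows in Python. The sketch assumes grayscale images stored as lists of rows; the function name `overlay_desktop` and its parameters are hypothetical and do not come from the invention. An affine transformation is fully determined by three non-collinear point pairs, so the sketch uses three of the marked corner points and treats the fourth as implied. If the marked quadrilateral is not a parallelogram, the result is only an approximation.

```python
def overlay_desktop(frame, desktop, top_left, top_right, bottom_left):
    """Paste `desktop` into `frame` over the parallelogram spanned by three marked corners.

    Corner points are (x, y) pixel coordinates in the camera frame.
    Both images are lists of rows of grayscale values (an assumption of this sketch).
    """
    h, w = len(desktop), len(desktop[0])
    x0, y0 = top_left
    ax, ay = top_right[0] - x0, top_right[1] - y0      # desktop's horizontal edge in the frame
    bx, by = bottom_left[0] - x0, bottom_left[1] - y0  # desktop's vertical edge in the frame
    det = ax * by - bx * ay
    if det == 0:
        raise ValueError("marked corners are collinear")

    out = [row[:] for row in frame]
    for y in range(len(frame)):
        for x in range(len(frame[0])):
            dx, dy = x - x0, y - y0
            # Inverse affine map: frame pixel -> normalised desktop coordinates (s, t).
            s = (dx * by - bx * dy) / det
            t = (ax * dy - dx * ay) / det
            if 0 <= s < 1 and 0 <= t < 1:
                # Nearest-neighbour sampling from the directly captured desktop image.
                out[y][x] = desktop[int(t * h)][int(s * w)]
    return out
```

Because the camera is fixed, the three corner points would only need to be marked once and could then be reused for every frame.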

Further, in step two, each region is preprocessed; the active region undergoes preprocessing that reduces spatial resolution.

The frame difference method is used to identify active region C within the area that belongs to neither A nor B.
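A minimal sketch of frame differencing restricted to the area outside A and B is given below. The threshold value, the function name `find_active_pixels` and the boolean-mask representation are assumptions made for illustration; the description does not specify them.

```python
def find_active_pixels(prev_frame, curr_frame, excluded, threshold=15):
    """Return a boolean mask that is True where a pixel belongs to active region C.

    prev_frame, curr_frame: grayscale frames as lists of rows.
    excluded: boolean mask, True where the pixel already belongs to region A or B.
    threshold: hypothetical minimum absolute change that counts as motion.
    """
    mask = []
    for r, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        mask.append([
            (not excluded[r][c]) and abs(cur - prev) > threshold
            for c, (prev, cur) in enumerate(zip(prev_row, curr_row))
        ])
    return mask
```

For example, with `excluded` built as the union of the face mask and the computer display mask, a pixel that changes strongly between frames is marked active only if it lies outside both.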

Further, the preprocessing method for reducing spatial resolution is: the image pixels are divided into cells of M*N pixels, and the pixels in each cell are replaced by the mean of the pixel values in that cell.

The preprocessing method for reducing spatial resolution is as follows:

The image pixels are divided into cells of M*N pixels, typically 2*2. The pixels in each cell are replaced by the mean of the pixel values in that cell, as shown in Fig. 5. This reduces the spatial resolution and improves the video compression ratio.
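The cell-averaging step can be sketched as below. This is an illustrative version under stated assumptions: the image is a grayscale list of rows, M is taken as the cell height and N as the cell width, means are rounded to integers, and cells on the right and bottom borders may be smaller than M*N. None of these details are fixed by the description.

```python
def reduce_spatial_resolution(image, m=2, n=2):
    """Replace every m x n cell of a grayscale image with the mean value of that cell."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, m):
        for left in range(0, width, n):
            # Border cells are clipped to the image size.
            rows = range(top, min(top + m, height))
            cols = range(left, min(left + n, width))
            cell = [image[r][c] for r in rows for c in cols]
            mean = round(sum(cell) / len(cell))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out
```

After this step every cell is a flat block of one value, which is the kind of content a block-based encoder can represent with few bits.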

Further, in step two, each region is preprocessed; the inactive region undergoes preprocessing that reduces temporal resolution.

Inactive region D is identified and marked.

Further, the preprocessing method for reducing temporal resolution is: suppose the value of a given pixel is V, the preprocessed values of that pixel in the preceding n frames are V1, V2, ..., Vn, and their mean is Vm. A threshold t is set. If the absolute value of the difference between V and Vm does not exceed t, the pixel value after preprocessing is Vm; otherwise it is V. This reduces the temporal resolution and improves the video compression ratio.
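The thresholded temporal averaging rule can be sketched as follows. The class and function names are hypothetical. The sketch also makes two assumptions that the description leaves open: a pixel with no history yet is passed through unchanged, and the rule is applied to every pixel of the frame passed in (a caller would restrict it to region D).

```python
from collections import deque


def smooth_pixel(value, previous_values, threshold):
    """Apply the rule to one pixel: return Vm if |V - Vm| <= t, otherwise V."""
    if not previous_values:
        return value  # assumption: no history yet, so the pixel is passed through
    mean = sum(previous_values) / len(previous_values)
    return mean if abs(value - mean) <= threshold else value


class TemporalSmoother:
    """Keeps the last n preprocessed frames and smooths each new frame against them."""

    def __init__(self, n, threshold):
        self.history = deque(maxlen=n)
        self.threshold = threshold

    def process(self, frame):
        out = [
            [
                smooth_pixel(v, [past[r][c] for past in self.history], self.threshold)
                for c, v in enumerate(row)
            ]
            for r, row in enumerate(frame)
        ]
        # The history stores preprocessed values, matching V1..Vn in the text.
        self.history.append(out)
        return out
```

Small fluctuations such as sensor noise are absorbed into the running mean, while a change larger than t passes through unchanged.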

Embodiment three: a device for video coding based on region classification, as shown in Fig. 6, comprising:

A recognition unit, which identifies each content region in the video picture;

Embodiment four: a device for video coding based on region classification, as shown in Fig. 6, which, on the basis of embodiment three, further includes the following:

Further, the recognition unit classifies the content regions as a combination of one or more of the following: a face region, a computer display region, an active region and an inactive region.

Further, the preprocessing unit preprocesses each region, except that the face region is not preprocessed.

Further, the preprocessing unit preprocesses each region; for the computer display region, the computer picture is marked on the picture captured by the camera, and then, by affine transformation, the picture captured from the computer replaces the marked computer display region in the picture shot by the camera, as shown in Fig. 2.

Further, the preprocessing unit preprocesses each region; the active region undergoes preprocessing that reduces spatial resolution. The frame difference method is used to identify active region C within the area that belongs to neither A nor B.

The preprocessing method for reducing spatial resolution is:

Further, the preprocessing unit preprocesses each region; the inactive region undergoes preprocessing that reduces temporal resolution. Inactive region D is identified and marked.

According to the differences in what users focus on, the present invention divides the image in a high-definition video conference into four classes of region: the face region, the computer display region, the active region and the inactive region. Before the conventional coding flow, the different image regions are preprocessed in different ways, which reduces image noise, highlights the content the user is interested in, and improves the user's perceived quality.

A person of ordinary skill in the art will appreciate that all or some of the steps of the above method may be carried out by a program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Alternatively, all or some of the steps of the above embodiments may be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in hardware or as a software functional module. The present invention is not limited to any particular combination of hardware and software.

Of course, the present invention may have various other embodiments. Without departing from the spirit and essence of the present invention, those skilled in the art may make various corresponding changes and variations according to the present invention, but all such changes and variations shall fall within the protection scope of the claims of the present invention.

Claims

1. A method for video coding based on region classification, characterized by comprising:

Step one: identifying each content region in the video picture;

2. The method of claim 1, characterized in that, in step one, the content regions are classified as a combination of one or more of the following: a face region, a computer display region, an active region and an inactive region.

3. The method of claim 2, characterized in that, in step two, each region is preprocessed, except that the face region is not preprocessed.

4. The method of claim 2, characterized in that, in step two, each region is preprocessed; for the computer display region, the computer picture is marked on the picture captured by the camera, and then, by affine transformation, the picture captured from the computer replaces the marked computer display region in the picture shot by the camera.

5. The method of claim 2, characterized in that, in step two, each region is preprocessed; the active region undergoes preprocessing that reduces spatial resolution.

6. The method of claim 5, characterized in that the preprocessing method for reducing spatial resolution is: the image pixels are divided into cells of M*N pixels, and the pixels in each cell are replaced by the mean of the pixel values in that cell.

7. The method of claim 2, characterized in that, in step two, each region is preprocessed; the inactive region undergoes preprocessing that reduces temporal resolution.

8. The method of claim 7, characterized in that the preprocessing method for reducing temporal resolution is: suppose the value of a given pixel is V, the preprocessed values of that pixel in the preceding n frames are V1, V2, ..., Vn, and their mean is Vm; a threshold t is set; if the absolute value of the difference between V and Vm does not exceed t, the pixel value after preprocessing is Vm, and otherwise it is V.

9. A device for video coding based on region classification, characterized by comprising:

A recognition unit, which identifies each content region in the video picture;

10. The device of claim 9, characterized in that the recognition unit classifies the content regions as a combination of one or more of the following: a face region, a computer display region, an active region and an inactive region.

11. The device of claim 10, characterized in that the preprocessing unit preprocesses each region, except that the face region is not preprocessed.

12. The device of claim 10, characterized in that the preprocessing unit preprocesses each region; for the computer display region, the computer picture is marked on the picture captured by the camera, and then, by affine transformation, the picture captured from the computer replaces the marked computer display region in the picture shot by the camera.

13. The device of claim 10, characterized in that the preprocessing unit preprocesses each region; the active region undergoes preprocessing that reduces spatial resolution.

14. The device of claim 13, characterized in that the preprocessing method for reducing spatial resolution is: the image pixels are divided into cells of M*N pixels, and the pixels in each cell are replaced by the mean of the pixel values in that cell.

15. The device of claim 10, characterized in that the preprocessing unit preprocesses each region; the inactive region undergoes preprocessing that reduces temporal resolution.

16. The device of claim 15, characterized in that the preprocessing method for reducing temporal resolution is: suppose the value of a given pixel is V, the preprocessed values of that pixel in the preceding n frames are V1, V2, ..., Vn, and their mean is Vm; a threshold t is set; if the absolute value of the difference between V and Vm does not exceed t, the pixel value after preprocessing is Vm, and otherwise it is V.