WO2017203555A1

WO2017203555A1 - Encoding device, photographing device, and program

Info

Publication number: WO2017203555A1
Application number: PCT/JP2016/065115
Authority: WO
Inventors: 尚宏木皿; 龍博石橋; 達也橋本; 勝大草野; 隆宏平松
Original assignee: 三菱電機ビルテクノサービス株式会社; 三菱電機株式会社
Priority date: 2016-05-23
Filing date: 2016-05-23
Publication date: 2017-11-30

Abstract

Even when a differential picture generated by encoding is missing, the present invention enables differential pictures subsequent to the missing differential picture to be decoded. An encoding device 4 is provided with a GOP formation unit 41 for forming a picture group from a moving image generated by photographing by a photographing device, a picture generation unit 42 for extracting the beginning picture of the picture group as an I picture and generating P pictures from the differences between the I picture and each picture other than the I picture, a motion detection unit 43 for detecting the motion of each picture from the difference from the immediately preceding picture, and an importance setting unit 44 for setting the highest importance to the I picture and setting importance to the P pictures in decreasing order of detected magnitude of motion. The photographing device having the encoding device 4 therein transmits the encoded pictures in descending order of importance.

Description

Encoding device, photographing device, and program

The present invention relates to an encoding device, an imaging device, and a program, and more particularly to encoding of a moving image generated by imaging using an imaging device such as a surveillance camera.

A moving image is composed of a set of still images (hereinafter referred to as “pictures”). The pictures are not stored or transmitted as they are; instead, they are compressed by encoding to reduce the required storage capacity and the amount of data communicated. Encoding turns a moving image into I pictures and P pictures. An I picture has no dependency on other pictures and can be decoded independently to generate an image. A P picture is formed by extracting only the moving parts, so an image cannot be generated by decoding it alone. A conventional encoding apparatus forms I pictures and P pictures as follows.

First, a series of moving images is divided into a plurality of groups. For example, a group is formed every 30 pictures. This formed group is called a picture group or GOP (Group Of Pictures).

The encoding device sets the first picture of the formed picture group as an I picture. Subsequently, the encoding device generates a P picture for each remaining picture by extracting the difference between that picture and the picture immediately before it. For example, the second P picture in the picture group is generated from the difference between the second picture and the previous picture, that is, the first picture. The third P picture is generated from the difference between the third picture and the previous picture, that is, the second picture. In this way, each P picture is formed from the difference from the immediately preceding picture.

Thus, a picture group of n pictures consists of one uncompressed I picture and n−1 compressed P pictures, which reduces the amount of data.

By the way, the video (moving image) shot and generated by the surveillance camera or the like is compressed as described above by the encoding device built in the surveillance camera, transmitted to the PC or the like, and reproduced.

When decoding a moving image generated by encoding, a process reverse to the above may be performed. For example, the second picture in the picture group is decoded by adding the corresponding P picture to the previous picture, that is, the first picture. The third picture is decoded by adding the corresponding P picture to the previous picture, that is, the decoded second picture. In this way, by decoding each P picture, the moving image can be reproduced at the transmission destination.
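The conventional encoding and decoding described above can be sketched as follows. This is an illustrative sketch only: a picture is modeled as a flat list of integer pixel values and a P picture as the per-pixel difference, which is a simplifying assumption not stated in this description, and the function names `encode_chained` and `decode_chained` are hypothetical.

```python
def encode_chained(pictures):
    """Conventional scheme: the first picture is the I picture, and each
    P picture is the per-pixel difference from the immediately preceding picture."""
    i_picture = list(pictures[0])
    p_pictures = [
        [cur - prev for cur, prev in zip(pictures[n], pictures[n - 1])]
        for n in range(1, len(pictures))
    ]
    return i_picture, p_pictures


def decode_chained(i_picture, p_pictures):
    """Reverse process: add each P picture to the previously decoded picture."""
    decoded = [list(i_picture)]
    for diff in p_pictures:
        decoded.append([prev + d for prev, d in zip(decoded[-1], diff)])
    return decoded


group = [[10, 10, 10], [10, 12, 10], [11, 12, 10]]
i_pic, p_pics = encode_chained(group)
restored = decode_chained(i_pic, p_pics)
```

In this sketch each decoded picture is built on the one before it, so the decoder cannot proceed past a P picture that is unavailable.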

International Publication No. 2011/086952; JP-A-10-136383; JP 2012-118881 A; Japanese Patent Laid-Open No. 10-210046; JP 2007-329707 A; JP 2003-134077 A; Japanese Patent Publication

A P picture generated by a conventional encoding device is a difference picture generated by a difference from the immediately preceding picture. For example, the third P picture cannot be decoded without the second picture. Also, the second P picture cannot be decoded without the first picture. That is, the third P picture cannot be decoded without the first and second pictures.

Thus, conventionally, the nth (n = 2 to m) picture in a picture group formed of m still images cannot be decoded without the 1st to (n−1)th pictures. If the kth (k = 2 to m) P picture cannot be transmitted normally, for example because of an increase in network load, the kth to mth pictures cannot be decoded.

An object of the present invention is to enable decoding of a difference picture subsequent to a missing difference picture even if the difference picture generated by encoding is missing.

The encoding apparatus according to the present invention comprises picture generating means that extracts any one picture included in a picture group formed from a moving image as a reference picture that can be decoded independently, and generates each difference picture from the difference between the reference picture and each picture other than the reference picture included in the picture group.

The encoding apparatus further comprises motion detection means that detects the motion of each difference picture based on the difference between the picture corresponding to the difference picture and the picture immediately before it, and importance setting means that sets an importance for each difference picture in descending order of the motion detected by the motion detection means.

A photographing apparatus according to the present invention comprises photographing means, and encoding means having a picture generation unit that extracts any one picture included in a picture group formed from a moving image generated by photographing by the photographing means as a reference picture that can be decoded independently, and generates each difference picture from the difference between the reference picture and each picture other than the reference picture included in the picture group.

The photographing apparatus further comprises transmission means. The encoding means further includes a motion detection unit that detects the motion of each difference picture based on the difference between the picture corresponding to the difference picture and the picture immediately before it, and an importance setting unit that sets an importance for each difference picture in descending order of the motion detected by the motion detection unit. The transmission means transmits the difference pictures following the reference picture in the order of importance set by the importance setting unit.

The program according to the present invention causes a computer to function as picture generating means that extracts any one picture included in a picture group formed from a moving image as a reference picture that can be decoded independently, and generates each difference picture from the difference between the reference picture and each picture other than the reference picture included in the picture group.

According to the present invention, even if a difference picture generated by encoding is lost, a difference picture subsequent to the lost difference picture can be decoded.

Also, by setting the importance in descending order of detected motion, it is possible to transmit differential pictures in descending order of detected motion.

FIG. 1 is a block diagram showing an embodiment of a photographing apparatus according to the present invention. FIG. 2 is a block diagram showing an embodiment of the encoding apparatus according to the present invention. FIG. 3 is a flowchart showing the encoding process in this embodiment. FIG. 4 is a conceptual diagram showing the P picture generation method in this embodiment. FIG. 5 is a block diagram of the information terminal device in this embodiment.

Hereinafter, preferred embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of a photographing apparatus according to the present invention. The photographing apparatus 1 in the present embodiment is an apparatus, such as a surveillance camera, that generates a moving image by photographing. The photographing apparatus 1 includes a photographing unit 2 that captures video, a moving image storage unit 3 that stores video data (moving images), an encoding device 4 that encodes moving images, and a transmission unit 5 that transmits the encoded moving images to a device, such as a PC, that can reproduce or store them. Note that components not used in the description of the present embodiment are omitted from FIG. 1. Although the details will be described later, the basic configuration of the photographing apparatus 1 in the present embodiment may be the same as the conventional one; what differs is the encoding method in the encoding device 4 and the transmission method in the transmission unit 5, each of which is realized by hardware or software.

FIG. 2 is a block diagram showing an embodiment of the encoding apparatus according to the present invention. The encoding device 4 in the present embodiment includes a GOP forming unit 41, a picture generating unit 42, a motion detecting unit 43, and an importance setting unit 44. Note that components not used in the description of the present embodiment are omitted from FIG. 2. The GOP forming unit 41 forms a picture group from a moving image. The picture generation unit 42 extracts any one picture included in the picture group as an I picture that can be decoded independently, and generates each P picture from the difference between the I picture and each picture other than the I picture included in the picture group. In the present embodiment, the first picture included in the picture group is extracted as the I picture. The motion detector 43 detects the motion of each P picture based on the difference between the picture corresponding to the P picture and the picture immediately before it. The importance level setting unit 44 sets the importance level for each P picture in descending order of the motion detected by the motion detection unit 43.

The photographing apparatus 1 has a built-in computer that includes a CPU, storage means such as a ROM and a RAM, and a network interface. The components 41 to 44 of the encoding device 4 are realized by the cooperative operation of this computer and a program running on its CPU. Alternatively, they may be realized by hardware.

Further, the program used in this embodiment can be provided not only by communication means but also by storing it in a computer-readable recording medium such as a USB memory. The program provided from the communication means or the recording medium is installed in the computer, and various processes are realized by the CPU of the computer sequentially executing the program.

Incidentally, a moving image is a set of a plurality of continuous still images. This one still image is generally called a “picture” or “frame”, but in this embodiment, a “picture” is used. In the moving image, a picture group is formed by a plurality of pictures (for example, 30 pictures). A picture group is also called GOP (Group Of Pictures) and is a unit for compression, playback, and editing of pictures.

Each picture included in the picture group is encoded into an I picture or a P picture. The I picture is a reference picture that has no dependency with other pictures and can be decoded independently to generate an image. A P picture is a differential picture formed by extracting only a moving part, and an image cannot be generated even if it is decoded alone. The present embodiment is characterized by this P picture generation method.

Next, the encoding process according to this embodiment will be described with reference to the flowchart shown in FIG. 3.

When the GOP forming unit 41 acquires a moving image by reading it out from the moving image storage unit 3 (step 101), it forms a picture group for each predetermined number of pictures (step 102). In this embodiment, a case where a picture group is formed every 30 pictures will be described as an example.
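Steps 101 and 102 can be illustrated with a minimal sketch. The function name `form_picture_groups` is hypothetical, and the handling of a final group with fewer than 30 pictures is an assumption, since the description does not address it.

```python
def form_picture_groups(pictures, group_size=30):
    """Split a sequence of pictures into consecutive picture groups (GOPs).

    Assumption: if the number of pictures is not a multiple of group_size,
    the last group is simply shorter.
    """
    return [
        pictures[start:start + group_size]
        for start in range(0, len(pictures), group_size)
    ]


# 70 placeholder pictures yield groups of 30, 30, and 10 pictures.
groups = form_picture_groups(list(range(70)))
group_lengths = [len(g) for g in groups]
```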

The picture generation unit 42 performs the following processing for each picture group. That is, when the picture generation unit 42 acquires one picture group from the GOP formation unit 41, it extracts the first picture included in the picture group as an I picture (step 103). Then, the picture generation unit 42 extracts the difference between the I picture and each of the other pictures, that is, the nth (n = 2 to 30) picture, and generates the nth P picture from that difference (step 104). For example, for the fifth picture, the picture generation unit 42 extracts the difference between the I picture and the fifth picture, and generates the fifth P picture from this difference. In this embodiment, the nth P picture is thus generated from the difference between the I picture and the nth picture. FIG. 4 shows the concept of this encoding method, in which each P picture is generated from the difference from the I picture.

The P picture generated from the second picture is, strictly speaking, the first of the P pictures, but because it is generated from the second picture, it is referred to as the second P picture for convenience of explanation. That is, the nth (n = 2 to 30) P picture is the one generated from the corresponding nth picture.

Conventionally, the nth (n = 2 to 30) P picture is generated from the difference between the immediately preceding (n−1)th picture and the nth picture. For example, the fifth P picture is generated from the difference between the immediately preceding fourth picture and the fifth picture, and the eighth P picture is generated from the difference between the immediately preceding seventh picture and the eighth picture. On the other hand, the present embodiment is characterized in that the nth (n = 2 to 30) P picture is generated from the difference between the I picture and the nth picture as described above. For example, the fifth P picture is generated from the difference between the I picture (first picture) and the fifth picture, and the eighth P picture is generated from the difference between the I picture (first picture) and the eighth picture.
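The P picture generation of steps 103 and 104 can be sketched as follows. As a simplifying assumption, a picture is modeled as a flat list of integer pixel values and a P picture as the per-pixel difference from the I picture; the function name `encode_i_referenced` is hypothetical.

```python
def encode_i_referenced(group):
    """Return (I picture, {n: nth P picture}) for one picture group.

    Picture numbers are 1-based, so P pictures are numbered 2..len(group).
    Every P picture is the difference from the I picture (the first picture),
    not from the immediately preceding picture.
    """
    i_picture = list(group[0])
    p_pictures = {
        n: [cur - ref for cur, ref in zip(group[n - 1], i_picture)]
        for n in range(2, len(group) + 1)
    }
    return i_picture, p_pictures


group = [[10, 10, 10], [10, 12, 10], [11, 12, 10]]
i_pic, p_pics = encode_i_referenced(group)
# p_pics[3] is picture 3 minus the I picture, so it also carries the change
# that first appeared in picture 2.
```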

The motion detection unit 43 acquires the same picture group as the picture generation unit 42 and, for the second and subsequent pictures included in the picture group, extracts the difference from the immediately preceding picture, thereby detecting the motion in each picture (step 105). The motion may be quantified, for example, as the total number of pixels whose values have changed relative to the previous picture. The same motion detection method as in conventional devices may be used.
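The pixel-count measure mentioned for step 105 can be sketched as follows. Pictures are again modeled as flat lists of integer pixel values (an assumption), and `detect_motion` is a hypothetical name.

```python
def detect_motion(group):
    """Return {n: motion value} for pictures 2..len(group) (1-based numbering).

    The motion value of picture n is the number of pixels whose value differs
    from the immediately preceding picture n-1.
    """
    return {
        n: sum(1 for cur, prev in zip(group[n - 1], group[n - 2]) if cur != prev)
        for n in range(2, len(group) + 1)
    }


group = [
    [0, 0, 0, 0],  # picture 1 (I picture)
    [0, 0, 0, 0],  # picture 2: nothing changed
    [5, 5, 0, 0],  # picture 3: two pixels changed
    [5, 5, 0, 1],  # picture 4: one pixel changed
]
motion = detect_motion(group)
```

Note that motion is measured against the preceding picture, while the P picture itself is generated against the I picture.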

Note that the processing in the picture generation unit 42 (steps 103 and 104) and the processing in the motion detection unit 43 (step 105) can be performed independently, so either may be executed first, or they may be executed in parallel.

Subsequently, the importance level setting unit 44 sets the importance level for each picture generated by the encoding by the picture generation unit 42 (step 106). The importance level setting unit 44 first sets the highest importance level, 1, for the I picture. It then refers to the value indicating the motion detected by the motion detection unit 43 (in the above example, the total number of pixels whose values have changed) and sets the second and subsequent importance levels for the P pictures in descending order of that value.

In the example of the pictures shown in FIG. 4, pictures 3 and 6 are given relatively high importance because large motion is detected in them as compared with the I picture. On the other hand, pictures 2, 4, and 5 are given relatively low importance because no motion is detected in them as compared with the I picture. If pictures have the same motion value, a predetermined rule may be applied, such as giving higher importance to the picture closer to the head of the picture group.
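The importance setting of step 106, including the tie-breaking rule, can be sketched as follows. The function name `set_importance` and the motion values in the example are hypothetical; the values are chosen only so that pictures 3 and 6 rank highest, as in the example above.

```python
def set_importance(motion):
    """motion: {picture number n: motion value} for the P pictures.

    Returns {picture number: importance}, where 1 is the highest importance
    and is reserved for the I picture (picture 1).
    """
    importance = {1: 1}
    # Larger motion first; ties go to the picture closer to the head.
    ranked = sorted(motion, key=lambda n: (-motion[n], n))
    for rank, n in enumerate(ranked, start=2):
        importance[n] = rank
    return importance


# Hypothetical motion values for pictures 2 to 6.
motion = {2: 0, 3: 40, 4: 0, 5: 0, 6: 25}
importance = set_importance(motion)
```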

Once the encoding device 4 has encoded a moving image as described above, the transmission unit 5 transmits the encoded moving image to an information terminal device that can reproduce or store it. The transmission unit 5 may transmit the pictures in the order in which they are arranged, or it may refer to the importance set by the encoding device 4 and transmit them in descending order of importance.
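The two transmission orders can be illustrated with a short sketch. The function name `transmission_order` and the example importance values are hypothetical.

```python
def transmission_order(importance, by_importance=True):
    """importance: {picture number: importance}, where 1 is the highest.

    Returns the picture numbers in the order they would be sent: by importance,
    or in the original arrangement order when by_importance is False.
    """
    if by_importance:
        return sorted(importance, key=lambda n: importance[n])
    return sorted(importance)


# Hypothetical importance values for a six-picture group.
importance = {1: 1, 2: 4, 3: 2, 4: 5, 5: 6, 6: 3}
order = transmission_order(importance)
```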

FIG. 5 is a block diagram of the information terminal device 6 in the present embodiment. The information terminal device 6 in the present embodiment is realized by a general-purpose computer such as a PC. That is, the information terminal device 6 includes a CPU, storage means such as a ROM, a RAM, and an HDD, a network interface, and user interface means such as a mouse, a keyboard, and a display. The information terminal device 6 includes a receiving unit 7 that receives the encoded moving image transmitted from the photographing device 1, a decoding device 8 that decodes the encoded moving image, a moving image storage unit 9 that stores the decoded moving image, and a display unit 10 that displays the moving image.

In the information terminal device 6 having the above configuration, the decoding device 8 decodes the moving image received by the receiving unit 7 as follows. Here, for convenience of explanation, it is assumed that P pictures are not rearranged according to importance.

The decoding device 8 may basically perform the reverse process of the encoding process in the encoding device 4. First, since an I picture is not encoded by differential compression like a P picture, an original image can be generated without decoding by differential compression. Subsequently, the decoding device 8 decodes the n (n = 2 to 30) th picture by adding the nth P picture to the I picture. The decoding device 8 writes and saves the moving image obtained by decoding each picture in this manner in the moving image storage unit 9. The display unit 10 reproduces the moving image stored in the moving image storage unit 9.
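The decoding described above can be sketched as follows, assuming that a picture is a flat list of integer pixel values and that a P picture is the per-pixel difference from the I picture. The function name `decode_i_referenced` is hypothetical.

```python
def decode_i_referenced(i_picture, received_p):
    """received_p: {n: nth P picture} for the P pictures that were received.

    Returns {n: decoded picture}. Each picture is rebuilt from the I picture
    and its own P picture only, so the result does not depend on the order
    in which the P pictures arrived.
    """
    decoded = {1: list(i_picture)}
    for n, diff in received_p.items():
        decoded[n] = [ref + d for ref, d in zip(i_picture, diff)]
    return decoded


i_pic = [10, 10, 10]
received = {2: [0, 2, 0], 3: [1, 2, 0]}
pictures = decode_i_referenced(i_pic, received)
```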

By the way, conventionally, the second picture is generated by decoding the immediately preceding first picture and second P picture. The third picture is generated by decoding the immediately preceding second picture and the third P picture. That is, the third picture can be decoded only after the immediately preceding second picture is decoded normally.

Here, it is assumed that the information terminal device cannot normally receive all the P pictures transmitted from the photographing device due to some event such as an increase in network load. For example, when the fourth P picture cannot be received normally, conventionally, the fourth and subsequent pictures cannot be generated by decoding. That is, the moving image cannot be normally decoded until the next I picture can be normally received.

On the other hand, in this embodiment, the nth picture is decoded from the I picture and the nth P picture. Therefore, even when the fourth P picture cannot be received normally, any subsequent P picture that is received normally can still be decoded. For example, when the fifth P picture is received normally, the fifth picture can be decoded normally from the I picture and the fifth P picture. As described above, according to the present embodiment, the loss of pictures in the decoded moving image can be kept to a minimum.
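The difference in behavior under loss can be made concrete with a small sketch that lists which pictures remain decodable in each scheme. Both function names are hypothetical, and the sketch assumes the I picture itself is always received.

```python
def decodable_conventional(m, lost):
    """Conventional scheme: picture n needs every picture before it, so
    decoding stops at the first lost P picture. `lost` is a set of picture numbers."""
    ok = []
    for n in range(1, m + 1):
        if n in lost:
            break
        ok.append(n)
    return ok


def decodable_i_referenced(m, lost):
    """This embodiment: picture n needs only the I picture and its own P picture."""
    return [n for n in range(1, m + 1) if n not in lost]


# Six-picture group in which the fourth P picture is lost.
conventional = decodable_conventional(6, {4})
proposed = decodable_i_referenced(6, {4})
```

In this example the conventional scheme yields pictures 1 to 3, while the scheme of this embodiment loses only picture 4.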

Also, as described above, in this embodiment importance is set for the encoded pictures in descending order of detected motion, and the pictures are transmitted in descending order of importance. Therefore, if transmission fails in the latter half of the transmission of a picture group, the pictures that could not be transmitted are those with relatively small detected motion. In other words, even if pictures with relatively small motion cannot be decoded, pictures with relatively large motion are more likely to be decoded normally.

Also, instead of transmitting all P pictures in order of importance, differential pictures with low importance may be left untransmitted in consideration of external factors such as network load. For example, a threshold value is set in advance that indicates an amount of motion too small to be visually recognized as a difference from the previous picture even if the picture is decoded and played back, and the transmission unit 5 does not transmit pictures whose detected motion is less than the threshold value. This suppresses the increase in network load that accompanies the transmission of moving images. Further, even if pictures in which little motion is detected are not reproduced, the viewer can view the moving image without feeling that anything is unnatural.
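The threshold-based selection can be sketched as follows. The function name `pictures_to_transmit`, the motion values, and the threshold of 5 are hypothetical; the description does not specify how the threshold is chosen.

```python
def pictures_to_transmit(motion, threshold):
    """motion: {picture number n: motion value} for the P pictures.

    Returns the picture numbers to send: the I picture (picture 1) first, then
    the P pictures whose motion is at least `threshold`, largest motion first.
    P pictures whose motion is below the threshold are not sent.
    """
    kept = [n for n in motion if motion[n] >= threshold]
    kept.sort(key=lambda n: (-motion[n], n))
    return [1] + kept


# Hypothetical motion values for pictures 2 to 6 and a hypothetical threshold.
motion = {2: 0, 3: 40, 4: 0, 5: 3, 6: 25}
to_send = pictures_to_transmit(motion, threshold=5)
```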

In this embodiment, the case where the encoding device 4 is installed in the photographing device 1, such as a surveillance camera, has been described as an example. However, the encoding device 4 is not limited to the photographing device 1 and may be installed in any apparatus that handles moving images.

1 shooting device, 2 shooting unit, 3 moving image storage unit, 4 encoding device, 5 transmission unit, 6 information terminal device, 7 receiving unit, 8 decoding device, 9 moving image storage unit, 10 display unit, 41 GOP forming unit, 42 picture generation unit, 43 motion detection unit, 44 importance setting unit.

Claims

1. An encoding device comprising picture generation means for extracting any one of the pictures included in a picture group formed from a moving image as a reference picture that can be decoded independently, and for generating each difference picture from the difference between the reference picture and each picture other than the reference picture included in the picture group.
2. The encoding device according to claim 1, further comprising:
motion detection means for detecting a motion of each difference picture based on a difference between the picture corresponding to the difference picture and the picture immediately before it; and
importance setting means for setting an importance for each difference picture in descending order of the motion detected by the motion detection means.
3. A photographing apparatus comprising:
photographing means; and
encoding means having a picture generation unit that extracts any one of the pictures included in a picture group formed from a moving image generated by photographing by the photographing means as a reference picture that can be decoded independently, and generates each difference picture from the difference between the reference picture and each picture other than the reference picture included in the picture group.
4. The photographing apparatus according to claim 3, further comprising transmission means, wherein
the encoding means further includes:
a motion detection unit that detects a motion of each difference picture based on a difference between the picture corresponding to the difference picture and the picture immediately before it; and
an importance setting unit that sets an importance for each difference picture in descending order of the motion detected by the motion detection unit, and
the transmission means transmits the difference pictures following the reference picture in the order of importance set by the importance setting unit.
5. A program for causing a computer to function as picture generation means for extracting any one of the pictures included in a picture group formed from a moving image as a reference picture that can be decoded independently, and for generating each difference picture from the difference between the reference picture and each picture other than the reference picture included in the picture group.