CN103279521A

CN103279521A - Video big data distributed decoding method based on Hadoop

Info

Publication number: CN103279521A
Application number: CN2013102039001A
Authority: CN
Inventors: 洪明坚; 张小洪; 冯强; 杨飞; 蒲薇榄; 杨梦宁; 徐玲; 葛永新; 杨丹; 王陈林; 陈霞霞
Original assignee: Chongqing University
Current assignee: Chongqing University
Priority date: 2013-05-28
Filing date: 2013-05-28
Publication date: 2013-09-04

Abstract

The invention discloses a Hadoop-based distributed decoding method for video big data and belongs to the field of computers. The method mainly comprises the following steps: uploading massive video files directly to the Hadoop Distributed File System (HDFS) for storage; mounting HDFS onto the local file system with fuse_dfs so that files in HDFS can be accessed in a uniform way; modifying the data-splitting strategy of Hadoop's MapReduce computing framework to use image frames as split boundaries, which solves the frame splitting and decoding failures caused by byte-based splitting; obtaining the common information needed for decoding through the HDFS mounted on the local file system and completing distributed decoding of the video big data with the MapReduce computing framework and the FFmpeg decoding library; and supplying the decoding result as the Map input of MapReduce for subsequent intelligent video analysis. Experimental results show that the method effectively improves the decoding efficiency of massive video files and that the decoding accuracy reaches 100%.

Description

A Hadoop-based distributed decoding method for video big data

Technical field

The invention belongs to the field of computers and specifically relates to a Hadoop-based distributed decoding method for video big data.

Background art

Video data contains a large amount of useful information, and video analysis is a hot topic in computer vision research. Because video occupies a great deal of space, it must be compressed for storage and transmission to save storage space and bandwidth. Intelligent video analysis, however, must first decode the compressed video to recover the original image frames before any subsequent analysis can be carried out. Traditional decoding schemes run on a single machine and focus on improving the efficiency and accuracy of the decoding algorithm. This was adequate while video data remained modest in scale, but once the scale of the video data far exceeds the processing capacity of a single machine, such a decoding scheme becomes the performance bottleneck of intelligent video analysis.

The Hadoop platform has already been used for distributed video decoding, but existing approaches require the video to be cut beforehand, with video segmentation software, into small videos each smaller than the HDFS block size (typically 64 MB) and only then uploaded to HDFS, in order to sidestep the fact that the built-in splitting of HDFS does not support video files. Because the video must be segmented in advance, this approach not only adds to the decoding workload but also becomes impractical when handling massive video data.

For this reason, the present invention proposes a new technical solution that overcomes the shortcomings of existing distributed decoding methods for massive video, effectively improves decoding efficiency, and achieves a decoding accuracy of 100%.

Summary of the invention

Purpose of the invention: in view of the shortcomings of existing methods, the invention provides a Hadoop-based distributed decoding method for video big data that solves the low efficiency of massive video decoding and allows video big data to be decoded in a low-specification hardware environment.

Technical solution: to solve the above technical problem, the invention adopts the following technical solution. A Hadoop-based distributed decoding method for video big data comprises the following steps:

Step a: upload the massive video files directly to the Hadoop distributed file system HDFS for storage;

Step b: mount HDFS onto the local file system with fuse_dfs, so that files in HDFS can be accessed in a uniform way;

Step c: modify the data-splitting strategy of the MapReduce computing framework in Hadoop to use image frames as split boundaries, which solves the frame splitting and decoding failures caused by byte-based splitting;

Step d: read the common information required for decoding from the HDFS mounted on the local file system, then complete the distributed decoding of the massive video using the MapReduce computing framework and the FFmpeg decoding library;

Step e: supply the decoding result as the Map input of MapReduce for subsequent intelligent video analysis.

Compared with the prior art, the invention has the following beneficial effects:

1. The invention stores massive video directly in the distributed file system HDFS without segmenting the video big data in advance, which simplifies the storage of massive video.

2. The invention modifies the data-splitting strategy in Hadoop to use image frames as split boundaries, which solves the frame splitting and decoding failures caused by Hadoop's byte-based splitting.

3. The invention is based on the Hadoop platform and distributes the video decoding task across a cluster of low-specification computers, which both reduces system operating cost and improves decoding efficiency. The decoded image frames can also be used directly for subsequent intelligent video analysis.
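The frame-splitting problem named in effect 2 can be illustrated with a minimal sketch. It assumes a simplified stream in which frames are stored back to back with known byte sizes; the function names `byte_splits` and `frames_cut_by_splits` are hypothetical and are not part of Hadoop or of the invention.

```python
def byte_splits(file_size, split_size):
    """Cut a file into [start, end) byte ranges of at most split_size bytes."""
    return [(start, min(start + split_size, file_size))
            for start in range(0, file_size, split_size)]


def frames_cut_by_splits(frame_sizes, split_size):
    """Return the indices of frames whose bytes straddle a split boundary.

    Assumption: frames are laid out contiguously, so the offset of each
    frame is the sum of the sizes of the frames before it.
    """
    offsets, pos = [], 0
    for size in frame_sizes:
        offsets.append(pos)
        pos += size
    # Every split start other than 0 is an internal cut point.
    cuts = [start for start, _ in byte_splits(pos, split_size) if start > 0]
    return [i for i, (off, size) in enumerate(zip(offsets, frame_sizes))
            if any(off < cut < off + size for cut in cuts)]
```

A frame reported by `frames_cut_by_splits` would have its bytes divided between two map tasks, so neither task could decode it on its own. Using frames as split boundaries avoids this situation.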

Brief description of the drawings

Fig. 1: overall structure of the method

Fig. 2: flowchart of MapReduce logical splitting with video frame positions as split boundaries

Fig. 3: result of MapReduce logical splitting with video frame positions as split boundaries

Fig. 4: comparison of distributed and single-machine decoding time

Fig. 5: comparison of distributed and single-machine decoding efficiency

Detailed description

The technology of the invention is further described below with reference to the drawings and examples. It should be understood that these examples serve only to illustrate the invention and do not limit its scope; after reading this disclosure, modifications of the invention in various equivalent forms made by those skilled in the art all fall within the scope defined by the claims of this application.

As shown in Fig. 1, the invention provides a Hadoop-based distributed decoding method for video big data, which comprises the following steps:

Step a: upload the massive video files directly to the Hadoop distributed file system HDFS for storage;

Step c: modify the data-splitting strategy of the MapReduce computing framework in Hadoop to use image frames as split boundaries, which solves the frame splitting and decoding failures caused by byte-based splitting:

c1: A user-defined ImageInputFormat class is implemented, which inherits from the FileInputFormat class. In ImageInputFormat, the createRecordReader method returns an ImageRecordReader object, and the isSplitable method returns true, on the assumption that a video file is always splittable;

c2: A user-defined ImageRecordReader class is implemented, which inherits from the RecordReader class. In ImageRecordReader, the initialize method first creates an HFFmpegFrameGrabber decoder object and sets its start position to the start position of the current split. It then obtains the next frame through the HFFmpegFrameGrabber object and checks whether that frame is a key frame. If it is, the frame becomes the start frame of the split; if it is not, frames continue to be fetched until a key frame is obtained, and that key frame becomes the start frame of the split;

c3: In ImageRecordReader, the nextKeyValue method first checks whether the current position of the HFFmpegFrameGrabber object has passed the end position of the split. If it has not, the next frame is fetched. If it has, the method checks whether the next frame is a key frame: if it is, fetching stops and the split ends; if it is not, frames continue to be fetched until a key frame is reached, at which point fetching stops and the split ends. The logical splitting flow is shown in Fig. 2 and the logical splitting result in Fig. 3, where frames 1, 6, 11, 16 and 21 are key frames and the rest are non-key frames.
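The boundary rule of c2 and c3 can be sketched as a small simulation. This is not the ImageRecordReader itself: it assumes the byte offset and key-frame flag of every frame are already known as plain lists, and the function name `frames_for_split` is hypothetical.

```python
from bisect import bisect_left


def frames_for_split(offsets, is_key, start, end):
    """Return the indices of the frames that one split decodes.

    offsets: ascending byte offset of each frame (assumed known here)
    is_key:  True where the frame is a key frame
    start, end: the byte range [start, end) assigned to the split
    """
    n = len(offsets)
    # initialize(): position at the split start, then skip to a key frame.
    i = bisect_left(offsets, start)
    while i < n and not is_key[i]:
        i += 1
    frames = []
    while i < n:
        # nextKeyValue(): once past the split end, stop at the next key frame.
        if offsets[i] >= end and is_key[i]:
            break
        frames.append(i)
        i += 1
    return frames
```

With 25 frames of equal size and a key frame every fifth frame, as in Fig. 3, each group of frames that starts at a key frame is decoded by exactly one split, namely the split whose byte range contains that key frame. A split that contains no key frame yields no frames.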

Step d: obtain the common information required for decoding through the HDFS mounted on the local file system, then complete the distributed decoding of the massive video using the MapReduce computing framework and the FFmpeg decoding library:

d1: The HFFmpegFrameGrabber class is implemented as an implementation of the HframeGrabber interface. This class mainly wraps the decoding methods of the FFmpeg decoding library, obtains the common information required for decoding through the locally mounted HDFS, and performs distributed decoding. In HFFmpegFrameGrabber, the setBytePos method sets the starting byte position of the decoder, the getImage method fetches and decodes the next frame, and the isKeyFrame method determines whether the current frame is a key frame;

d2: The FrameNumWritable class is implemented, inheriting from WritableComparable. It identifies the playback position of a frame within the video frame sequence and serves as the key type of the Map input. The ImageWritable class is implemented, inheriting from Writable. It holds the image data of a frame and serves as the value type of the Map input;

d3: After a frame is decoded, the start position of the split together with the number of the frame within the split is stored in a FrameNumWritable object as the unique identifier of the frame, and the image data of the frame (an IplImage) is stored in an ImageWritable object. These two objects are then passed to the Map method as the key and the value respectively, serving as the input of the Map method;
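The key and value pairing of d2 and d3 can be sketched with two small record types. These are hypothetical stand-ins for FrameNumWritable and ImageWritable, not the Hadoop classes: the field names, the 0-based frame counter and the `emit_records` helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class FrameKey:
    """Stand-in for FrameNumWritable: comparable, unique per frame."""
    split_start: int     # byte offset at which the split begins
    frame_in_split: int  # 0-based counter of the frame inside that split


@dataclass(frozen=True)
class FrameValue:
    """Stand-in for ImageWritable: holds the decoded image data."""
    width: int
    height: int
    pixels: bytes


def emit_records(split_start, decoded_frames):
    """Pair each decoded frame (width, height, pixels) with its key."""
    for n, (width, height, pixels) in enumerate(decoded_frames):
        yield FrameKey(split_start, n), FrameValue(width, height, pixels)
```

Because the key compares first by split start and then by frame number, sorting the keys of all splits restores playback order, which matches the role of a comparable key type described in d2.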

Step e: supply the decoding result as the Map input of MapReduce for subsequent intelligent video analysis:

e1: After implementing their own MapReduce application, users obtain the decoded image data in the Map method through the key and the value and use it for subsequent intelligent video analysis;

e2: After the distributed Map processing finishes, its output is fed to Reduce as input for merging, which completes the computation.
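The flow of e1 and e2 can be sketched as a local, single-process simulation. It does not use Hadoop; the brightness classification is a made-up analysis step chosen only to give Map something to do, and the names `map_frame`, `reduce_counts` and `run_local` are hypothetical.

```python
from collections import defaultdict


def map_frame(key, pixels):
    """Hypothetical Map step: classify one decoded frame by mean brightness."""
    mean = sum(pixels) / len(pixels)
    yield ("bright" if mean >= 128 else "dark"), 1


def reduce_counts(label, counts):
    """Reduce step: merge the counts emitted for one label."""
    return label, sum(counts)


def run_local(records):
    """Run Map over (key, pixels) records, group by output key, then Reduce."""
    groups = defaultdict(list)
    for key, pixels in records:
        for out_key, out_value in map_frame(key, pixels):
            groups[out_key].append(out_value)
    return dict(reduce_counts(label, counts) for label, counts in groups.items())
```

In a real job the records would arrive from the record reader as frame key and image value pairs, and the grouping between Map and Reduce would be performed by the framework rather than by `run_local`.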

Example:

In this example, the Hadoop cluster consists of 15 PCs, each with an Intel(R) Pentium(R) 4 CPU at 2.80 GHz, 1.5 GB of memory and an 80 GB hard disk. One PC serves as the cluster master and 14 serve as cluster slaves. The videos used for decoding are AVI videos of 15 MB, 30 MB, 60 MB, 100 MB, 300 MB, 500 MB, 1 GB, 2 GB, 6 GB, 12 GB and 24 GB. Each of these videos is decoded both in distributed mode and on a single machine, and the decoding time, decoding efficiency and accuracy of the two approaches are compared.

The decoding times of distributed and single-machine decoding are compared in Fig. 4. As Fig. 4 shows, as the video size increases, the distributed decoding time becomes far smaller than the single-machine decoding time. The decoding efficiencies are compared in Fig. 5. As Fig. 5 shows, as the video size increases, the single-machine decoding efficiency levels off, whereas the distributed decoding efficiency is lower than the single-machine efficiency for videos smaller than 60 MB, gradually exceeds it for videos larger than 60 MB, and begins to decline for videos larger than 6 GB. The decline is attributed to the limited experimental conditions, namely the number of cluster nodes and the processing capacity of the cluster.

The accuracy of distributed decoding is tested on the video files of 1 GB and below. The video decoded on a single machine is denoted E and the video decoded in distributed mode is denoted A. The test results are shown in Table 1, whose columns give the number of frames in E after decoding, the number of frames in A, the percentage of the frames of A that are contained in E, and the percentage of the frames of E that are contained in A. In the tests, single-machine and distributed decoding produce the same number of frames, and the mutual containment percentages of A and E are both 100%. It follows that Hadoop-based distributed decoding is accurate and that no frames are missed or decoded incorrectly.

Table 1

Video size (MB)	Frames in E	Frames in A	E contains A	A contains E
15	1796	1796	100%	100%
30	4196	4196	100%	100%
60	7796	7796	100%	100%
100	15584	15584	100%	100%
300	39537	39537	100%	100%
500	71272	71272	100%	100%
1024	143401	143401	100%	100%
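The containment percentages in Table 1 can be computed along the following lines. This is a minimal sketch that assumes two frames count as the same frame when their decoded bytes are identical; the patent does not state how frames were compared, and the function name `containment_percent` is hypothetical.

```python
import hashlib
from collections import Counter


def containment_percent(frames_x, frames_y):
    """Percentage of the frames in frames_x that also occur in frames_y.

    Frames are compared by the SHA-256 digest of their bytes (an assumption),
    and repeated frames are matched one for one. frames_x must not be empty.
    """
    count_x = Counter(hashlib.sha256(f).hexdigest() for f in frames_x)
    count_y = Counter(hashlib.sha256(f).hexdigest() for f in frames_y)
    matched = sum((count_x & count_y).values())
    return 100.0 * matched / sum(count_x.values())
```

Calling `containment_percent(A, E)` gives the share of distributed frames found in the single-machine output, and `containment_percent(E, A)` gives the reverse. Both equal 100 only when the two outputs hold the same frames, regardless of the order in which the splits finished.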

Finally, it should be noted that the above embodiments serve only to illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently substituted without departing from its spirit and scope, and all such modifications and substitutions shall be covered by the claims of the invention.

Claims

1. A Hadoop-based distributed decoding method for video big data, comprising the following steps:

Step e: supplying the decoding result as the Map input of MapReduce, so that it can be used for subsequent intelligent video analysis.

2. The Hadoop-based distributed decoding method for video big data according to claim 1, characterized in that: in step a, the massive video files are uploaded directly to the Hadoop distributed file system HDFS for storage.

3. The Hadoop-based distributed decoding method for video big data according to claim 1, characterized in that: in step c, the data-splitting strategy in Hadoop is modified to use image frames as split boundaries, that is, for each split it is determined which decodable frames the split obtains, thereby solving the frame splitting and decoding failures caused by Hadoop's byte-based splitting.

4. The Hadoop-based distributed decoding method for video big data according to claim 1, characterized in that: in step d, the common information required for video decoding, such as header information, is stored at the local mount point described in step b, and the distributed decoding of the massive video is then carried out using the MapReduce computing framework and the FFmpeg decoding library. The video formats supported by the invention include AVI, MPEG-4, RMVB, FLV, MOV, ASF, WMV, MKV, TS, VCD, DVD, MPEG-1 and MPEG-2.

5. The Hadoop-based distributed decoding method for video big data according to claim 1, characterized in that: in step e, the decoding result is supplied as the Map input of MapReduce, the key being the frame number and the value being the image information, and this information can be used for intelligent video analysis such as foreground detection, motion tracking and summary generation.