CN102710983A - Method for extracting audio and video data from multimedia - Google Patents

Method for extracting audio and video data from multimedia

Info

Publication number
CN102710983A
Authority
CN
China
Prior art keywords
data
link
filter
video
multimedia
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012101102817A
Other languages
Chinese (zh)
Other versions
CN102710983B (en)
Inventor
陈刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Yitan Culture Communication Co ltd
Original Assignee
HANGZHOU NO IMAGE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HANGZHOU NO IMAGE TECHNOLOGY Co Ltd
Priority to CN201210110281.7A (patent CN102710983B)
Publication of CN102710983A
Application granted
Publication of CN102710983B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Television Signal Processing For Recording (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a method for extracting audio and video data from multimedia. The method mainly comprises: first, building a complete media rendering chain with DirectShow; second, disconnecting the renderer at the tail end; third, inserting a sample grabber in front of the tail-end renderer and restoring the connection; fourth, extracting audio and video data through the inserted grabber; fifth, splitting and processing the extracted data; and sixth, delivering the processed data to the application layer. Audio and video data can thus be obtained during playback. The method supports a wide range of media formats and is applicable to the field of audio and video applications.

Description

A method for extracting audio and video data from multimedia
Technical field
The present invention relates to the field of audio and video processing, and in particular to a method for extracting audio and video data from multimedia.
Background art
In multimedia projects, audio and video data often have to be transmitted in a specific format, yet the data comes from different sources: some are media files in various formats (RMVB and so on), others are network media streams (MMS and so on). Handling each kind of source separately, with its own processing chain, is tedious and not general.
Media sources fall into two main types: files and network data streams. Common file types include AVI, RMVB, MOV, FLV, and MP4. Each file type corresponds to a different media container, which organizes audio and video data in various encodings and packages it in a specific format. Common network stream types include Microsoft's MMS, and RTSP, each of which also has its own data format.
Faced with such a variety of media, the simplest way to extract the audio and video data they contain is to read it directly: write a program for each container format that reads the file, parses and demultiplexes the binary data, and finally outputs the audio and video data.
To simplify some of this work, the open-source community offers toolkits such as FFMPEG. FFMPEG is powerful and can convert files and media streams of different formats; it contains implementations for reading and writing the native file formats, which frees developers from a great deal of hard work. However, it focuses on file conversion and is not intelligent enough: the file type has to be identified manually, and a decoder and complicated parameters have to be specified before processing can start. In other words, each media type still needs its own parameters.
Microsoft has proposed the DirectShow framework, in which every medium (file or network stream) is abstracted as a source filter (Filter). The common features of the media are extracted, the underlying differences are hidden, and the filters are linked together into a filter graph (Filter Graph). Once the graph has been built, playback can start. There is no need to know the specific format or to specify parameters for opening the file; matching is automatic. Microsoft's media player is a successful application of this technology.
There are two ways to build a graph. One is manual: the filter at every stage has to be created and connected by hand. This is complicated and not general; the connection may fail when the program is moved to another machine. The other is intelligent connection, in which each medium is automatically matched to the best chain, but the connection process cannot be controlled precisely.
For a player, intelligent connection is good enough, because it automatically builds all the filters for reading, demultiplexing, decoding, and playing a file. For an application that extracts media data, however, it falls short. The traditional approach is manual construction, as in the media chain technology of MPC: one or more best chains are configured for each media format. The benefit is that a chain can be built quickly and the playback result controlled precisely. But it is too complicated: the chains published by Storm Player (Baofeng) alone number in the hundreds, and maintaining and managing them is difficult for an ordinary application.
As the above shows, handling every media format by hand is the most primitive approach; the workload is huge and it is not advisable. Converting with FFMPEG makes parameter setup troublesome, since every format has to be specified with different parameters, so it is quite limited where media data has to be extracted. Building DirectShow chains manually also involves a large workload, and every format needs a different configuration. Intelligent connection emphasizes full automation, so manual control is very weak: it can only play a file, not extract data.
Summary of the invention
The present invention mainly solves the technical problems of the prior art, in which manually building chains to extract audio and video data involves a heavy workload, is slow and not general, and makes it hard to obtain the required data. It provides a method for extracting audio and video data from multimedia that builds the chain automatically and is more general.
The above technical problem is mainly solved by the following technical solution: a method for extracting audio and video data from multimedia, comprising the following steps:
Step 1: build the complete media rendering chain;
Step 2: search the chain to find the renderer at the tail end, find the corresponding pin, and call the UnConnect method to disconnect the renderer manually;
Step 3: insert several custom filters into the gap in the chain, and call the Connect method to reconnect the disconnected renderer;
Step 4: extract the audio and video data through the inserted filters;
Step 5: split the extracted data into multiple outputs, then process the data on each output with different encoding parameters;
Step 6: deliver the processed data to the application layer.
DirectShow is Microsoft's multimedia framework library. It provides numerous functions and methods and is supported by all Windows systems.
The filter graph (Filter Graph) is a component of the Microsoft DirectShow library that manages filter components; it exposes interfaces such as Render and Run for operating the filters.
A filter (Filter) is a component of the Microsoft DirectShow library. It works only after it has been placed in the graph manager and connected correctly.
A pin (Pin) is the connecting part of a filter. It is used to connect filters manually and provides interfaces such as Connect and UnConnect.
Preferably, in step 1, the media rendering chain is built as follows: using the DirectShow intelligent connection mode, call the Render method of the graph manager to obtain the complete chain graph.
Preferably, in step 3, the inserted filters include a sample grabber.
This solution uses the DirectShow media framework and is implemented with intelligent rendering and dynamic insertion, so it adapts to various media files and MMS network streams.
The graph manager (Graph) of DirectShow provides the Render method, which automatically configures a file or network stream and connects the filters (Filter) in series to form a complete playback chain.
When building a chain manually, first enumerate the pins (Pin) of each filter, then use the pins' Connect and UnConnect methods to link different filters together. Note that the filter at the tail end is the renderer; its function is to call the low-level interface of the sound card or video card to play the final RGB or YUV video data or PCM audio data.
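As a rough illustration of "searching the chain for the tail-end renderer", the sketch below models a filter graph as a plain map from each filter name to the filters connected downstream of it, and collects every filter that has no downstream connection. It is a portable toy model, not the DirectShow API: the `Graph` type, `findRenderers`, `makeFig2Graph`, and the filter names are all assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a filter graph (an assumption for illustration only):
// each filter name maps to the filters connected to its output pins.
using Graph = std::map<std::string, std::vector<std::string>>;

// Walk downstream from `source` and collect every filter that has no
// outgoing connection. In a playback graph these are the renderers.
std::set<std::string> findRenderers(const Graph& g, const std::string& source) {
    std::set<std::string> renderers;
    std::set<std::string> seen;
    std::vector<std::string> pending{source};
    while (!pending.empty()) {
        std::string current = pending.back();
        pending.pop_back();
        if (!seen.insert(current).second) continue;  // already visited
        auto it = g.find(current);
        if (it == g.end() || it->second.empty()) {
            renderers.insert(current);               // tail end of a branch
            continue;
        }
        for (const auto& next : it->second) pending.push_back(next);
    }
    return renderers;
}

// A playback graph with one audio branch and one video branch,
// using hypothetical filter names.
Graph makeFig2Graph() {
    return {
        {"source filter", {"pre-filter"}},
        {"pre-filter", {"splitter"}},
        {"splitter", {"audio decoder", "video decoder"}},
        {"audio decoder", {"audio filter"}},
        {"audio filter", {"audio renderer"}},
        {"video decoder", {"video filter"}},
        {"video filter", {"video renderer"}},
    };
}
```

In a real graph the same walk would be done by enumerating filters and their pins through the framework's own interfaces; the map is only a convenient way to show the traversal.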
The present invention combines the advantages of the schemes described in the background art: it uses automatic matching, so audio and video data can be extracted without specifying parameters.
The advantages of the present invention are that it supports a wide range of media formats and overcomes the complexity of existing schemes: one unchanging approach handles all the varied inputs, and data is extracted simply and effectively. New formats can also be supported well, so the method is extensible. The idea can also be applied to other scenarios, such as film screenshots, virtual video, virtual audio, and video beautification effects.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 is a diagram of a playback chain built automatically according to the present invention;
Fig. 3 is a diagram of the chain of Fig. 2 after the sample grabbers have been inserted.
In the figures: 1, source filter; 2, pre-filter; 3, audio/video splitter; 4, audio decoder; 5, audio filter; 6, audio renderer; 7, video decoder; 8, video filter; 9, video renderer; 10, audio sample grabber; 11, video sample grabber.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below through an embodiment and with reference to the drawings.
Embodiment: the method for extracting audio and video data from multimedia of this embodiment, shown in Fig. 1, is as follows:
1. Build the complete media rendering chain. As long as the file can be played normally, the DirectShow intelligent connection mode can be used: calling the Render method of the graph manager directly yields the complete chain graph (see Fig. 2). The multimedia data passes through source filter 1 and pre-filter 2 in turn and is then separated into an audio signal and a video signal by audio/video splitter 3. The audio signal is decoded by audio decoder 4, filtered by audio filter 5, and then rendered and played by audio renderer 6. The video signal is decoded by video decoder 7, filtered by video filter 8, and then rendered and played by video renderer 9. Pre-filter 2, audio filter 5, and video filter 8 are each a set of filters rather than a single filter, so that various media files and network media can all be supported.
2. Dynamic insertion. This step is the key one. Search the chain to find the renderer at the tail end, find the corresponding pin (Pin), and call the UnConnect method to disconnect the renderer manually. Insert the custom filters (including the sample grabber) into the gap, then call the Connect method to reconnect the disconnected renderer; the complete chain is now restored (see Fig. 3). This involves two parts. First, the connection between audio filter 5 and audio renderer 6 is broken and audio sample grabber 10 is inserted; the output of audio filter 5 is connected to the input of audio sample grabber 10, and audio sample grabber 10 is connected to audio renderer 6. Second, the connection between video filter 8 and video renderer 9 is broken and video sample grabber 11 is inserted; the output of video filter 8 is connected to the input of video sample grabber 11, and video sample grabber 11 is connected to video renderer 9. Video sample grabber 11 and audio sample grabber 10 are both grabbers. After the chain is restored the file still plays normally, but during playback the data is continuously extracted by the grabbers inserted in the middle.
3. Re-encoding. Split the extracted data into multiple outputs, process the data on each output with different encoding parameters, and finally deliver it to the application layer.
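The splitting in step 3 can be pictured as handing the same grabbed buffer to several independent branches, each with its own parameters. The sketch below is a minimal model under stated assumptions: the buffer is 16-bit PCM, and the two branch operations (`keepEveryNth` and `scaleBy`) are hypothetical placeholders for whatever per-output encoding the application actually applies; none of these names come from the patent or from DirectShow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Pcm = std::vector<std::int16_t>;
// One output branch: a processing step that carries its own parameters.
using Branch = std::function<Pcm(const Pcm&)>;

// Hypothetical branch: keep every `step`-th sample (crude rate reduction).
Branch keepEveryNth(std::size_t step) {
    assert(step > 0);
    return [step](const Pcm& in) {
        Pcm out;
        for (std::size_t i = 0; i < in.size(); i += step) out.push_back(in[i]);
        return out;
    };
}

// Hypothetical branch: scale the amplitude by numerator / denominator.
Branch scaleBy(int numerator, int denominator) {
    assert(denominator != 0);
    return [=](const Pcm& in) {
        Pcm out;
        out.reserve(in.size());
        for (std::int16_t s : in)
            out.push_back(static_cast<std::int16_t>(s * numerator / denominator));
        return out;
    };
}

// Hand the same grabbed buffer to every branch; one output per branch.
std::vector<Pcm> fanOut(const Pcm& grabbed, const std::vector<Branch>& branches) {
    std::vector<Pcm> outputs;
    outputs.reserve(branches.size());
    for (const auto& branch : branches) outputs.push_back(branch(grabbed));
    return outputs;
}
```

Each branch sees the untouched input, so adding another output format does not affect the existing ones.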
The following pseudocode illustrates the procedure:
// AddFilterByCLSID, ReplaceRenderFilter and ConnectFilters are helper functions whose bodies are not given here
// Initialization
HRESULT hr;
CComPtr<IGraphBuilder> m_pGB;
m_pGB.CoCreateInstance(CLSID_FilterGraph);
m_pGB->RenderFile(L"c:", NULL); // the file can be of any format
//
// At this point the complete playback chain has been built; calling IMediaControl->Run() would directly open a video playback window
// But the goal here is not a player, so this is where the new technique begins; continue below
//
// Add the ACMWrapper filter
IBaseFilter *pACMWrapper = NULL;
AddFilterByCLSID(m_pGB, CLSID_ACMWrapper, _T("ACMWrapper"), &pACMWrapper);
// Add the Converter filter
IBaseFilter *pConverter = NULL;
AddFilterByCLSID(m_pGB, CLSID_AVConverter, _T("Converter"), &pConverter);
// Add the sample grabber
CComPtr<ISampleGrabber> m_pAudioGrabber;
m_pAudioGrabber.CoCreateInstance(CLSID_SampleGrabber);
CComQIPtr<IBaseFilter, &IID_IBaseFilter> pGrabBase(m_pAudioGrabber);
m_pGB->AddFilter(pGrabBase, L"audio Grabber");
// A video playback window appearing during data capture is inappropriate, so a null renderer is used instead and no playback window pops up
IBaseFilter *pNullRender = NULL;
hr = AddFilterByCLSID(m_pGB, CLSID_NullRenderer, _T("NullRender"), &pNullRender);
//
// Dynamic insertion
// Replace the original audio renderer with the ACMWrapper filter; the last parameter selects audio or video, and the disconnection is handled inside the function
ReplaceRenderFilter(m_pGB, pACMWrapper, TRUE);
// Connect pACMWrapper to pConverter
ConnectFilters(m_pGB, pACMWrapper, pConverter);
// Connect pConverter to pGrabBase
ConnectFilters(m_pGB, pConverter, pGrabBase);
// Connect pGrabBase to pNullRender
ConnectFilters(m_pGB, pGrabBase, pNullRender);
// The chain is complete; start running it
CComQIPtr<IMediaControl, &IID_IMediaControl> pControl(m_pGB);
hr = pControl->Run();
// From here on the sample grabber continuously receives the raw PCM audio data
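The dynamic insertion in the pseudocode above (break the link in front of the renderer, insert filters into the gap, then reconnect a renderer at the tail) can be sketched on a simplified model. Here one branch of the graph is just an ordered list of filter names; `Chain`, `spliceAtTail`, and the filter names are assumptions for illustration and are not DirectShow calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of one branch of a filter graph: filter names in order,
// upstream first, renderer last. An assumption for illustration only.
using Chain = std::vector<std::string>;

// Break the link in front of the tail renderer, put `inserted` into the
// gap, and reconnect a renderer at the end. If `replacementRenderer` is
// empty the original renderer is reconnected (playback continues);
// otherwise the replacement (for example a null renderer) is used.
Chain spliceAtTail(const Chain& chain, const Chain& inserted,
                   const std::string& replacementRenderer = "") {
    assert(!chain.empty() && "a rendered chain must end in a renderer");
    const std::string& originalRenderer = chain.back();
    Chain result(chain.begin(), chain.end() - 1);                   // disconnect the tail
    result.insert(result.end(), inserted.begin(), inserted.end());  // fill the gap
    result.push_back(replacementRenderer.empty() ? originalRenderer
                                                 : replacementRenderer);
    return result;
}
```

Calling `spliceAtTail(audio, {"audio grabber"})` corresponds to the variant in which the original renderer is reconnected and the file keeps playing, while passing a replacement such as `"NullRender"` corresponds to the pseudocode, which swaps in a null renderer so that no playback window appears.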
With one unchanging approach, this embodiment handles most media types, which is especially effective for rapid development. A file format that is not yet supported can be added as an extension: once the corresponding source filter file is added to the operating system and registered in the Windows registry, the format is supported without changing a single line of the original program code.
In addition, the present invention provides data output in different formats. For example, data can be extracted from one video file in RGB format and at the same time in YUV format, giving flexible multi-channel output. This is achieved by performing color space conversion internally after the data has been extracted.
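The text does not say which conversion matrix is used internally. As one possible sketch, the per-pixel conversion below uses the full-range BT.601 (JFIF) coefficients, which is an assumption; the `Yuv` struct and the function names are likewise introduced only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Yuv {
    std::uint8_t y, u, v;
};

// Round to the nearest integer and clamp into the 0..255 byte range.
std::uint8_t clampToByte(double x) {
    long rounded = std::lround(x);
    return static_cast<std::uint8_t>(std::min(255L, std::max(0L, rounded)));
}

// Convert one RGB pixel to YUV with full-range BT.601 coefficients
// (assumed here; the source does not name a specific matrix).
Yuv rgbToYuv(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    double y =  0.299    * r + 0.587    * g + 0.114    * b;
    double u = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0;
    double v =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0;
    return {clampToByte(y), clampToByte(u), clampToByte(v)};
}
```

With this conversion applied to each grabbed RGB frame, one branch can deliver the frame unchanged while another delivers the YUV version, which matches the multi-channel output described above.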
The specific embodiment described herein merely illustrates the spirit of the present invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiment, or substitute similar approaches, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.
Although terms such as chain, renderer, and filter are used frequently herein, the possibility of using other terms is not excluded. These terms are used only to describe and explain the essence of the invention more conveniently; interpreting them as imposing any additional limitation would be contrary to the spirit of the invention.

Claims (3)

1. A method for extracting audio and video data from multimedia, based on the DirectShow framework, characterized in that it comprises the following steps:
Step 1: build the complete media rendering chain in the filter graph;
Step 2: search the chain to find the renderer at the tail end, find the corresponding pin, and call the UnConnect method to disconnect the renderer manually;
Step 3: insert several custom filters into the gap in the chain, and call the Connect method to reconnect the disconnected renderer;
Step 4: extract the audio and video data through the inserted filters, then run the filter graph;
Step 5: split the extracted data into multiple outputs, then process the data on each output with different encoding parameters;
Step 6: deliver the processed data to the application layer.
2. The method for extracting audio and video data from multimedia according to claim 1, characterized in that, in step 1, the media rendering chain is built as follows: using the DirectShow intelligent connection mode, call the Render method of the graph manager to obtain the complete chain graph.
3. The method for extracting audio and video data from multimedia according to claim 1 or 2, characterized in that, in step 3, the inserted filters include a sample grabber.
CN201210110281.7A 2012-04-16 2012-04-16 Method for extracting audio and video data from multimedia Expired - Fee Related CN102710983B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210110281.7A CN102710983B (en) 2012-04-16 2012-04-16 Method for extracting audio and video data from multimedia

Publications (2)

Publication Number Publication Date
CN102710983A true CN102710983A (en) 2012-10-03
CN102710983B CN102710983B (en) 2015-01-07

Family

ID=46903511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210110281.7A Expired - Fee Related CN102710983B (en) 2012-04-16 2012-04-16 Method for extracting audio and video data from multimedia

Country Status (1)

Country Link
CN (1) CN102710983B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101377846A (en) * 2008-05-12 2009-03-04 上海激动通信有限公司 Method for optimum cutting a picture of a video file rapidly
KR20090042506A (en) * 2007-10-26 2009-04-30 주식회사 크레듀 A device and method for layering moving picture
CN101441555A (en) * 2008-04-03 2009-05-27 南京科融数据系统有限公司 Video multiple-screen combined playing technology based on windows multiple-screen system
CN101702132A (en) * 2009-09-07 2010-05-05 无锡景象数字技术有限公司 2D and 3D software switching method based on DirectShow technology
CN101783941A (en) * 2009-09-15 2010-07-21 上海海事大学 Real-time video transmission method based on IP network
CN101902471A (en) * 2010-07-16 2010-12-01 福建升腾资讯有限公司 Streaming media mapping method under RDP (Remote Desktop Protocol) environment
KR101051182B1 (en) * 2011-03-10 2011-07-22 주식회사 다이나믹앤라이브 A combining and dividing device for multimedia streams based on directshow filter graph

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103442210A (en) * 2013-08-20 2013-12-11 国家电网公司 Back-end audio and video data processing method for video monitoring system
CN103442210B (en) * 2013-08-20 2016-12-28 国家电网公司 Rear end audio and video data processing method for video monitoring system

Also Published As

Publication number Publication date
CN102710983B (en) 2015-01-07

Similar Documents

Publication Publication Date Title
CN101577110A (en) Method for playing videos and video player
CN101542623B (en) Reproducing apparatus, reproducing method, and program
CN101199011B (en) Demultiplexing device and demultiplexing method
CN105872755A (en) Video playing method and device
EP1755122A4 (en) Data processing device, data processing method, program, program recording medium, data recording medium, and data structure
CN101110935A (en) Method and system for inter cutting advertisement in network television roll-broadcasting program
WO2003096350A3 (en) Scalable video summarization
CN102790912A (en) Channel information and menu information updating method of set-top box
CN101552791B (en) Method and system for playing multiple media file
US20090007208A1 (en) Program, data processing method, and system of same
CN103716662A (en) Mixed transmission method and server
CN101621651B (en) Recording and reproducing apparatus, recording and reproducing method and program
CN103458321A (en) Method and device for loading subtitles
CN106416283A (en) Reception apparatus, transmission apparatus, and data processing method
US20040030694A1 (en) Search information transmitting apparatus
CN102710983A (en) Method for extracting audio and video data from multimedia
CN101127899B (en) Hint information description method
CN100393127C (en) Digital broadcast receiving device and method, and digital broadcast receiving program
CN1135552C (en) Method and device for receiving and sending audio data stream by digital interface
CN111479125A (en) Live broadcast code plug flow receiving and distributing system and method based on cloud management platform
CN101489052A (en) Subtitle data processing method and apparatus
CN102387177A (en) Method and device for downloading audio-visual files
CN105357531B (en) Based on video local code fly-cutting packaging method
KR101370290B1 (en) Method and apparatus for generating multimedia data with decoding level, and method and apparatus for reconstructing multimedia data with decoding level
JP4661447B2 (en) Transmission / reception system and method, transmission device and method, reception device and method, and program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 310000 room 1001, building 2, No. 2, ZIJINGHUA Road, Xihu District, Hangzhou City, Zhejiang Province

Patentee after: HANGZHOU MEGA TECHNOLOGY Co.,Ltd.

Address before: 11, building 2, block B, The Union Buildings, No. 310000, Bauhinia Road, Hangzhou, Xihu District, Zhejiang

Patentee before: Hangzhou Mijia Technology Co.,Ltd.

CP01 Change in the name or title of a patent holder

Address after: Room 1001, building 2, No.2 ZIJINGHUA Road, Xihu District, Hangzhou City, Zhejiang Province 310000

Patentee after: Hangzhou Sikai Data Technology Group Co.,Ltd.

Address before: Room 1001, building 2, No.2 ZIJINGHUA Road, Xihu District, Hangzhou City, Zhejiang Province 310000

Patentee before: HANGZHOU MEGA TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20210608

Address after: 355200 no.181 erbatou, taimuyang village, Qinyu Town, Fuding City, Ningde City, Fujian Province

Patentee after: Deng Weiqiang

Address before: Room 1001, building 2, No.2 ZIJINGHUA Road, Xihu District, Hangzhou City, Zhejiang Province 310000

Patentee before: Hangzhou Sikai Data Technology Group Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20210609

Address after: 071000 room 1119, block a, Huakang building, 899 Chaoyang North Street, GaoKai District, Baoding City, Hebei Province

Patentee after: Baoding Luheng Intellectual Property Agency Co.,Ltd.

Address before: 355200 no.181 erbatou, taimuyang village, Qinyu Town, Fuding City, Ningde City, Fujian Province

Patentee before: Deng Weiqiang

TR01 Transfer of patent right

Effective date of registration: 20210914

Address after: 071000 room 427, building 1, No. 2628, Xiangyang North Street, Baoding City, Hebei Province

Patentee after: Hebei Yitan Culture Communication Co.,Ltd.

Address before: 071000 room 1119, block a, Huakang building, 899 Chaoyang North Street, GaoKai District, Baoding City, Hebei Province

Patentee before: Baoding Luheng Intellectual Property Agency Co.,Ltd.

TR01 Transfer of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150107
