CN109614536A - YouTube-based video batch crawling method, system, device and storage medium - Google Patents
- Publication number
- CN109614536A (CN201811458982.3A)
- Authority
- CN
- China
- Prior art keywords
- video
- unit
- youtube
- plug
- batch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Stored Programmes (AREA)
Abstract
The present invention relates to the technical field of data acquisition and provides a YouTube-based video batch crawling method, system, device and storage medium. The method comprises: pasting the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled; and using Python to read the web page link addresses in the list in a loop, calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the video corresponding to each link address. The invention addresses problems such as the low efficiency of current downloading and the inability to download in batches.
Description
Technical field
The present invention relates to the technical field of data acquisition and, more specifically, to a YouTube-based video batch crawling method, system, device and storage medium.
Background technique
YouTube is the world's largest video website. Online tools for downloading YouTube videos, such as Clip Converter, currently exist, but they can only download one video at a time. The Chrome browser can download videos with the Tampermonkey plug-in, but this approach supports video of at most 720p. As for Android software that downloads YouTube videos: because Google forbids users from downloading videos from YouTube, apps that support downloading YouTube videos have been removed from Google Play, so such Android apps can only be installed from the app's official website or from other safe third-party download sites. Downloading in this way is inefficient and cannot be done in batches.
To solve the above problems, this patent provides a YouTube-based video batch crawling method, system, device and storage medium.
Summary of the invention
In view of the above problems, the object of the present invention is to provide a YouTube-based video batch crawling method, system, device and storage medium, so as to solve problems such as the low efficiency of current downloading and the inability to download in batches.
In a first aspect, the present invention provides a YouTube-based video batch crawling method, applied to an electronic device, comprising:
obtaining a list of videos to be crawled, wherein the web page link addresses corresponding to the videos to be crawled are pasted into a spreadsheet to form the list of videos to be crawled;
reading, with Python in a loop, the web page link addresses in the list of videos to be crawled, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled.
In a second aspect, the present invention also provides a YouTube-based video batch crawling system, comprising:
a to-be-crawled video list acquiring unit, for obtaining the list of videos to be crawled, wherein the web page link addresses corresponding to the videos to be crawled are pasted into a spreadsheet to form the list of videos to be crawled;
a video batch download unit, for reading, with Python in a loop, the web page link addresses in the list of videos to be crawled, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled.
In a third aspect, the present invention also provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above YouTube-based video batch crawling method when executing the computer program.
In a fourth aspect, the present invention also provides a computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above YouTube-based video batch crawling method when executed by a processor.
As can be seen from the above technical solutions, the YouTube-based video batch crawling method, system, device and storage medium provided by the present invention batch-download YouTube videos using the youtube-dl plug-in and the ffmpeg plug-in, thereby reducing manual involvement and solving problems such as the low efficiency of current downloading and the inability to download in batches.
To accomplish the foregoing and related ends, one or more aspects of the present invention include the features described in detail below. The following description and the accompanying drawings set forth certain illustrative aspects of the invention in detail. These aspects, however, indicate only some of the various ways in which the principles of the invention may be employed. Moreover, the present invention is intended to include all such aspects and their equivalents.
Detailed description of the invention
Other objects and results of the present invention will become clearer and easier to understand by reference to the following description taken in conjunction with the accompanying drawings, and as the invention becomes more fully understood. In the drawings:
Fig. 1 is a flow chart of the YouTube-based video batch crawling method according to an embodiment of the present invention;
Fig. 2 is a flow diagram of the method for batch-downloading videos using the youtube-dl plug-in and the ffmpeg plug-in according to an embodiment of the present invention;
Fig. 3 is a flow diagram of obtaining the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in according to an embodiment of the present invention;
Fig. 4 is a logical structure block diagram of the YouTube-based video batch crawling system according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the logical structure of the electronic device according to an embodiment of the present invention.
The same reference numerals indicate similar or corresponding features or functions throughout the drawings.
Specific embodiment
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It will be evident, however, that these embodiments may also be practiced without these specific details.
Various exemplary embodiments of the present invention will now be described in detail with reference to the drawings. It should be noted that, unless otherwise specifically stated, the relative arrangement of the components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the invention.
At the same time, it should be understood that, for ease of description, the sizes of the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application or its use.
Techniques, methods and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but, where appropriate, should be regarded as part of the specification.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
The embodiments of the present invention can be applied to electronic devices such as computer systems/servers, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments and/or configurations suitable for use with electronic devices such as computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
Electronic devices such as computer systems/servers may be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, object programs, components, logic, data structures and so on, which perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media that include storage devices.
Hereinafter, specific embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Embodiment 1
To illustrate the YouTube-based video batch crawling method provided by the present invention, Fig. 1 shows the flow of the method according to an embodiment of the present invention.
As shown in Fig. 1, the YouTube-based video batch crawling method provided by the present invention includes:
S110: pasting the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form the list of videos to be crawled.
In step S110, the list of videos provides the corresponding web page link addresses in the form of a spreadsheet. To download certain YouTube videos, the web page links of all videos that need to be crawled are pasted into the spreadsheet, as shown in Table 1 below.
Table 1 contains the URLs of the videos to be crawled; pasting every URL that needs to be downloaded into the spreadsheet forms the list of videos to be crawled.
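As a minimal sketch of how the list formed in step S110 might be read, the following assumes the spreadsheet has been exported as CSV with one link per row in the first column. The function name `read_video_urls` and the sample links (with placeholder video IDs) are hypothetical and do not come from the patent.

```python
import csv
import io


def read_video_urls(csv_file):
    """Return the non-empty links found in the first column of a CSV stream."""
    urls = []
    for row in csv.reader(csv_file):
        # Skip blank rows and rows whose first cell is not a link.
        if row and row[0].strip().startswith("http"):
            urls.append(row[0].strip())
    return urls


# Hypothetical list contents; the video IDs are placeholders.
sample = io.StringIO(
    "https://www.youtube.com/watch?v=VIDEO_ID_1\n"
    "\n"
    "https://www.youtube.com/watch?v=VIDEO_ID_2\n"
)
video_urls = read_video_urls(sample)
```

In practice the same function could be given an open file handle for the exported spreadsheet instead of the in-memory sample.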
S120: using Python to read the web page link addresses in the list of videos to be crawled in a loop, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos corresponding to the web page link addresses.
For step S120, Fig. 2 shows the steps of calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the to-be-crawled videos corresponding to the web page link addresses according to an embodiment of the present invention. As shown in Fig. 2:
S121: obtaining the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in;
S122: downloading the to-be-crawled video with the best resolution using the youtube-dl plug-in and the ffmpeg plug-in.
In step S121, the video format to download can be chosen from the formats that YouTube provides; in the embodiment of the present invention, the video with the highest resolution is selected for download. The video formats available at a video URL are obtained through the youtube-dl plug-in. A single video is available as files in several formats and several resolutions. The formats of a video include m4a, webm, mp4, 3gp and so on, and the resolution of a video can be judged from the storage size of its file. The file sizes differ, for example 4.96 MB, 11.26 MB, 10.5 MB and 43.83 MB. For the same content, the larger the file, the higher the resolution of the video file. That is, among the above files, the 43.83 MB video has the highest (best) resolution, so the 43.83 MB video is the one that needs to be downloaded.
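The selection rule described above (largest file means best resolution for the same content) can be sketched as a simple maximum over the candidate files. The sizes are the ones quoted in the text; the `name` labels and the function name `pick_largest` are hypothetical, since the text does not say which size belongs to which format.

```python
def pick_largest(files):
    """Return the candidate whose file size is largest."""
    return max(files, key=lambda f: f["size_mb"])


# Sizes taken from the description; the labels are placeholders.
files = [
    {"name": "candidate_1", "size_mb": 4.96},
    {"name": "candidate_2", "size_mb": 11.26},
    {"name": "candidate_3", "size_mb": 10.5},
    {"name": "candidate_4", "size_mb": 43.83},
]
best = pick_largest(files)
```

Note that file size is only a proxy for resolution: it also depends on codec and bitrate, so the rule is reliable mainly when comparing files of the same format.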
In the embodiment of the present invention, the video format and the storage size of the video file are selected by Python. Fig. 3 shows the process of obtaining the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in according to an embodiment of the present invention. As shown in Fig. 3, the detailed process is as follows:
S1211: using Python to invoke the youtube-dl plug-in to parse the video information of the source;
S1212: using the youtube-dl plug-in to parse the video formats of the to-be-crawled video read by Python and the video file corresponding to each format;
S1213: using Python to select the required video format and, within it, the video whose file occupies the most storage, thereby obtaining the video with the best resolution.
In the embodiment of the invention, the mp4 format is used, so the mp4 video file with the largest storage size is selected. It should be noted that, in practical applications, a format other than mp4 (m4a, webm, 3gp and so on) can be set as needed, and the video with the largest file size in that format is then selected; this video is the one with the best resolution.
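Steps S1211 to S1213 could be sketched as follows, assuming youtube-dl is asked to print a video's metadata with its `--dump-json` option and that Python then filters the `formats` entries by extension and file size. The function names and the sample JSON (format IDs and sizes) are hypothetical; actually running the command requires youtube-dl to be installed.

```python
import json


def metadata_command(url):
    """Command line that asks youtube-dl to print one video's metadata as JSON."""
    return ["youtube-dl", "--dump-json", "--no-playlist", url]


def largest_format(info_json, ext="mp4"):
    """Pick the format entry with the given extension and the largest file size."""
    formats = json.loads(info_json).get("formats", [])
    matching = [f for f in formats if f.get("ext") == ext]
    if not matching:
        return None
    # "filesize" can be missing or null in the metadata, so treat it as 0.
    return max(matching, key=lambda f: f.get("filesize") or 0)


# Hypothetical metadata standing in for real youtube-dl output.
sample_json = json.dumps({"formats": [
    {"format_id": "a", "ext": "m4a", "filesize": 4960000},
    {"format_id": "b", "ext": "mp4", "filesize": 10500000},
    {"format_id": "c", "ext": "mp4", "filesize": 43830000},
    {"format_id": "d", "ext": "webm", "filesize": None},
]})
chosen = largest_format(sample_json)
```

Passing a different `ext` value corresponds to choosing one of the other formats mentioned in the text.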
Python is free software; its source code and the CPython interpreter are released under the Python Software Foundation License, which is compatible with the GPL (GNU General Public License). Python's syntax is simple and clear, and one of its characteristics is that whitespace is mandatory for statement indentation. In the embodiment of the present invention, Python is used to read the list of videos and to invoke the youtube-dl plug-in and the ffmpeg plug-in. The youtube-dl plug-in is a simple command-line download tool that supports downloading from hundreds of video sites worldwide, including the major Chinese video sites.
In step S122, the youtube-dl plug-in and the ffmpeg plug-in are used together to download the to-be-crawled video with the best resolution.
Specifically, downloading the to-be-crawled video with the best resolution using the youtube-dl plug-in together with the ffmpeg plug-in comprises the following steps:
Step 1: downloading the obtained to-be-crawled video with the best resolution using the youtube-dl plug-in;
Step 2: merging the downloaded video using the ffmpeg plug-in, so as to complete the batch download of videos.
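The merging in Step 2 can be illustrated with a plain ffmpeg invocation that combines a separately downloaded picture stream and audio stream without re-encoding. This is a sketch under the assumption that the two streams have been saved as separate files; the function name and file names are hypothetical.

```python
def ffmpeg_merge_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that muxes one video file and one audio file."""
    return [
        "ffmpeg",
        "-i", video_path,   # first input: picture-only stream
        "-i", audio_path,   # second input: audio-only stream
        "-c", "copy",       # copy both streams without re-encoding
        output_path,
    ]


merge_cmd = ffmpeg_merge_command("picture.mp4", "sound.m4a", "merged.mp4")
```

When youtube-dl is given a combined format selection such as a video format plus an audio format, it performs this merge itself by calling ffmpeg, so the explicit command is mainly useful for understanding what happens underneath.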
In the embodiment of the present invention, the video is downloaded with the youtube-dl plug-in and merged with the ffmpeg plug-in. FFmpeg is an open-source suite of computer programs that can record and convert digital audio and video and turn them into streams. It provides a complete solution for recording, converting and streaming audio and video, and it includes the advanced audio/video codec library libavcodec.
In the embodiment of the present invention, the video downloaded with the youtube-dl plug-in consists of audio and video pictures, and the two are separate; the ffmpeg plug-in is used to merge the audio and the video pictures into a video in which they are synchronized. Specifically, because the video pictures and the audio of YouTube videos at 1080p and higher resolutions are stored separately, the ffmpeg plug-in is also needed to merge them, which is why the embodiment of the present invention uses both the youtube-dl plug-in and the ffmpeg plug-in for downloading. To make the two plug-ins work together, the system environment variable is changed to include the ffmpeg plug-in, so that the two plug-ins are linked and the required video is obtained when a YouTube video is downloaded. The specific method is as follows:
a) First download the ffmpeg plug-in;
b) Extract the archive after downloading; it contains a number of files, for example under D drive > software > popular software > Python > FFmpeg > ffmpeg .... Rename this folder to "ffmpeg" and move it to the root directory of drive C;
c) Open System Properties > Advanced system settings > Environment Variables;
d) Find Path under Environment Variables > System variables, click Edit > New, and paste in the path of the bin directory of the folder just moved (C:\ffmpeg\bin);
e) Press Win+R, enter cmd, press Enter, and enter the following command: ffmpeg -version
Once the above command succeeds, the ffmpeg plug-in can be run from the command prompt in any folder.
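Whether the environment variable change above has taken effect can also be checked from Python before starting a batch run. The following sketch uses the standard-library `shutil.which`, which searches the directories on PATH; the function name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(names=("youtube-dl", "ffmpeg")):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


not_found = missing_tools()
if not_found:
    print("Not on PATH:", ", ".join(not_found))
```

An empty result means both executables can be started from any folder, which is what step e) verifies by hand.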
In the embodiment of the present invention, once Python has identified the best-resolution video for a given piece of content, Python calls the youtube-dl plug-in and the ffmpeg plug-in to download the selected video. While the youtube-dl plug-in downloads the video, the ffmpeg plug-in converts the downloaded video into the video format the user wants.
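A download command of the kind described above might look like the following sketch. It relies on youtube-dl's own `bestvideo+bestaudio/best` format selection, which ranks formats by youtube-dl's quality ordering rather than by the file-size rule used in the patent, so it is a stand-in rather than the patent's exact procedure. The function name, the default output directory and the output template are assumptions.

```python
def download_command(url, output_dir=".", container="mp4"):
    """Build a youtube-dl command that downloads and merges the best streams."""
    return [
        "youtube-dl",
        "-f", "bestvideo+bestaudio/best",        # separate streams, else best single file
        "--merge-output-format", container,      # container used when ffmpeg merges
        "-o", f"{output_dir}/%(title)s.%(ext)s", # name files after the video title
        url,
    ]


cmd = download_command("https://www.youtube.com/watch?v=VIDEO_ID")
```

The merge step only works if ffmpeg can be found by youtube-dl, which is the purpose of the PATH configuration described earlier.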
In a specific embodiment of the present invention, batch downloading is achieved by using Python to read the URLs in the spreadsheet in a loop. First, Python reads one URL from the spreadsheet and calls the youtube-dl plug-in and the ffmpeg plug-in to download, by the above method, the best-resolution video corresponding to the URL just read. Then Python reads the second URL, and the youtube-dl plug-in and the ffmpeg plug-in download the best-resolution video corresponding to it. The loop continues until the last URL has been downloaded, which completes the batch download.
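The loop described above can be sketched as a function that runs one external command per URL and records the URLs that failed. The names `batch_download`, `build_command` and `run` are hypothetical; `run` defaults to the standard-library `subprocess.run` and is a parameter only so the loop can be exercised without network access.

```python
import subprocess


def batch_download(urls, build_command, run=subprocess.run):
    """Run one download command per URL and return the URLs that failed."""
    failed = []
    for url in urls:
        # build_command turns a URL into an argument list, e.g. for youtube-dl.
        result = run(build_command(url))
        if result.returncode != 0:
            failed.append(url)
    return failed
```

A call such as `batch_download(video_urls, lambda u: ["youtube-dl", u])` would process the list one link at a time; continuing after a failed download, rather than stopping, is a design choice of this sketch and is not stated in the patent.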
In the YouTube-based video batch crawling method provided by the embodiment of the present invention, the web page link addresses corresponding to the videos to be crawled are pasted into a spreadsheet to form the list of videos to be crawled; Python reads the web page link addresses in the list in a loop and calls the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled. By batch-downloading YouTube videos with the youtube-dl plug-in and the ffmpeg plug-in, this embodiment reduces manual involvement and solves problems such as the low efficiency of current downloading and the inability to download in batches.
It should be understood that the sequence numbers of the steps in the above embodiment do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
Embodiment 2
Corresponding to the above method, the present invention also provides a YouTube-based video batch crawling system; Fig. 4 shows the logical structure of the YouTube-based video batch crawling system according to an embodiment of the present invention.
As shown in Fig. 4, the present invention provides a YouTube-based video batch crawling system 400, comprising a to-be-crawled video list acquiring unit 410 and a video batch download unit 420, whose functions correspond one-to-one to the steps of the YouTube-based video batch crawling method in Embodiment 1; to avoid repetition, they are not described in detail one by one in this embodiment.
The to-be-crawled video list acquiring unit 410 is for pasting the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form the list of videos to be crawled;
The video batch download unit 420 is for reading, with Python in a loop, the web page link addresses in the list of videos to be crawled, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled.
Preferably, the video batch download unit 420 includes a best-resolution video obtaining module 421 and a to-be-crawled video download module 422.
The best-resolution video obtaining module 421 is for obtaining the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in;
The to-be-crawled video download module 422 is for downloading the to-be-crawled video with the best resolution using the youtube-dl plug-in and the ffmpeg plug-in.
Preferably, the best-resolution video obtaining module 421 includes a plug-in invoking module 4211, a video information parsing module 4212 and a video file selecting module 4213.
The plug-in invoking module 4211 is for using Python to invoke the youtube-dl plug-in to parse the video information of the source;
The video information parsing module 4212 is for using the youtube-dl plug-in to parse the video formats of the to-be-crawled video read by Python and the video file corresponding to each format;
The video file selecting module 4213 is for using Python to select the required video format and, within it, the video whose file occupies the most storage, thereby obtaining the video with the best resolution.
Preferably, the to-be-crawled video download module 422 includes a video download module 4221 and a video merging module 4222.
The video download module 4221 is for downloading the obtained best-resolution video using the youtube-dl plug-in;
The video merging module 4222 is for merging the downloaded video using the ffmpeg plug-in, so as to complete the batch download of videos.
Preferably, the video downloaded with the youtube-dl plug-in consists of audio and video pictures, and the audio and the video pictures are separate;
The ffmpeg plug-in is used to merge the audio and the video pictures into a video in which the audio and the video pictures are synchronized.
In the YouTube-based video batch crawling system provided by the embodiment of the present invention, the to-be-crawled video list acquiring unit 410 pastes the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form the list of videos to be crawled, and the video batch download unit 420 reads the web page link addresses in the list with Python in a loop and calls the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled. By batch-downloading YouTube videos with the youtube-dl plug-in and the ffmpeg plug-in, the system reduces manual involvement and solves problems such as the low efficiency of current downloading and the inability to download in batches.
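One way to picture the division into units 410 and 420 is as two small Python classes, one that produces the list of links and one that downloads each entry. This is only an illustrative sketch: the class names, method names and the injected `download_one` callable are hypothetical and are not defined in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class VideoListAcquiringUnit:
    """Counterpart of unit 410: turns spreadsheet rows into a list of links."""
    rows: List[str]

    def acquire(self) -> List[str]:
        return [row.strip() for row in self.rows if row.strip()]


@dataclass
class VideoBatchDownloadUnit:
    """Counterpart of unit 420: downloads every link in the list in turn."""
    download_one: Callable[[str], bool]  # returns True when a download succeeds

    def download_all(self, urls: List[str]) -> Dict[str, bool]:
        return {url: self.download_one(url) for url in urls}
```

Injecting `download_one` keeps the structure independent of how a single video is fetched, which mirrors the way module 422 is described separately from the list handling.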
Embodiment 3
Fig. 5 is a schematic diagram of the logical structure of an electronic device provided by an embodiment of the present invention. As shown in Fig. 5, the electronic device 50 of this embodiment includes a processor 51, a memory 52, and a computer program 53 stored in the memory 52 and executable on the processor 51. When executing the computer program 53, the processor 51 implements the steps of the YouTube-based video batch crawling method in Embodiment 1, for example steps S110 to S120 shown in Fig. 1. Alternatively, when executing the computer program 53, the processor 51 implements the functions of the modules/units of the YouTube-based video batch crawling system in the above device embodiment, for example the to-be-crawled video list acquiring unit 410 and the video batch download unit 420 shown in Fig. 4.
Illustratively, the computer program 53 can be divided into one or more modules/units, which are stored in the memory 52 and executed by the processor 51 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program 53 in the electronic device 50. For example, the computer program 53 can be divided into the units of Embodiment 2, namely the to-be-crawled video list acquiring unit 410 and the video batch download unit 420; their functions are described in detail in Embodiment 2 and are not repeated here.
The electronic device 50 may be a computing device such as a desktop computer, a notebook, a palmtop computer or a cloud server. The electronic device 50 may include, but is not limited to, the processor 51 and the memory 52. Those skilled in the art will understand that Fig. 5 is only an example of the electronic device 50 and does not limit it; the device may include more or fewer components than shown, may combine certain components, or may use different components. For example, the electronic device may also include input/output devices, network access devices, buses and so on.
The processor 51 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component and so on. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 52 may be an internal storage unit of the electronic device 50, such as a hard disk or internal memory of the electronic device 50. The memory 52 may also be an external storage device of the electronic device 50, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the electronic device 50. Further, the memory 52 may include both an internal storage unit of the electronic device 50 and an external storage device. The memory 52 is used to store the computer program and the other programs and data required by the electronic device. The memory 52 can also be used to temporarily store data that has been output or is about to be output.
Embodiment 4
This embodiment provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the YouTube-based video batch crawling method in Embodiment 1; to avoid repetition, the details are not described again here. Alternatively, when executed by a processor, the computer program implements the functions of the modules/units of the YouTube-based video batch crawling system in Embodiment 2; to avoid repetition, the details are not described again here.
It will be clear to those skilled in the art that, for convenience and brevity of description, the division into the functional units and modules described above is only an example. In practical applications, the above functions can be assigned to different functional units and modules as needed; that is, the internal structure of the device can be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing them from one another and are not intended to limit the scope of protection of this application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiment, which is not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis; for parts that are not described or recorded in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes of the above embodiment methods may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, can implement the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium and so on. It should be noted that the content covered by the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, computer-readable media do not include electric carrier signals and telecommunication signals.
The embodiments described above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.
Claims (10)
1. A YouTube-based video batch crawling method, applied to an electronic device, characterized by comprising:
pasting the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled;
using Python to read, in a loop, the web page link addresses in the list of videos to be crawled, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled corresponding to the web page link addresses.
2. The YouTube-based video batch crawling method according to claim 1, characterized in that the step of calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled comprises:
obtaining, by means of Python and the youtube-dl plug-in, the optimal-resolution video to be crawled;
downloading the optimal-resolution video to be crawled using the youtube-dl plug-in and the ffmpeg plug-in.
3. The YouTube-based video batch crawling method according to claim 2, characterized in that the step of obtaining, by means of Python and the youtube-dl plug-in, the optimal-resolution video to be crawled comprises:
using Python to invoke the youtube-dl plug-in to parse the video information of the source;
using the youtube-dl plug-in to parse the video formats of the video to be crawled that Python has read, and the video corresponding to each video format;
using Python to select the video that has the required video format and the largest file storage size, so as to obtain the optimal-resolution video.
4. The YouTube-based video batch crawling method according to claim 2, characterized in that the step of downloading the optimal-resolution video to be crawled using the youtube-dl plug-in and the ffmpeg plug-in comprises:
downloading the obtained optimal-resolution video to be crawled using the youtube-dl plug-in;
synthesizing the downloaded video using the ffmpeg plug-in, to complete the batch download of the videos.
5. The YouTube-based video batch crawling method according to claim 4, characterized in that the crawled video downloaded using the youtube-dl plug-in comprises audio and video pictures, the audio and the video pictures being separate from each other;
the audio and the video pictures are synthesized using the ffmpeg plug-in to form a video in which the audio and the video pictures are synchronized.
6. A YouTube-based video batch crawling system, characterized by comprising:
a to-be-crawled video list acquisition unit, configured to paste the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled;
a video batch download unit, configured to use Python to read, in a loop, the web page link addresses in the list of videos to be crawled, and to call the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled corresponding to the web page link addresses.
7. The YouTube-based video batch crawling system according to claim 6, characterized in that the video batch download unit comprises:
an optimal-resolution video acquisition module, configured to obtain, by means of Python and the youtube-dl plug-in, the optimal-resolution video to be crawled;
a to-be-crawled video download module, configured to download the optimal-resolution video to be crawled using the youtube-dl plug-in and the ffmpeg plug-in.
8. The YouTube-based video batch crawling system according to claim 7, characterized in that the optimal-resolution video acquisition module comprises:
a plug-in invocation module, configured to use Python to invoke the youtube-dl plug-in to parse the video information of the source;
a video information parsing module, configured to use the youtube-dl plug-in to parse the video formats of the video to be crawled that Python has read, and the video corresponding to each video format;
a video file selection module, configured to use Python to select the video that has the required video format and the largest file storage size, so as to obtain the optimal-resolution video.
9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the YouTube-based video batch crawling method according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the YouTube-based video batch crawling method according to any one of claims 1 to 5.
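As a rough illustration of the steps recited in claims 1 to 5, the following Python sketch reads a link list, picks the largest separate video and audio streams, downloads them with youtube-dl, and merges them with ffmpeg. It is not the applicant's implementation. It assumes that the spreadsheet has been exported to CSV with one link in the first column, that each link points to a single video, that the `youtube-dl` and `ffmpeg` executables are on the PATH, and that `mp4` video and `m4a` audio are the required formats. All function names are hypothetical.

```python
import csv
import json
import subprocess
from pathlib import Path


def read_links(csv_path):
    """Return the non-empty first-column cells of a CSV export of the link list."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row[0].strip() for row in csv.reader(handle) if row and row[0].strip()]


def pick_largest(formats, ext, kind):
    """Pick the format with the given extension and the largest reported file size.

    kind is "video" (video-only stream) or "audio" (audio-only stream).
    Formats without a reported filesize are treated as size 0.
    """
    def matches(fmt):
        has_video = (fmt.get("vcodec") or "none") != "none"
        has_audio = (fmt.get("acodec") or "none") != "none"
        if kind == "video":
            wanted = has_video and not has_audio
        else:
            wanted = has_audio and not has_video
        return wanted and fmt.get("ext") == ext

    candidates = [fmt for fmt in formats if matches(fmt)]
    if not candidates:
        return None
    return max(candidates, key=lambda fmt: fmt.get("filesize") or 0)


def ffmpeg_merge_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that muxes separate video and audio without re-encoding."""
    return ["ffmpeg", "-y", "-i", str(video_path), "-i", str(audio_path),
            "-c", "copy", str(output_path)]


def crawl_batch(csv_path, out_dir):
    """Download every link in the list; needs youtube-dl and ffmpeg installed."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for url in read_links(csv_path):
        # "youtube-dl -J" prints the video metadata, including its formats, as JSON.
        result = subprocess.run(["youtube-dl", "-J", url],
                                check=True, capture_output=True, text=True)
        info = json.loads(result.stdout)
        video = pick_largest(info["formats"], "mp4", "video")
        audio = pick_largest(info["formats"], "m4a", "audio")
        if video is None or audio is None:
            continue  # no separate streams in the assumed formats; skip this link
        video_file = out_dir / f"{info['id']}.video.mp4"
        audio_file = out_dir / f"{info['id']}.audio.m4a"
        for fmt, target in ((video, video_file), (audio, audio_file)):
            subprocess.run(["youtube-dl", "-f", fmt["format_id"], "-o", str(target), url],
                           check=True)
        merged = out_dir / f"{info['id']}.mp4"
        subprocess.run(ffmpeg_merge_command(video_file, audio_file, merged), check=True)
```

Calling `crawl_batch("links.csv", "downloads")` would process the whole list; both arguments are placeholder paths. Selecting by file size follows the wording of claim 3, although file size is only a proxy for resolution.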
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811458982.3A CN109614536A (en) | 2018-11-30 | 2018-11-30 | Video batch crawling method, system, device based on YouTuBe and can storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109614536A (en) | 2019-04-12 |
Family
ID=66005226
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811458982.3A Pending CN109614536A (en) | 2018-11-30 | 2018-11-30 | Video batch crawling method, system, device based on YouTuBe and can storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109614536A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108415941A (en) * | 2018-01-29 | 2018-08-17 | 湖北省楚天云有限公司 | A kind of spiders method, apparatus and electronic equipment |
CN108536691A (en) * | 2017-03-01 | 2018-09-14 | 中兴通讯股份有限公司 | Web page crawl method and apparatus |
Non-Patent Citations (2)
Title |
---|
灰灰: "youtube-dl配合ffmpeg的简单使用方法", 《HTTPS://ZHUANLAN.ZHIHU.COM/P/23032097知乎》 * |
萌鼠喝酸奶: "python——批下载Youtube上的1080p及以上清晰度视频(python+youtube-dl+ffmpeg)", 《HTTPS://ITPCB.COM/A/275345算法网》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110365776A (en) * | 2019-07-17 | 2019-10-22 | 京东方科技集团股份有限公司 | Picture batch method for down loading, device, electronic equipment and storage medium |
CN110365776B (en) * | 2019-07-17 | 2021-05-04 | 京东方科技集团股份有限公司 | Picture batch downloading method and device, electronic equipment and storage medium |
CN112019917A (en) * | 2020-07-28 | 2020-12-01 | 厦门快商通科技股份有限公司 | Audio data extraction method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||