CN109614536A - YouTube-based video batch crawling method, system, device and storage medium - Google Patents



Publication number
CN109614536A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811458982.3A
Other languages
Chinese (zh)
Inventor
马建强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201811458982.3A
Publication of CN109614536A


Abstract

The present invention relates to the technical field of data acquisition and provides a YouTube-based video batch crawling method, system, device and storage medium. The method includes: pasting the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled; and using Python to loop over the web page link addresses in that list, calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the video corresponding to each link address. The invention addresses problems such as the low efficiency of current downloading and the inability to download in batches.

Description

YouTube-based video batch crawling method, system, device and storage medium
Technical field
The present invention relates to the technical field of data acquisition and, more specifically, to a YouTube-based video batch crawling method, system, device and storage medium.
Background
YouTube is the world's largest video website. Current options include online download services such as Clip Converter and online YouTube video downloaders, which can only download one video at a time. The Chrome browser can download videos through the Tampermonkey plug-in, but that software supports at most 720p video. As for downloading YouTube videos with Android software: because Google prohibits users from downloading videos from YouTube, apps that support downloading YouTube videos have all been removed from Google Play, so an Android app can only be installed from the app's official website or from another trusted third-party download site. Downloading in this way is inefficient and cannot be done in batches.
To solve the above problems, this patent provides a YouTube-based video batch crawling method, system, device and storage medium.
Summary of the invention
In view of the above problems, the object of the present invention is to provide a YouTube-based video batch crawling method, system, device and storage medium, so as to solve problems such as the low efficiency of current downloading and the inability to download in batches.
In a first aspect, the present invention provides a YouTube-based video batch crawling method, applied to an electronic device, comprising:
Obtaining a list of videos to be crawled, wherein the web page link addresses corresponding to the videos to be crawled are pasted into a spreadsheet to form the list of videos to be crawled;
Using Python to loop over the web page link addresses in the list of videos to be crawled, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled.
In a second aspect, the present invention also provides a YouTube-based video batch crawling system, comprising:
A to-be-crawled video list acquisition unit, configured to obtain a list of videos to be crawled, wherein the web page link addresses corresponding to the videos to be crawled are pasted into a spreadsheet to form the list of videos to be crawled;
A video batch download unit, configured to use Python to loop over the web page link addresses in the list of videos to be crawled, and to call the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled.
In a third aspect, the present invention also provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above YouTube-based video batch crawling method when executing the computer program.
In a fourth aspect, the present invention also provides a computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above YouTube-based video batch crawling method when executed by a processor.
As can be seen from the above technical solutions, the YouTube-based video batch crawling method, system, device and storage medium provided by the present invention batch-download YouTube videos using the youtube-dl plug-in and the ffmpeg plug-in, thereby reducing manual involvement and solving problems such as the low efficiency of current downloading and the inability to download in batches.
To achieve the foregoing and related ends, one or more aspects of the present invention include the features described in detail below. The following description and the accompanying drawings set forth certain illustrative aspects of the invention in detail. These aspects, however, indicate only some of the various ways in which the principles of the invention may be employed. Furthermore, the invention is intended to include all such aspects and their equivalents.
Brief description of the drawings
Other objects and results of the present invention will become clearer and easier to understand by reference to the following description taken in conjunction with the accompanying drawings, and as the invention becomes more fully understood. In the drawings:
Fig. 1 is a flowchart of the YouTube-based video batch crawling method according to an embodiment of the present invention;
Fig. 2 is a flow diagram of the method for batch-downloading videos using the youtube-dl plug-in and the ffmpeg plug-in according to an embodiment of the present invention;
Fig. 3 is a flow diagram of obtaining the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in according to an embodiment of the present invention;
Fig. 4 is a block diagram of the logical structure of the YouTube-based video batch crawling system according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the logical structure of the electronic device according to an embodiment of the present invention.
The same reference numerals indicate similar or corresponding features or functions throughout the drawings.
Detailed description
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It will be evident, however, that these embodiments may also be practiced without these specific details.
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specifically stated, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the invention.
At the same time, it should be understood that, for ease of description, the dimensions of the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application or its uses.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail but, where appropriate, should be regarded as part of the specification.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item has been defined in one drawing, it need not be discussed further in subsequent drawings.
Embodiments of the present invention can be applied to electronic devices such as computer systems/servers, which can operate together with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments and/or configurations suitable for use with electronic devices such as computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and so on.
Electronic devices such as computer systems/servers may be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, object programs, components, logic, data structures and so on, which perform particular tasks or implement particular abstract data types. Electronic devices such as computer systems/servers may be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media that include storage devices.
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Embodiment 1
To illustrate the YouTube-based video batch crawling method provided by the present invention, Fig. 1 shows the flow of the method according to an embodiment of the present invention.
As shown in Fig. 1, the YouTube-based video batch crawling method provided by the present invention includes:
S110: pasting the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled;
In step S110, the video list provides the corresponding web page link addresses in the form of a spreadsheet. To download certain YouTube videos, the web page links of all the videos that need to be crawled are pasted into the spreadsheet, as shown in Table 1 below:
The content of Table 1 is the video URLs that need to be crawled; all the video URLs that need to be downloaded are pasted into the spreadsheet to form the list of videos to be crawled.
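Reading the list back in Python might look like the following minimal sketch. It assumes the spreadsheet has been saved as CSV with one link address per row in the first column; the patent does not specify a file format, and the function name `read_video_urls` and the example.com addresses are hypothetical placeholders.

```python
import csv
import io


def read_video_urls(csv_file):
    """Return the link addresses found in the first column of a CSV sheet."""
    urls = []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank rows
        cell = row[0].strip()
        # Keep only cells that look like web links; this also skips a header row.
        if cell.startswith(("http://", "https://")):
            urls.append(cell)
    return urls


# Hypothetical sheet contents; the addresses are placeholders, not real videos.
sample_sheet = io.StringIO(
    "video url\n"
    "https://example.com/watch?v=aaa\n"
    "\n"
    "https://example.com/watch?v=bbb\n"
)
video_urls = read_video_urls(sample_sheet)
print(video_urls)
```

With a file on disk, the same function could be called on `open(path, newline="")` instead of the in-memory sample.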
S120: using Python to loop over the web page link addresses in the list of videos to be crawled, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos corresponding to the web page link addresses.
For step S120, Fig. 2 shows the steps of calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the to-be-crawled videos corresponding to the web page link addresses according to an embodiment of the present invention. As shown in Fig. 2:
S121: obtaining the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in;
S122: downloading the to-be-crawled video with the best resolution using the youtube-dl plug-in and the ffmpeg plug-in.
In step S121, the video can be downloaded in any of the formats that YouTube offers; in this embodiment of the invention, the video with the highest resolution is selected for download. The youtube-dl plug-in obtains the video formats available at the video URL; a single video may be available as files in several formats and several resolutions. The formats of a video include m4a, webm, mp4, 3gp and so on. The resolution of a video can be judged from the size of its file. The file sizes differ, for example 4.96 MB, 11.26 MB, 10.5 MB and 43.83 MB, and for the same content, the larger the file, the higher the resolution of the video file. That is, among the files above, the 43.83 MB file has the highest (best) resolution, so the 43.83 MB file is the video that needs to be downloaded.
In an embodiment of the present invention, Python selects the video format and the file size of the video file. Fig. 3 shows the process of obtaining the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in according to an embodiment of the present invention. As shown in Fig. 3, the specific process is as follows:
S1211: using Python to call the youtube-dl plug-in to parse the video information of the source;
S1212: using the youtube-dl plug-in to parse the video formats of the to-be-crawled video read by Python, together with the video file corresponding to each format;
S1213: using Python to choose the video in the required format whose file occupies the most storage, thereby obtaining the video with the best resolution.
In this embodiment of the invention, the mp4 format is used, so the mp4 video file with the largest file size is selected. Note that in practical applications a format other than mp4 (m4a, webm, 3gp and so on) can be set as needed, and the video in that format whose file occupies the most storage is selected; that video is the one with the best resolution.
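The selection rule of step S1213 can be sketched as a small pure function. This is only an illustration under stated assumptions: each format is represented as a dictionary with `ext` and `filesize` (bytes) keys, modelled loosely on the per-format entries youtube-dl reports, and the pairing of formats with the file sizes quoted above is hypothetical, as are the `format_id` labels.

```python
def pick_largest_format(formats, wanted_ext="mp4"):
    """Return the entry with the wanted extension and the largest file size."""
    candidates = [
        f for f in formats
        if f.get("ext") == wanted_ext and f.get("filesize") is not None
    ]
    if not candidates:
        return None  # nothing available in the requested format
    return max(candidates, key=lambda f: f["filesize"])


MB = 1024 * 1024
# Hypothetical format list; sizes reuse the figures from the description.
formats = [
    {"format_id": "a", "ext": "m4a", "filesize": int(4.96 * MB)},
    {"format_id": "b", "ext": "webm", "filesize": int(11.26 * MB)},
    {"format_id": "c", "ext": "mp4", "filesize": int(10.5 * MB)},
    {"format_id": "d", "ext": "mp4", "filesize": int(43.83 * MB)},
]
best = pick_largest_format(formats)
print(best["format_id"])
```

Passing a different `wanted_ext` corresponds to the option, mentioned above, of setting a format other than mp4.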
Python is free software; its source code and its CPython interpreter are released under the Python Software Foundation License, which is compatible with the GPL (GNU General Public License). Python's syntax is simple and clear, and one of its characteristics is the mandatory use of whitespace for statement indentation. In an embodiment of the present invention, Python is used to read the video list and to call the youtube-dl plug-in and the ffmpeg plug-in. The youtube-dl plug-in is a simple command-line download tool that supports downloading from hundreds of video websites worldwide, including the major Chinese video sites.
In step S122, the youtube-dl plug-in and the ffmpeg plug-in are used together to download the to-be-crawled video with the best resolution.
Specifically, downloading the to-be-crawled video with the best resolution using the youtube-dl plug-in together with the ffmpeg plug-in includes the following steps:
Step 1: downloading the obtained to-be-crawled video with the best resolution using the youtube-dl plug-in;
Step 2: synthesizing the downloaded video using the ffmpeg plug-in, to complete the batch download of the videos.
In an embodiment of the present invention, the video is downloaded with the youtube-dl plug-in tool and synthesized with the ffmpeg plug-in. ffmpeg is a set of open-source computer programs that can record and convert digital audio and video and turn them into streams. It provides a complete solution for recording, converting and streaming audio and video, and it includes the advanced audio/video codec library libavcodec.
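The synthesis in Step 2 can be illustrated by building the ffmpeg command that combines a picture-only file and a sound-only file into one output. This is a sketch rather than the patent's own implementation: the file names are hypothetical, and stream copy (`-c copy`) is an assumed choice that avoids re-encoding.

```python
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that combines one video file and one audio file."""
    return [
        "ffmpeg",
        "-i", video_path,  # first input: picture-only file
        "-i", audio_path,  # second input: audio-only file
        "-c", "copy",      # copy both streams without re-encoding
        output_path,
    ]


def merge(video_path, audio_path, output_path):
    """Run the merge; this requires ffmpeg to be reachable on PATH."""
    command = build_merge_command(video_path, audio_path, output_path)
    subprocess.run(command, check=True)


# Hypothetical file names, shown only to display the resulting command line.
command = build_merge_command("picture.mp4", "sound.m4a", "merged.mp4")
print(" ".join(command))
```

Only `build_merge_command` is exercised here; `merge` is left uncalled so the sketch runs on a machine without ffmpeg installed.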
In an embodiment of the present invention, the video crawled with the youtube-dl plug-in consists of audio and video pictures, and the audio is separate from the video pictures; the ffmpeg plug-in merges the audio and the video pictures to form a video whose audio and pictures are synchronized. Specifically, because the video pictures and the audio of YouTube videos at 1080p and higher resolutions are separate, the ffmpeg plug-in is also needed to merge the pictures with the audio, which is why this embodiment uses both the youtube-dl plug-in and the ffmpeg plug-in for downloading. As for how the two plug-ins are combined to download videos: the environment variable for the ffmpeg plug-in is changed so that the two plug-ins are linked together and the required video is obtained when a YouTube video is downloaded. The specific method is as follows:
A) First download the ffmpeg plug-in;
B) Decompress it after downloading; a number of files will appear, located for example at D disk > software > popular software > Python > FFmpeg > ffmpeg .... Rename this folder to "ffmpeg" and move it to the root directory of drive C;
C) Open System Properties > Advanced system settings > Environment Variables;
D) Find Path under Environment Variables > System variables, click Edit > New, and paste in the bin path of the folder from step B (C:\ffmpeg\bin);
E) Press Win+R, type cmd, press Enter, and enter the following command: ffmpeg -version
After the above steps, the ffmpeg plug-in can be run from the command prompt in any folder.
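The check in step E can also be made from Python before a batch run starts. The sketch below uses only the standard library; the helper names `find_tool` and `tool_version_line` are hypothetical.

```python
import shutil
import subprocess


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def tool_version_line(name):
    """Return the first line printed by `<name> -version`, or None if not found."""
    path = find_tool(name)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


# Prints a version line when ffmpeg is on PATH, otherwise None.
print(tool_version_line("ffmpeg"))
```

A `None` result suggests that the Path entry from step D is missing or that the command prompt was opened before the change was saved.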
In an embodiment of the present invention, once Python has obtained the best-resolution video for a given content, Python calls the youtube-dl plug-in and the ffmpeg plug-in to download the selected best-resolution video. While the youtube-dl plug-in downloads the video, the ffmpeg plug-in performs format conversion on the downloaded video, converting it into the video format the user wants.
In a specific embodiment of the present invention, batch download is achieved by having Python loop over the URLs in the spreadsheet. First, Python reads one URL from the spreadsheet and calls the youtube-dl plug-in and the ffmpeg plug-in to download, using the method above, the best-resolution video corresponding to the URL currently read. Then, when Python reads the second URL, the youtube-dl plug-in and the ffmpeg plug-in download the best-resolution video corresponding to the second URL, and so on until the last one has been downloaded, which completes the batch download.
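The loop described above might be sketched as follows, calling the youtube-dl command-line tool once per URL. Note the substitution: instead of the file-size comparison described earlier, this sketch relies on youtube-dl's own format selector to pick the best mp4 picture and m4a sound and lets it merge them (which uses ffmpeg). The function names, the output template and the example.com addresses are assumptions, and a recording stand-in replaces the real runner so the example executes without network access.

```python
import subprocess

# Best mp4 picture plus best m4a sound, falling back to the best single mp4 file.
FORMAT_SELECTOR = "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]"


def build_download_command(url, output_dir="."):
    """Build the youtube-dl command line for one link address."""
    return [
        "youtube-dl",
        "-f", FORMAT_SELECTOR,
        "--merge-output-format", "mp4",  # merged result is written as mp4
        "-o", output_dir + "/%(title)s.%(ext)s",
        url,
    ]


def download_all(urls, run=subprocess.run):
    """Download each URL in turn and return the URLs that failed."""
    failed = []
    for url in urls:
        try:
            run(build_download_command(url), check=True)
        except (subprocess.CalledProcessError, OSError):
            failed.append(url)  # keep going with the remaining URLs
    return failed


recorded = []


def dry_run(command, check=True):
    """Stand-in runner for demonstration: records the command instead of running it."""
    recorded.append(command)


failed = download_all(
    ["https://example.com/watch?v=aaa", "https://example.com/watch?v=bbb"],
    run=dry_run,
)
print(failed, len(recorded))
```

Calling `download_all(urls)` without the `run` argument would execute youtube-dl for real, which requires both youtube-dl and ffmpeg to be installed and on PATH.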
The YouTube-based video batch crawling method provided by this embodiment of the present invention pastes the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled, uses Python to loop over the web page link addresses in that list, and calls the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled. By batch-downloading YouTube videos with the youtube-dl plug-in and the ffmpeg plug-in, this embodiment reduces manual involvement and solves problems such as the low efficiency of current downloading and the inability to download in batches.
It should be understood that the sequence numbers of the steps in the above embodiment do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Embodiment 2
Corresponding to the above method, the present invention also provides a YouTube-based video batch crawling system. Fig. 4 shows the logical structure of the YouTube-based video batch crawling system according to an embodiment of the present invention.
As shown in Fig. 4, the present invention provides a YouTube-based video batch crawling system 400, comprising a to-be-crawled video list acquisition unit 410 and a video batch download unit 420. The functions they implement correspond one-to-one to the steps of the YouTube-based video batch crawling method in Embodiment 1 and, to avoid repetition, are not described in detail again in this embodiment.
The to-be-crawled video list acquisition unit 410 is configured to paste the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form the list of videos to be crawled;
The video batch download unit 420 is configured to use Python to loop over the web page link addresses in the list of videos to be crawled, and to call the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled.
Preferably, the video batch download unit 420 includes a best-resolution video acquisition module 421 and a to-be-crawled video download module 422.
The best-resolution video acquisition module 421 is configured to obtain the to-be-crawled video with the best resolution through Python and the youtube-dl plug-in;
The to-be-crawled video download module 422 is configured to download the to-be-crawled video with the best resolution using the youtube-dl plug-in and the ffmpeg plug-in.
Preferably, the best-resolution video acquisition module 421 includes a plug-in calling module 4211, a video information parsing module 4212 and a video file selection module 4213.
The plug-in calling module 4211 is configured to use Python to call the youtube-dl plug-in to parse the video information of the source;
The video information parsing module 4212 is configured to use the youtube-dl plug-in to parse the video formats of the to-be-crawled video read by Python, together with the video file corresponding to each format;
The video file selection module 4213 is configured to use Python to choose the video in the required format whose file occupies the most storage, thereby obtaining the video with the best resolution.
Preferably, the to-be-crawled video download module 422 includes a video download module 4221 and a video synthesis module 4222.
The video download module 4221 is configured to download the obtained best-resolution video using the youtube-dl plug-in;
The video synthesis module 4222 is configured to synthesize the downloaded video using the ffmpeg plug-in, to complete the batch download of the videos.
Preferably, the video crawled with the youtube-dl plug-in consists of audio and video pictures, and the audio is separate from the video pictures;
The ffmpeg plug-in merges the audio and the video pictures to form a video whose audio and pictures are synchronized.
In the YouTube-based video batch crawling system provided by this embodiment of the present invention, the to-be-crawled video list acquisition unit 410 is configured to paste the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form the list of videos to be crawled, and the video batch download unit 420 is configured to use Python to loop over the web page link addresses in that list and to call the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled. By batch-downloading YouTube videos with the youtube-dl plug-in and the ffmpeg plug-in, the system reduces manual involvement and solves problems such as the low efficiency of current downloading and the inability to download in batches.
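One way to picture the division into units 410 and 420 is the rough sketch below. The class names, the constructor arguments and the stub callables standing in for modules 421 and 422 are all hypothetical; the patent describes the units only at the level of function, not of code.

```python
class VideoListAcquisitionUnit:
    """Stands in for unit 410: holds the link addresses taken from the spreadsheet."""

    def __init__(self, link_addresses):
        self.link_addresses = list(link_addresses)

    def get_list(self):
        # Drop empty cells so the download unit only sees real entries.
        return [u.strip() for u in self.link_addresses if u and u.strip()]


class VideoBatchDownloadUnit:
    """Stands in for unit 420: select the best version (421), then download it (422)."""

    def __init__(self, select_best, download):
        self.select_best = select_best  # hypothetical callable for module 421
        self.download = download        # hypothetical callable for module 422

    def run(self, link_addresses):
        results = []
        for url in link_addresses:
            chosen = self.select_best(url)
            results.append(self.download(url, chosen))
        return results


# Stub callables so the sketch runs without network access.
list_unit = VideoListAcquisitionUnit(
    ["https://example.com/v1", "", "https://example.com/v2"]
)
download_unit = VideoBatchDownloadUnit(
    select_best=lambda url: "mp4",
    download=lambda url, fmt: (url, fmt),
)
print(download_unit.run(list_unit.get_list()))
```

Injecting the two callables keeps the loop logic separate from the tools that do the actual selection and downloading.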
Embodiment 3
Fig. 5 is a schematic diagram of the logical structure of an electronic device provided by an embodiment of the present invention. As shown in Fig. 5, the electronic device 50 of this embodiment includes a processor 51, a memory 52, and a computer program 53 stored in the memory 52 and executable on the processor 51. When executing the computer program 53, the processor 51 implements the steps of the YouTube-based video batch crawling method in Embodiment 1, for example steps S110 to S120 shown in Fig. 1. Alternatively, when executing the YouTube-based video batch crawling system, the processor 51 implements the functions of the modules/units in the above apparatus embodiment, for example the to-be-crawled video list acquisition unit 410 and the video batch download unit 420 shown in Fig. 4.
Illustratively, the computer program 53 may be divided into one or more modules/units, which are stored in the memory 52 and executed by the processor 51 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; the instruction segments describe the execution of the computer program 53 in the electronic device 50. For example, the computer program 53 may be divided into the units of Embodiment 2: the to-be-crawled video list acquisition unit 410 and the video batch download unit 420, whose functions are described in detail in Embodiment 2 and are not repeated here.
The electronic device 50 may be a computing device such as a desktop computer, a notebook, a palmtop computer or a cloud server. The electronic device 50 may include, but is not limited to, the processor 51 and the memory 52. Those skilled in the art will understand that Fig. 5 is only an example of the electronic device 50 and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components. For example, the electronic device may also include input/output devices, network access devices, buses and so on.
The processor 51 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 52 may be an internal storage unit of the electronic device 50, such as a hard disk or memory of the electronic device 50. The memory 52 may also be an external storage device of the electronic device 50, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card fitted to the electronic device 50. Further, the memory 52 may include both an internal storage unit of the electronic device 50 and an external storage device. The memory 52 is used to store the computer program and the other programs and data needed by the electronic device. The memory 52 may also be used to temporarily store data that has been output or is about to be output.
Embodiment 4
The present embodiment provides a computer readable storage medium, computer journey is stored on the computer readable storage medium Sequence realizes the video batch crawling method based on YouTuBe in embodiment 1, to keep away when the computer program is executed by processor Exempt to repeat, which is not described herein again.Alternatively, realizing in embodiment 2 when the computer program is executed by processor based on YouTuBe's Video batch crawls the function of each module/unit in system, and to avoid repeating, which is not described herein again.
It is apparent to those skilled in the art that for convenience of description and succinctly, only with above-mentioned each function Can unit, module division progress for example, in practical application, can according to need and by above-mentioned function distribution by different Functional unit, module are completed, i.e., the internal structure of device are divided into different functional unit or module, to complete above description All or part of function.Each functional unit in embodiment, module can integrate in one processing unit, be also possible to Each unit physically exists alone, and can also be integrated in one unit with two or more units, above-mentioned integrated unit Both it can use formal implementation of hardware, the form that also can use SFU software functional unit is realized.In addition, each functional unit, mould The specific name of block is also only for convenience of distinguishing each other, the protection scope being not intended to limit this application.It is single in above system Member, the specific work process of module, can refer to corresponding processes in the foregoing method embodiment, details are not described herein.
In the above-described embodiments, it all emphasizes particularly on different fields to the description of each embodiment, is not described in detail or remembers in some embodiment The part of load may refer to the associated description of other embodiments.
Those of ordinary skill in the art may be aware that list described in conjunction with the examples disclosed in the embodiments of the present disclosure Member and algorithm steps can be realized with the combination of electronic hardware or computer software and electronic hardware.These functions are actually It is implemented in hardware or software, the specific application and design constraint depending on technical solution.Professional technician Each specific application can be used different methods to achieve the described function, but this realization is it is not considered that exceed The scope of the present invention.
In embodiment provided by the present invention, it should be understood that disclosed device and method can pass through others Mode is realized.For example, the apparatus embodiments described above are merely exemplary, for example, the division of the module or unit, Only a kind of logical function partition, there may be another division manner in actual implementation, such as multiple units or components can be with In conjunction with or be desirably integrated into another system, or some features can be ignored or not executed.Another point, it is shown or discussed Mutual coupling or direct-coupling or communication connection can be through some interfaces, the INDIRECT COUPLING of device or unit or Communication connection can be electrical property, mechanical or other forms.
Unit may or may not be physically separated as illustrated by the separation member, shown as a unit Component may or may not be physical unit, it can and it is in one place, or may be distributed over multiple networks On unit.It can some or all of the units may be selected to achieve the purpose of the solution of this embodiment according to the actual needs.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content included in the computer-readable medium may be appropriately expanded or reduced according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, under legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The embodiments described above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all fall within the protection scope of the present invention.

Claims (10)

1. A YouTube-based video batch crawling method, applied to an electronic device, characterized by comprising:
pasting the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled;
using python to read, in a loop, the web page link addresses in the list of videos to be crawled, and calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled that correspond to the web page link addresses.
2. The YouTube-based video batch crawling method according to claim 1, characterized in that the step of calling the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled comprises:
obtaining, by means of python and the youtube-dl plug-in, the video to be crawled with the best resolution;
downloading the video to be crawled with the best resolution using the youtube-dl plug-in and the ffmpeg plug-in.
3. The YouTube-based video batch crawling method according to claim 2, characterized in that the step of obtaining, by means of python and the youtube-dl plug-in, the video to be crawled with the best resolution comprises:
using python to invoke the youtube-dl plug-in to parse the video information of the source;
using the youtube-dl plug-in to parse the video formats of the video to be crawled that python has read, and the videos corresponding to those video formats;
using python to select the video that has the required video format and the largest file storage footprint, thereby obtaining the video with the best resolution.
4. The YouTube-based video batch crawling method according to claim 2, characterized in that the step of downloading the video to be crawled with the best resolution using the youtube-dl plug-in and the ffmpeg plug-in comprises:
downloading the obtained video to be crawled with the best resolution using the youtube-dl plug-in;
merging the downloaded video using the ffmpeg plug-in, to complete the batch download of the videos.
5. The YouTube-based video batch crawling method according to claim 4, characterized in that the crawled video downloaded with the youtube-dl plug-in comprises audio and video pictures, the audio being separate from the video pictures;
the audio and the video pictures are merged using the ffmpeg plug-in to form a video in which the audio and the video pictures are synchronized.
6. A YouTube-based video batch crawling system, characterized by comprising:
a to-be-crawled video list acquiring unit, configured to paste the web page link addresses corresponding to the videos to be crawled into a spreadsheet to form a list of videos to be crawled;
a video batch download unit, configured to use python to read, in a loop, the web page link addresses in the list of videos to be crawled, and to call the youtube-dl plug-in and the ffmpeg plug-in to complete the batch download of the videos to be crawled that correspond to the web page link addresses.
7. The YouTube-based video batch crawling system according to claim 6, characterized in that the video batch download unit comprises:
a best-resolution video obtaining module, configured to obtain, by means of python and the youtube-dl plug-in, the video to be crawled with the best resolution;
a to-be-crawled video download module, configured to download the video to be crawled with the best resolution using the youtube-dl plug-in and the ffmpeg plug-in.
8. The YouTube-based video batch crawling system according to claim 7, characterized in that the best-resolution video obtaining module comprises:
a plug-in invoking module, configured to use python to invoke the youtube-dl plug-in to parse the video information of the source;
a video information parsing module, configured to use the youtube-dl plug-in to parse the video formats of the video to be crawled that python has read, and the videos corresponding to those video formats;
a video file selecting module, configured to use python to select the video that has the required video format and the largest file storage footprint, thereby obtaining the video with the best resolution.
9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the YouTube-based video batch crawling method according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the YouTube-based video batch crawling method according to any one of claims 1 to 5.
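
For illustration only, the flow recited in claims 1 to 5 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation. It assumes that the spreadsheet has been exported as a CSV file with one web page link address in the first column, that each address points to a single video, and that the `youtube-dl` and `ffmpeg` executables are on the PATH. The function names (`read_urls`, `pick_largest_format`, `build_download_command`, `build_merge_command`, `download_all`) are hypothetical. The `formats`, `format_id`, `ext`, `vcodec`, and `filesize` fields are those emitted by `youtube-dl --dump-json`.

```python
import csv
import json
import os
import subprocess


def read_urls(csv_path):
    """Return the non-empty first-column cells of the CSV export (assumed layout)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row[0].strip() for row in csv.reader(handle) if row and row[0].strip()]


def pick_largest_format(formats, ext="mp4"):
    """Pick the video format with the wanted extension and the largest file size.

    Audio-only entries (vcodec == "none") are skipped; a missing or None
    "filesize" is treated as 0. Returns None when nothing matches.
    """
    candidates = [
        f for f in formats
        if f.get("ext") == ext and f.get("vcodec") != "none"
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.get("filesize") or 0)


def build_download_command(url, format_id, out_dir="."):
    """youtube-dl command that fetches the chosen video stream plus the best
    audio stream and lets youtube-dl merge them through ffmpeg."""
    return [
        "youtube-dl",
        "--no-playlist",
        "-f", f"{format_id}+bestaudio",
        "--merge-output-format", "mp4",
        "-o", os.path.join(out_dir, "%(title)s.%(ext)s"),
        url,
    ]


def build_merge_command(video_path, audio_path, out_path):
    """Explicit ffmpeg equivalent of the merge step: copy both streams into one file."""
    return ["ffmpeg", "-i", video_path, "-i", audio_path, "-c", "copy", out_path]


def download_all(csv_path, out_dir=".", run=subprocess.run):
    """Loop over the list, choose a format per video, and download it.

    `run` is injectable so the loop can be exercised without network access.
    Returns a list of (url, format_id) pairs for the videos that were requested.
    """
    done = []
    for url in read_urls(csv_path):
        info = run(
            ["youtube-dl", "--dump-json", "--no-playlist", url],
            capture_output=True, text=True, check=True,
        )
        chosen = pick_largest_format(json.loads(info.stdout).get("formats", []))
        if chosen is None:
            continue  # no stream in the wanted container; skip this address
        run(build_download_command(url, chosen["format_id"], out_dir), check=True)
        done.append((url, chosen["format_id"]))
    return done
```

A call such as `download_all("videos.csv", out_dir="downloads")` would process every address in the hypothetical file `videos.csv`. Selecting by file size follows the wording of claim 3; youtube-dl's own `bestvideo` selector is an alternative that this sketch deliberately does not use.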

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811458982.3A CN109614536A (en) 2018-11-30 2018-11-30 Video batch crawling method, system, device based on YouTuBe and can storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811458982.3A CN109614536A (en) 2018-11-30 2018-11-30 Video batch crawling method, system, device based on YouTuBe and can storage medium

Publications (1)

Publication Number Publication Date
CN109614536A (en) 2019-04-12

Family

ID=66005226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811458982.3A Pending CN109614536A (en) 2018-11-30 2018-11-30 Video batch crawling method, system, device based on YouTuBe and can storage medium

Country Status (1)

Country Link
CN (1) CN109614536A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108536691A (en) * 2017-03-01 2018-09-14 中兴通讯股份有限公司 Web page crawl method and apparatus
CN108415941A (en) * 2018-01-29 2018-08-17 湖北省楚天云有限公司 A kind of spiders method, apparatus and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
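灰灰: "youtube-dl配合ffmpeg的简单使用方法" [A simple way to use youtube-dl together with ffmpeg], 《HTTPS://ZHUANLAN.ZHIHU.COM/P/23032097知乎》 (Zhihu) *
萌鼠喝酸奶: "python——批下载Youtube上的1080p及以上清晰度视频(python+youtube-dl+ffmpeg)" [python: batch downloading Youtube videos at 1080p and higher resolution (python+youtube-dl+ffmpeg)], 《HTTPS://ITPCB.COM/A/275345算法网》 *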

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110365776A (en) * 2019-07-17 2019-10-22 京东方科技集团股份有限公司 Picture batch method for down loading, device, electronic equipment and storage medium
CN110365776B (en) * 2019-07-17 2021-05-04 京东方科技集团股份有限公司 Picture batch downloading method and device, electronic equipment and storage medium
CN112019917A (en) * 2020-07-28 2020-12-01 厦门快商通科技股份有限公司 Audio data extraction method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination