CN110968781A - Video page drama determining method and device - Google Patents


Info

Publication number
CN110968781A
Authority
CN
China
Prior art keywords
drama
video
video webpage
analysis
analysis rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811163377.3A
Other languages
Chinese (zh)
Other versions
CN110968781B (en)
Inventor
陈国兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201811163377.3A priority Critical patent/CN110968781B/en
Publication of CN110968781A publication Critical patent/CN110968781A/en
Application granted granted Critical
Publication of CN110968781B publication Critical patent/CN110968781B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method and a device for determining a drama of a video page. The method comprises the following steps: acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed; sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result; and determining the drama name of each video webpage according to the analysis result. The method improves the efficiency of extracting the drama from video website pages.

Description

Video page drama determining method and device
Technical Field
The invention relates to the field of video dramas, in particular to a method and a device for determining a video page drama.
Background
Extracting the drama from a video media page helps the user know what video is being played on the current page, and also allows content collected by a crawler to be identified more fully. Existing schemes for acquiring the drama of a video media page are divided by video media, with a dedicated drama extraction processor provided for each video media. One extraction processor can process the pages of only one media, and when that media's pages are redesigned, the corresponding extraction processor must be adjusted accordingly. If a new media appears, a corresponding extraction processor must be added. This entails a great deal of repeated work, so the efficiency of extracting the drama from video website pages is low.
No effective solution has yet been proposed for the problem in the related art of low efficiency in extracting the drama of a video website page.
Disclosure of Invention
The invention mainly aims to provide a method and a device for determining a drama of a video page, which are used for solving the problem of low efficiency in extracting the drama of a video website page.
In order to achieve the above object, according to an aspect of the present invention, there is provided a video page scenario determination method, including: acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed; sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; analyzing the drama of each video webpage according to the acquired analysis rule to obtain an analysis result; and determining the drama name of each video webpage according to the analysis result.
Further, the method further comprises: and under the condition that the drama analysis rule of the media corresponding to the video webpage is not obtained, taking the title of the video webpage as the drama name of the video webpage.
Further, analyzing the drama of each video webpage according to the obtained analysis rule, and obtaining an analysis result comprises: analyzing the drama of the video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule; calculating a weighted result corresponding to each of the dramas; and determining the drama with the highest weighted value as the drama name of the video webpage according to the weighted result.
Further, analyzing the drama of the video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule comprises at least one of the following: analyzing the label of the video webpage based on the label analysis rule to obtain a drama corresponding to the label analysis rule; analyzing the title of the video webpage based on the title analysis rule through a title analyzer to obtain a drama corresponding to the title analysis rule; and analyzing the keywords of the video webpage based on the keyword analysis rule through a keyword analyzer to obtain the drama corresponding to the keyword analysis rule.
Further, determining the scenario with the highest weighting value as the scenario name of the video webpage according to the weighting result includes: judging whether the drama corresponding to the label analysis rule, the drama corresponding to the title analysis rule and the drama corresponding to the keyword analysis rule have the same drama or not; under the condition that the same drama exists, adding the weights corresponding to the same drama to obtain a weight corresponding to each drama; and taking the drama with the highest weight as the drama name of the video webpage.
In order to achieve the above object, according to another aspect of the present invention, there is also provided a video page drama determination apparatus, including: a first acquisition unit, configured to acquire a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed; a second acquisition unit, configured to sequentially acquire a drama analysis rule of the media corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; an analysis unit, configured to analyze the drama of each video webpage according to the acquired analysis rule to obtain an analysis result; and a determining unit, configured to determine the drama name of each video webpage according to the analysis result.
Further, the apparatus further comprises: and the processing unit is used for taking the title of the video webpage as the drama name of the video webpage under the condition that the drama analysis rule of the media corresponding to the video webpage is not obtained.
Further, the parsing unit includes: the analysis module is used for analyzing the drama of the video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule; a calculation module for calculating a weighted result corresponding to each of the dramas; and the determining module is used for determining the drama with the highest weighted value as the drama name of the video webpage according to the weighted result.
In order to achieve the above object, according to another aspect of the present invention, there is also provided a storage medium including a stored program, wherein when the program runs, an apparatus on which the storage medium is located is controlled to execute the video page scenario determination method according to the present invention.
In order to achieve the above object, according to another aspect of the present invention, there is also provided a processor for executing a program, wherein the program executes the video page scenario determination method according to the present invention.
The method comprises the steps of acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed; sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result; and determining the drama name of each video webpage according to the analysis result, so that the problem of low efficiency in extracting the drama of the video website page is solved, and the effect of improving the efficiency in extracting the drama of the video website page is achieved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
fig. 1 is a flow chart of a video page drama determining method according to a first embodiment of the present invention;
fig. 2 is a flow chart of a video page drama determining method according to a second embodiment of the present invention; and
fig. 3 is a schematic diagram of a video page scenario determination apparatus according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the invention provides a method for determining a drama of a video page.
Fig. 1 is a flowchart of a video page scenario determination method according to a first embodiment of the present invention, as shown in fig. 1, the method comprising the steps of:
step S102: acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed;
step S104: sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed;
step S106: analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result;
step S108: and determining the drama name of each video webpage according to the analysis result.
The embodiment adopts the steps of acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed; sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result; and determining the drama name of each video webpage according to the analysis result, so that the problem of low efficiency in extracting the drama of the video website page is solved, and the effect of improving the efficiency in extracting the drama of the video website page is achieved.
In the embodiment of the present invention, the extraction processor may store the drama analysis rules of the media corresponding to a plurality of video webpages. For example, it may store that the drama of a super-cool video webpage is the label name under the video playing window, that the drama of a love art video webpage is the title of the video playing page, and so on. The drama analysis rules of a plurality of video websites are obtained in advance and stored in the extraction processor. When the drama of a given video webpage is subsequently required, the drama analysis rule of the media corresponding to that webpage is looked up in the extraction processor. If a rule is found, the webpage has a definite drama analysis rule, and its drama is analyzed according to that rule to obtain the drama name. In this way a single extraction processor can extract the dramas of a plurality of media. For example, a set of video webpages to be analyzed can be acquired at one time, the drama analysis rule of each webpage in the set acquired in sequence, and the drama of each webpage analyzed in sequence.
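As an illustration only, the rule lookup described above might be sketched in Python as follows. The domain names, rule labels and function names are hypothetical and do not appear in the original disclosure; the sketch assumes that the media of a page can be identified by the host name of its address.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Hypothetical rule store: media domain -> where the drama name is found.
# Domains and rule labels are illustrative assumptions only.
RULES_BY_MEDIA: Dict[str, str] = {
    "video-a.example": "tag_under_player",
    "video-b.example": "page_title",
}


def media_of(url: str) -> str:
    """Return the host name of a page address, used here as the media key."""
    return urlparse(url).hostname or ""


def lookup_rule(url: str) -> Optional[str]:
    """Return the stored analysis rule for the page's media, or None if absent."""
    return RULES_BY_MEDIA.get(media_of(url))


def lookup_rules(urls: List[str]) -> Dict[str, Optional[str]]:
    """Look up the rule for each page address in the set, in order."""
    return {url: lookup_rule(url) for url in urls}
```

Because the rules are data rather than code, supporting a further media in this sketch only requires adding an entry to the rule store.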
If no drama analysis rule of the media corresponding to the current video webpage is found in the extraction processor, the title of the current video webpage is used as the drama name of the current video webpage.
Optionally, analyzing the drama of the current video webpage according to the obtained analysis rule, and obtaining an analysis result includes: analyzing the drama of the current video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule; calculating a weighted result corresponding to each of the dramas; and determining the scenario with the highest weighted value as the scenario name of the current video webpage according to the weighted result.
Optionally, parsing the drama of the current video webpage through a plurality of parsing rules, and obtaining the drama corresponding to each parsing rule includes at least one of: the tag of the current video webpage can be analyzed through a tag analyzer based on a tag analysis rule to obtain a drama corresponding to the tag analysis rule; analyzing the title of the current video webpage based on the title analysis rule through a title analyzer to obtain a drama corresponding to the title analysis rule; and analyzing the keywords of the current video webpage through a keyword analyzer based on the keyword analysis rule to obtain the drama corresponding to the keyword analysis rule.
Each analyzer stores its corresponding analysis rule: the tag analyzer stores the tag analysis rule, the title analyzer stores the title analysis rule, and the keyword analyzer stores the keyword analysis rule. In addition, new analyzers can be added so that the drama of a video webpage is analyzed through more analysis rules.
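A minimal sketch of three such analyzers is given below, purely as an illustration. It assumes the page has already been fetched as an HTML string; the function names, the `drama-tag` class name and the regular-expression-based extraction are assumptions made for this example and are not taken from the original disclosure.

```python
import re
from collections import Counter
from typing import Callable, Dict, Optional


def analyze_title(html: str) -> Optional[str]:
    """Title analyzer: return the text of the <title> element, if any."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
    return match.group(1).strip() if match else None


def analyze_tag(html: str, css_class: str = "drama-tag") -> Optional[str]:
    """Tag analyzer: return the text of the first element with the given class.

    The class name is an assumed per-media configuration value.
    """
    pattern = r'<[^>]+class="%s"[^>]*>(.*?)</' % re.escape(css_class)
    match = re.search(pattern, html, re.I | re.S)
    return match.group(1).strip() if match else None


def analyze_keyword(html: str) -> Optional[str]:
    """Keyword analyzer: return the most frequent word in the page text."""
    text = re.sub(r"<[^>]+>", " ", html)  # crude removal of markup
    words = re.findall(r"\w+", text)
    return Counter(words).most_common(1)[0][0] if words else None


# Registry of analyzers; a new analyzer is added by adding one more entry.
ANALYZERS: Dict[str, Callable[[str], Optional[str]]] = {
    "tag": analyze_tag,
    "title": analyze_title,
    "keyword": analyze_keyword,
}
```

The regular expressions are deliberately crude; a production analyzer would more likely use a real HTML parser.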
Optionally, determining the scenario with the highest weighting value as the scenario name of the current video webpage according to the weighting result includes: judging whether the drama corresponding to the label analysis rule, the drama corresponding to the title analysis rule and the drama corresponding to the keyword analysis rule have the same drama or not; under the condition that the same drama exists, adding the weights corresponding to the same drama to obtain a weight corresponding to each drama; and taking the drama with the highest weight as the drama name of the current video webpage.
For some video websites the drama analysis rule is a single explicit analysis method. For others the rule is not unique and there are several possibilities: the drama may be a tag name at a certain position on the page, the title of the video webpage, or a word that appears frequently on the page. For such websites, the extraction processor stores a plurality of analyzers of different types together with a weight for each analyzer, for example 40% for the tag analyzer, 30% for the title analyzer and 30% for the keyword analyzer. The results produced by the analyzers are weighted, and the result with the highest total weight is determined as the drama name of the video webpage. For example, if the tag analyzer returns A and the title analyzer and the keyword analyzer both return B, then A has a weight of 40% and B has a combined weight of 30% + 30% = 60%, which is the highest, so B is taken as the drama name of the current video webpage.
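The weighting in this example can be sketched as follows. This is only an illustrative sketch: the function name `pick_drama` and the representation of candidates as (name, weight) pairs are assumptions, and the tie-breaking behavior (the first candidate to reach the top total wins) is a choice made here, since the original text does not specify one.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def pick_drama(candidates: List[Tuple[str, float]]) -> Optional[str]:
    """Sum the weights of identical drama names and return the heaviest one."""
    totals: Dict[str, float] = defaultdict(float)
    for name, weight in candidates:
        totals[name] += weight  # identical names accumulate their weights
    if not totals:
        return None
    return max(totals, key=totals.get)


# Tag analyzer says A (40%); title and keyword analyzers both say B (30% each).
result = pick_drama([("A", 0.4), ("B", 0.3), ("B", 0.3)])
print(result)  # B, because 0.3 + 0.3 exceeds 0.4
```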
The embodiment of the present invention also provides a preferred implementation manner, and the following describes the technical solution of the embodiment of the present invention with reference to the preferred implementation manner.
Fig. 2 is a flowchart of a video page scenario determination method according to a second embodiment of the present invention, as shown in fig. 2, the method comprising the steps of:
[S01] The extraction processor is initialized. Initialization mainly consists of loading the drama analysis rules of all media. This step is performed only once, at start-up.
[S02] The drama analysis rules of the media are obtained. All drama analysis rules of the media are acquired according to the media domain name to which the page to be analyzed belongs. A drama analysis rule comprises information such as an analyzer name, an analyzer configuration and a weight. If the media has a drama analysis rule, the process proceeds to S04; otherwise, it proceeds to S03.
[S03] The title is extracted as the drama. This is the default drama extraction rule: the page title is used directly as the drama name.
[S04] Possible dramas are extracted by rule. According to the analysis rules obtained in step S02, the specific analyzers are executed one by one to obtain drama names, each labeled with weight information. This step yields a weighted list of possible drama names.
[S05] The drama is determined. The final drama name is determined according to the drama names and the weight information.
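The flow of steps S01 to S05 might be sketched as follows, purely for illustration. To keep the sketch self-contained, a page is modeled as a dictionary of already-extracted fields rather than fetched HTML; the field names, the example domain, the hard-coded rule table and the fallback to the title when every analyzer returns nothing are all assumptions of this sketch, not part of the original disclosure.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple
from urllib.parse import urlparse

Page = Dict[str, str]            # hypothetical fields: url, title, tag, keyword
Rule = Tuple[str, float]         # (analyzer name, weight)

# Hypothetical analyzers operating on the simplified page model.
ANALYZERS: Dict[str, Callable[[Page], Optional[str]]] = {
    "tag": lambda page: page.get("tag"),
    "title": lambda page: page.get("title"),
    "keyword": lambda page: page.get("keyword"),
}


def load_rules() -> Dict[str, List[Rule]]:
    """S01: load the analysis rules of all media once (hard-coded here)."""
    return {"video-a.example": [("tag", 0.4), ("title", 0.3), ("keyword", 0.3)]}


def determine_drama(page: Page, rules: Dict[str, List[Rule]]) -> str:
    """Run S02 to S05 for a single page."""
    domain = urlparse(page["url"]).hostname or ""
    media_rules = rules.get(domain)                # S02: rules by media domain
    if not media_rules:
        return page.get("title", "")               # S03: default rule, use title
    totals: Dict[str, float] = defaultdict(float)
    for analyzer_name, weight in media_rules:      # S04: run analyzers one by one
        name = ANALYZERS[analyzer_name](page)
        if name:
            totals[name] += weight
    if not totals:
        return page.get("title", "")               # assumed fallback, see lead-in
    return max(totals, key=totals.get)             # S05: highest total weight
```

Calling `load_rules()` once and passing the result to `determine_drama` for each page mirrors the statement that initialization is performed only once at start-up.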
The extraction processor of the embodiment of the invention also comprises a media drama analysis rule configuration module which is used for providing the management function of the media drama analysis rule.
The embodiment of the invention provides a method for dividing and designing the extraction processor according to extraction mode rather than by media. This widens the application range of the extraction processor, removes the need to adjust and reissue the system whenever a media page changes or a new media appears, and reduces maintenance cost.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from that presented herein.
The embodiment of the invention provides a video page drama determining device, which can be used for executing the video page drama determining method.
Fig. 3 is a schematic diagram of a video page scenario determination apparatus according to an embodiment of the present invention, as shown in fig. 3, the apparatus comprising:
a first acquisition unit 10, configured to acquire a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed;
the second obtaining unit 20 is configured to sequentially obtain a scenario analysis rule of a media corresponding to each video webpage to be analyzed in the video webpage set to be analyzed;
the analyzing unit 30 is configured to analyze the drama of each video webpage according to the acquired analysis rule to obtain an analysis result;
and the determining unit 40 is used for determining the drama name of each video webpage according to the analysis result.
In this embodiment, the first obtaining unit 10 acquires a video webpage set to be analyzed, where the video webpage set includes one or more video webpage addresses to be analyzed; the second obtaining unit 20 sequentially obtains a drama analysis rule of the media corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; the analyzing unit 30 analyzes the drama of each video webpage according to the obtained analysis rule to obtain an analysis result; and the determining unit 40 determines the drama name of each video webpage according to the analysis result. Therefore, the problem of low efficiency of extracting the drama of the video website page is solved, and the effect of improving the efficiency of extracting the drama of the video website page is achieved.
Optionally, the apparatus further comprises: and the processing unit is used for taking the title of the current video webpage as the drama name of the current video webpage under the condition that the drama analysis rule of the media corresponding to the current video webpage is not obtained.
Optionally, the parsing unit 30 includes: the analysis module is used for analyzing the drama of the current video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule; a calculation module for calculating a weighted result corresponding to each of the dramas; and the determining module is used for determining the drama with the highest weighted value as the drama name of the current video webpage according to the weighted result.
The video page drama determining device comprises a processor and a memory, wherein the acquiring unit, the analyzing unit, the determining unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor comprises a kernel, which calls the corresponding program unit from the memory. One or more kernels may be provided, and the efficiency of extracting the drama of video website pages is improved by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored, the program implementing the video page scenario determination method when executed by a processor.
The embodiment of the invention provides a processor, which is used for running a program, wherein the video page drama determining method is executed when the program runs.
The embodiment of the invention provides equipment, which comprises a processor, a memory and a program which is stored on the memory and can run on the processor, wherein the processor executes the program and realizes the following steps: acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed; sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result; and determining the drama name of each video webpage according to the analysis result. The device herein may be a server, a PC, a PAD, a mobile phone, etc.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed; sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed; analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result; and determining the drama name of each video webpage according to the analysis result.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for determining the drama of a video page, characterized by comprising the following steps:
acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed;
sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed;
analyzing the drama of each video webpage according to the acquired analysis rule to obtain an analysis result;
and determining the drama name of each video webpage according to the analysis result.
2. The method of claim 1, further comprising:
and, in the case that no drama analysis rule is obtained for the medium corresponding to the video webpage, taking the title of the video webpage as the drama name of the video webpage.
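The flow of claims 1 and 2 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `determine_drama_names`, the `fetch` callback, the `rules_by_medium` table, the page dictionary with a `"title"` key, and the choice to identify a medium by the URL host name are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Optional
from urllib.parse import urlparse

# Hypothetical types: a page is a dict holding at least a "title" key;
# a rule maps a page to a drama name, or None if nothing matched.
Page = Dict[str, str]
Rule = Callable[[Page], Optional[str]]


def determine_drama_names(
    urls: Iterable[str],
    fetch: Callable[[str], Page],
    rules_by_medium: Dict[str, Rule],
) -> Dict[str, Optional[str]]:
    """Return a drama name for every video webpage address in `urls`."""
    names: Dict[str, Optional[str]] = {}
    for url in urls:  # handle the addresses in the set one by one
        page = fetch(url)
        # Assumption: the medium is identified by the host name of the URL.
        medium = urlparse(url).hostname or ""
        rule = rules_by_medium.get(medium)
        if rule is None:
            # Claim 2: no rule for this medium, so use the page title.
            names[url] = page["title"]
        else:
            # Claim 1: apply the medium's rule and keep its result.
            names[url] = rule(page)
    return names
```

In this sketch a page whose medium has no registered rule always receives its title as the drama name, while a page with a rule receives whatever that rule returns.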
3. The method according to claim 1, wherein analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result comprises:
analyzing the drama of the video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule;
calculating a weighted result corresponding to each of the dramas;
and determining the drama with the highest weighted value as the drama name of the video webpage according to the weighted result.
4. The method of claim 3, wherein analyzing the drama of the video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule comprises at least one of:
analyzing the label of the video webpage based on the label analysis rule to obtain a drama corresponding to the label analysis rule;
analyzing the title of the video webpage based on the title analysis rule through a title analyzer to obtain a drama corresponding to the title analysis rule;
and analyzing the keywords of the video webpage based on the keyword analysis rule through a keyword analyzer to obtain the drama corresponding to the keyword analysis rule.
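One possible reading of the three analyzers in claim 4 is sketched below using Python's standard `html.parser`. The claims do not say how labels, titles, or keywords are located in a page or how a rule maps them to a drama, so everything specific here is an assumption: the `class="tag"` markup for labels, the `<meta name="keywords">` element for keywords, the `known_dramas` list, and the names `PageFields` and `parse_candidates`.

```python
from html.parser import HTMLParser


class PageFields(HTMLParser):
    """Collects the <title> text, the <meta name="keywords"> entries, and the
    text of elements with class="tag" (a hypothetical label markup)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self.tags = []
        self._capture = None  # field that the next text chunk belongs to

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._capture = "title"
        elif tag == "meta" and (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]
        elif "tag" in (attr.get("class") or "").split():
            self._capture = "tag"

    def handle_endtag(self, tag):
        self._capture = None  # stop capturing at any closing tag

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._capture == "title":
            self.title += text
        elif self._capture == "tag":
            self.tags.append(text)


def parse_candidates(html, known_dramas):
    """Return one candidate drama (or None) per analysis rule."""
    fields = PageFields()
    fields.feed(html)

    def first_known(texts):
        # Hypothetical rule: the first known drama name found in the texts.
        for text in texts:
            for name in known_dramas:
                if name in text:
                    return name
        return None

    return {
        "tag": first_known(fields.tags),
        "title": first_known([fields.title]),
        "keyword": first_known(fields.keywords),
    }
```

Each analyzer yields its own candidate, so the three results may disagree; resolving such disagreement is the subject of the weighting step.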
5. The method of claim 4, wherein determining the drama with the highest weighted value as the drama name of the video webpage according to the weighted result comprises:
judging whether the drama corresponding to the label analysis rule, the drama corresponding to the title analysis rule and the drama corresponding to the keyword analysis rule have the same drama or not;
under the condition that the same drama exists, adding the weights corresponding to the same drama to obtain a weight corresponding to each drama;
and taking the drama with the highest weight as the drama name of the video webpage.
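The weighting in claims 3 and 5 can be illustrated with a short sketch: candidates produced by different analysis rules are grouped by drama, the weights of rules that agree are added, and the drama with the largest total is chosen. The function name `pick_drama`, the rule names, and the weight values are hypothetical; the claims do not specify how weights are assigned or how ties are resolved.

```python
from collections import defaultdict


def pick_drama(candidates, weights):
    """candidates: rule name -> drama found by that rule (or None).
    weights: rule name -> weight of that rule (hypothetical values)."""
    totals = defaultdict(int)
    for rule, drama in candidates.items():
        if drama:  # skip rules that produced no drama
            # Rules that returned the same drama have their weights added.
            totals[drama] += weights[rule]
    if not totals:
        return None
    # Highest total wins; on a tie, the drama seen first is returned
    # (an arbitrary choice made for this sketch).
    return max(totals, key=totals.get)


candidates = {"tag": "Drama A", "title": "Drama A", "keyword": "Drama B"}
weights = {"tag": 3, "title": 3, "keyword": 5}
print(pick_drama(candidates, weights))
```

With these example weights the keyword rule outweighs either of the other rules alone, yet "Drama A" is selected because the label and title rules agree and their combined weight (6) exceeds 5.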
6. A video page drama determining apparatus, comprising:
a first acquisition unit, used for acquiring a video webpage set to be analyzed, wherein the video webpage set comprises one or more video webpage addresses to be analyzed;
a second acquisition unit, used for sequentially acquiring a drama analysis rule of a medium corresponding to each video webpage to be analyzed in the video webpage set to be analyzed;
the analysis unit is used for analyzing the drama of each video webpage according to the obtained analysis rule to obtain an analysis result;
and the determining unit is used for determining the drama name of each video webpage according to the analysis result.
7. The apparatus of claim 6, further comprising:
and a processing unit, used for taking the title of the video webpage as the drama name of the video webpage in the case that no drama analysis rule is obtained for the medium corresponding to the video webpage.
8. The apparatus of claim 6, wherein the analysis unit comprises:
the analysis module is used for analyzing the drama of the video webpage through a plurality of analysis rules to obtain the drama corresponding to each analysis rule;
a calculation module for calculating a weighted result corresponding to each of the dramas;
and the determining module is used for determining the drama with the highest weighted value as the drama name of the video webpage according to the weighted result.
9. A storage medium comprising a stored program, wherein the program, when executed, controls an apparatus on which the storage medium is located to perform the video page drama determining method of any one of claims 1 to 5.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when running, executes the video page drama determining method of any one of claims 1 to 5.
CN201811163377.3A 2018-09-30 2018-09-30 Video page scenario determination method and device Active CN110968781B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811163377.3A CN110968781B (en) 2018-09-30 2018-09-30 Video page scenario determination method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811163377.3A CN110968781B (en) 2018-09-30 2018-09-30 Video page scenario determination method and device

Publications (2)

Publication Number Publication Date
CN110968781A true CN110968781A (en) 2020-04-07
CN110968781B CN110968781B (en) 2023-05-23

Family

ID=70029530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811163377.3A Active CN110968781B (en) 2018-09-30 2018-09-30 Video page scenario determination method and device

Country Status (1)

Country Link
CN (1) CN110968781B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102348136A (en) * 2011-05-13 2012-02-08 深圳市网合科技股份有限公司 Program source information acquisition apparatus and method thereof
CN103455572A (en) * 2013-08-20 2013-12-18 北京奇虎科技有限公司 Method and device for acquiring movie and television subjects from web pages
CN104978404A (en) * 2015-06-04 2015-10-14 无锡天脉聚源传媒科技有限公司 Video album name generating method and apparatus
WO2017032249A1 (en) * 2015-08-26 2017-03-02 腾讯科技(深圳)有限公司 Video file display method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
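GABRIELE DONZELLI et al.: "Misinformation on vaccination: a quantitative analysis of YouTube videos", 《HUMAN VACCINES & IMMUNOTHERAPEUTICS》 *
ZHUANG Yirong et al.: "Design and implementation of an IPTV intelligent video search system", 《广东通信技术》 (Guangdong Communication Technology) *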

Also Published As

Publication number Publication date
CN110968781B (en) 2023-05-23

Similar Documents

Publication Publication Date Title
CN106021583B (en) Statistical method and system for page flow data
CN108255886B (en) Evaluation method and device of recommendation system
CN110162778B (en) Text abstract generation method and device
CN110019298B (en) Data processing method and device
US20190087208A1 (en) Method and apparatus for loading elf file of linux system in windows system
CN110020236B (en) Webpage parsing method, device, storage medium, processor and equipment
CN110674360A (en) Method and system for constructing data association graph and tracing data
CN108874379B (en) Page processing method and device
CN114338413A (en) Method and device for determining topological relation of equipment in network and storage medium
CN108255891B (en) Method and device for judging webpage type
CN113868698A (en) File desensitization method and equipment
CN115437930B (en) Webpage application fingerprint information identification method and related equipment
US8751508B1 (en) Contextual indexing of applications
CN110532773B (en) Malicious access behavior identification method, data processing method, device and equipment
CN110968500A (en) Test case execution method and device
CN110968781B (en) Video page scenario determination method and device
CN109429100B (en) Method, device and system for storing page path
CN110929188A (en) Method and device for rendering server page
CN110990799A (en) Data processing method, device and system for anti-crawler and storage medium
CN111125087A (en) Data storage method and device
CN114710318A (en) Method, device, equipment and medium for limiting high-frequency access of crawler
CN109710833B (en) Method and apparatus for determining content node
CN109426540B (en) Element click condition detection method and device, storage medium and processor
US9471569B1 (en) Integrating information sources to create context-specific documents
CN107544968B (en) Method and device for determining website availability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant