CN112347334B - Active-passive combination-based audio and video website user entry identification method and system - Google Patents

Active-passive combination-based audio and video website user entry identification method and system

Info

Publication number
CN112347334B
Authority
CN
China
Prior art keywords
audio
website
video
sub
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011001392.5A
Other languages
Chinese (zh)
Other versions
CN112347334A (en)
Inventor
云晓春
李扬曦
张冬明
朱宇佳
李钊
张晓欧
杨嵘
窦凤虎
尹姜谊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN202011001392.5A
Publication of CN112347334A
Application granted
Publication of CN112347334B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/647 Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
    • H04N21/64723 Monitoring of network processes or resources, e.g. monitoring of network load
    • H04N21/64738 Monitoring network characteristics, e.g. bandwidth, congestion level
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Security & Cryptography (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses an active-passive combined method and system for identifying user entries of audio-video websites, belonging to the technical field of Internet information. For a domain name input by a user, the method queries whether the domain name exists in a global program identification table. If it does not, candidate sub-page URLs are obtained from the homepage of the domain name's website; each candidate is actively accessed to judge whether the page contains an audio-video program, and all sub-page URLs containing audio-video programs are extracted. Whether the website is an audio-video website is judged from the number of candidate sub-page URLs and the ratio of the number of sub-page URLs containing audio-video programs to that number. If the website is an audio-video website, its program identifier is acquired, and the website information and the program identifier are written to the global program identification table; if not, only the website information is written. Whether the website is an audio-video website and, where applicable, its program identifier are returned to the user as the query result.

Description

Active-passive combination-based audio and video website user entry identification method and system
Technical Field
The invention belongs to the technical field of Internet information, and particularly relates to an active and passive combination-based audio and video website user entry identification method and system.
Background
Enterprises and organizations with high security requirements have a strong need to monitor audio and video content transmitted over the Internet, and they monitor and audit the audio and video traffic entering and leaving their networks to find harmful content. For example, some entertainment video website providers need to review the audio-video content uploaded by users to find harmful programs. As the security situation deteriorates and the Internet becomes more open, more and more network monitoring systems include audio and video among the monitored objects and are deployed at the Internet entrances and exits of operators, enterprises and the like. In general, a network monitoring system obtains the traffic between the internal network and the Internet by optical splitting or port mirroring. By reassembling and parsing the network traffic, the system can decode and analyze the audio and video content, then record information about harmful audio and video flows and raise alarms.
Because audio and video traffic accounts for a large share of network traffic, and the audio and video decoding and analysis rate cannot keep up with the rate at which the system captures traffic, audio and video analysis is the bottleneck of the system. Meanwhile, the entry pages of many audio-video programs follow certain rules; if these entry-page rules can be analyzed and mined, the efficiency of audio and video traffic analysis can be greatly improved and construction investment saved.
Existing audio and video traffic identification schemes rely on the similarity of content or titles. Patent CN103678702A proposes a video de-duplication method that divides a video set into several sub-video sets according to the text data of the videos, calculates the similarity between a video subset and two videos using a space vector model, and identifies an audio-video program by its content. Patent 103678527A proposes a video identification method based on the similarity of video titles and video streams.
As can be seen from the above, the prior art is based on a bypass monitoring system and cannot solve the following problems:
1. Network traffic is random. The monitoring system acquires audio and video traffic in bypass mode. Because of traffic mirroring, optical splitting, transmission and similar factors, the captured traffic suffers from packet loss, reordering and incompleteness, so a program identifier cannot be computed by taking a digest of a fixed fragment, by measuring video title similarity, or by similar methods.
2. Identification of massive numbers of audio-video programs is inefficient. The above schemes all compute a digest or a similarity of the video content; such a digest is a fuzzy value and cannot serve as the unique identifier of a program. Comparing digests across all programs is inefficient, and because processing actions are usually taken at the same time as traffic identification, those actions respond slowly.
3. Changes in user configuration cannot be accommodated. The audio and video system identifies harmful programs according to the configuration issued by the user, so whether a program is harmful is relative and depends on that configuration. An audio and video de-duplication scheme for a monitoring system should follow changes in user configuration and select only the required programs for identification, rather than processing all programs at once. The above schemes do not meet this need.
In summary, a user entry identification method for audio and video websites that solves the above problems needs to be designed to meet the requirements of the network monitoring application scenario.
Disclosure of Invention
To improve the efficiency with which a network monitoring system processes audio and video traffic and to save computing resources, the invention provides an active-passive combined method for identifying user entries of audio-video websites. By identifying the URL features of audio-video entry pages, the method can quickly discover the video content accessed by users in passive traffic and improve the efficiency of audio and video content detection.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
An active-passive combined audio-video website user entry identification method comprises the following steps:
1) Acquiring a domain name input by a user and querying whether the domain name exists in a global program identification table; if so, returning the query result to the user;
2) If the domain name does not exist in the global program identification table, accessing the homepage of the domain name's website to obtain the candidate sub-page URLs in the homepage;
3) Actively accessing the candidate sub-page URLs, judging whether each page contains an audio-video program, and extracting all sub-page URLs containing audio-video programs;
4) Judging whether the website is an audio-video website according to the number of candidate sub-page URLs and the ratio of the number of sub-page URLs containing audio-video programs to the number of candidate sub-page URLs; if the website is an audio-video website, acquiring its program identifier and updating the website information and the program identifier into the global program identification table; if it is not, updating only the website information into the global program identification table; the website information comprises the domain name and whether the website is an audio-video website;
5) Returning the query result to the user; the query result states whether the website is an audio-video website and, for an audio-video website, also includes the program identifier.
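The decision in step 4) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and parameters are hypothetical, and the default thresholds (at least 20 candidates, at least 50% containing programs) are the ones stated later in the detailed description.

```python
def is_av_website(candidate_urls, av_urls, min_candidates=20, min_ratio=0.5):
    """Judge whether a site is an audio-video website (hypothetical helper).

    candidate_urls: candidate sub-page URLs taken from the homepage.
    av_urls: the subset of those URLs found to contain audio-video programs.
    """
    total = len(candidate_urls)
    # Too few candidates: the site is not treated as an audio-video website.
    if total < min_candidates:
        return False
    # Ratio of program-bearing sub-pages to all candidate sub-pages.
    return len(av_urls) / total >= min_ratio
```

Both conditions must hold, so a site with very few links is rejected even if every link leads to a program.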
Further, the candidate sub-page URLs are all links contained in <a> tags in the homepage source code.
Further, the candidate sub-page URLs are first filtered, retaining only sub-page URLs that are accessible and share the main domain name of the website homepage; the filtered candidate sub-page URLs are then actively accessed.
Further, the candidate sub-page URLs and the sub-page URLs containing audio-video programs are stored as lists.
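A minimal sketch of collecting and filtering candidate sub-page URLs, using only the Python standard library. All names here are hypothetical. Two points are assumptions rather than part of the patent text: the main domain name is approximated by the last two labels of the host name, and the accessibility check is passed in as a callable so the sketch itself makes no network requests.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class AnchorCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def main_domain(url):
    """Naive main-domain guess: last two labels of the host (assumption)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def candidate_sub_pages(home_url, home_html, is_accessible=lambda url: True):
    """Return the filtered candidate sub-page URL list for a homepage."""
    parser = AnchorCollector()
    parser.feed(home_html)
    wanted = main_domain(home_url)
    kept = []
    for href in parser.hrefs:
        url = urljoin(home_url, href)  # resolve relative links
        if urlsplit(url).scheme not in ("http", "https"):
            continue  # skip javascript:, mailto: and similar links
        if main_domain(url) == wanted and is_accessible(url) and url not in kept:
            kept.append(url)
    return kept
```

In practice `is_accessible` would issue an HTTP request and check the response status; it is left as a parameter here so that the filtering logic can be exercised offline.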
Further, the step of obtaining the program identifier of the audio-video website comprises:
parsing all sub-page URLs containing audio-video programs and splitting each into three parts: the protocol part (scheme), the server address (netloc) and the file path (path);
comparing the URLs pairwise and, when the scheme and netloc of two URLs are the same, calculating the longest common subsequence of their paths by dynamic programming;
combining the longest common subsequence with the scheme and netloc and replacing the unmatched parts with a wildcard to generate a matching rule, thereby obtaining a regular expression;
reducing all the regular expressions obtained to remove redundant matching rules, thereby obtaining the program identifier.
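The pairwise rule generation described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, and the unmatched parts of the path are replaced with the regular-expression wildcard `.*`, which is one plausible choice that the text does not spell out.

```python
import re
from urllib.parse import urlsplit


def lcs_pairs(a, b):
    """Index pairs (i, j) of one longest common subsequence of a and b."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def path_rule(path_a, path_b):
    """Regex for two paths: common characters kept, gaps become '.*'."""
    parts, prev_i, prev_j = [], -1, -1
    for i, j in lcs_pairs(path_a, path_b):
        if i - prev_i > 1 or j - prev_j > 1:
            parts.append(".*")  # unmatched characters in either path
        parts.append(re.escape(path_a[i]))
        prev_i, prev_j = i, j
    if prev_i < len(path_a) - 1 or prev_j < len(path_b) - 1:
        parts.append(".*")  # unmatched tail
    return "".join(parts)


def url_rule(url_a, url_b):
    """Return a regex covering both URLs, or None if scheme/netloc differ."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    if (a.scheme, a.netloc) != (b.scheme, b.netloc):
        return None
    return re.escape(f"{a.scheme}://{a.netloc}") + path_rule(a.path, b.path)
```

By construction the generated expression matches both input URLs, since every wildcard sits exactly where at least one of the two paths has characters outside the common subsequence. Query strings and fragments are ignored in this sketch because the text only mentions scheme, netloc and path.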
Further, whether an audio-video program exists in a page is judged by whether specific elements exist in the page; these elements appear when the page contains an audio-video program or an embedded Flash animation.
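The element check can be sketched with the standard-library HTML parser. The text does not list the "specific elements", so the tag set below (`video`, `audio`, `embed`, `object`) is an assumption based on common HTML5 media and plugin elements, and the names are hypothetical.

```python
from html.parser import HTMLParser

# Assumed element set; the patent text only refers to "specific elements".
AV_TAGS = {"video", "audio", "embed", "object"}


class AVElementDetector(HTMLParser):
    """Sets `found` when a media or plugin element appears in the page."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names before calling this method.
        if tag in AV_TAGS:
            self.found = True


def has_av_program(page_html):
    """Return True if the page source contains any of the assumed elements."""
    detector = AVElementDetector()
    detector.feed(page_html)
    return detector.found
```

A static check like this only sees elements present in the downloaded source; pages that create their player with JavaScript after loading would need a rendering step, which is outside this sketch.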
An active-passive combined audio-video website user entry identification system comprises:
a user query subsystem, used for receiving the domain name input by a user and initiating a query request to the query sub-module to obtain a query result; the query result states whether the website is an audio-video website and, for an audio-video website, also includes the program identifier;
an audio-video program identification subsystem, comprising a query sub-module, a reporting sub-module and a global program identification table, wherein:
the query sub-module is used for querying, according to the domain name input by the user, whether the domain name exists in the global program identification table, and sending the query result to the user query subsystem;
the reporting sub-module is used for accessing the homepage of the domain name's website when the domain name input by the user does not exist in the global program identification table, and obtaining the candidate sub-page URLs in the homepage; actively accessing the candidate sub-page URLs, judging whether each page contains an audio-video program, and extracting all sub-page URLs containing audio-video programs; judging whether the website is an audio-video website according to the ratio between the number of sub-page URLs containing audio-video programs and the number of candidate sub-page URLs; if the website is an audio-video website, acquiring its program identifier and updating the website information and the program identifier into the global program identification table; if it is not, updating only the website information into the global program identification table; the website information comprises the domain name and whether the website is an audio-video website; and returning the query result to the query sub-module.
The technical scheme of the invention has the following beneficial effects:
1. The candidate sub-URL list is filtered by checking URL accessibility and the main domain name (the filtered number can typically reach 20% of the original number), which speeds up searching and reduces user waiting time under large-scale querying.
2. Audio-video programs are identified by their entry pages, which solves the problem that audio-video program identifiers are not re-entrant and ensures the re-entrancy of programs; using regular expressions to identify program entries allows them to be matched quickly under large-scale traffic and reduces resource consumption and occupation.
3. URLs are decomposed using their semantic information, which improves the readability of the identifiers; the longest common subsequence of the file paths in the URLs is calculated with a dynamic programming algorithm to generate regular expressions, which facilitates program identification and speeds up program queries under large-scale traffic.
4. Through the combination of active and passive approaches, the method adapts to changes in user configuration, supplements programs missing from the global program identification table at any time, produces positive feedback, and improves the recall rate of program identifier queries.
The technical scheme takes the application scenario of network monitoring equipment into account and is based on an algorithm that generates audio-video entry-page rules according to user configuration, meeting the need to monitor and review audio and video content at the Internet entrances and exits of operators, enterprises and the like.
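To illustrate how regular-expression program identifiers can be applied to passively observed traffic, the following sketch checks a URL against a set of compiled rules. The rule and URLs shown are hypothetical examples on a placeholder domain, not identifiers produced by the invention.

```python
import re


def compile_rules(rules):
    """Compile rule strings once so that each URL check is a cheap match."""
    return [re.compile(rule) for rule in rules]


def is_program_entry(url, compiled_rules):
    """Return True if the URL fully matches any program entry rule."""
    return any(rule.fullmatch(url) for rule in compiled_rules)


# Hypothetical rule for a placeholder site.
RULES = compile_rules([r"https://www\.example\.com/v_.*\.html"])
```

With these rules, `is_program_entry("https://www.example.com/v_abc.html", RULES)` returns `True`, while a URL such as `https://www.example.com/about` does not match and can be skipped without decoding any media.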
Drawings
Fig. 1 is a block diagram of the active-passive combined audio-video website user entry identification system provided by the invention.
Detailed Description
In order to make the technical scheme of the invention more understandable, specific examples are described below in detail with reference to the accompanying drawings.
The framework of the audio-video website user entry identification system provided by the invention is shown in Fig. 1 and mainly comprises two subsystems. The user query subsystem is responsible for receiving the domain name information input by a user, initiating a query request to the query sub-module, and returning the query result to the user. The audio-video program identification subsystem is responsible for processing the query and reporting requests of the user query subsystem and is divided into a query sub-module and a reporting sub-module: the former returns website information, and the latter maintains the global program identification table. The two subsystems and the two sub-modules are connected through communication interfaces.
The detailed steps of each sub-module are described below.
1. User query subsystem. Its functions and steps are as follows:
step 101: the system acquires the domain name input by the user;
step 102: a query request is initiated to the query sub-module; if the website exists in the global program identification table, jump to step 104; if not, jump to step 103;
step 103: the user is notified that the website information must be queried in real time; the system waits for the query result and then jumps to step 104;
step 104: the website query result is returned to the user; the query result states whether the website is an audio-video website and, for an audio-video website, also includes the program identifier; the current query then ends.
2. Audio and video program identification subsystem
(1) Query submodule
The query sub-module is responsible for querying whether the current domain name exists in the global program identification table. The main steps are as follows:
step 2001: a query is issued to the global program identification table; if the domain name's website exists in the table, jump to step 2003; if not, jump to step 2002;
step 2002: a domain name judgment and program identification request is initiated to the reporting sub-module, and the flow jumps to step 2011;
step 2003: the query result is returned; it states whether the website is an audio-video website and, for an audio-video website, also includes the program identifier.
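The lookup-then-report flow of steps 2001 to 2003 can be sketched with an in-memory dictionary standing in for the global program identification table. The record layout and the `report` callable are hypothetical; in the described system the reporting sub-module itself updates the table, whereas this simplified sketch stores the returned record inside the query function.

```python
def query(domain, table, report):
    """Look up a domain; fall back to the reporting path on a miss.

    table: dict mapping a domain to a record such as
           {"is_av": True, "rules": ["..."]} (hypothetical layout).
    report: callable that analyses an unknown domain and returns a record.
    """
    record = table.get(domain)
    if record is None:
        # Corresponds to step 2002: hand the domain to the reporting path.
        record = report(domain)
        table[domain] = record
    # Corresponds to step 2003: return the stored or newly built result.
    return record
```

Because the result is written back to the table, a second query for the same domain is answered directly without invoking the reporting path again.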
(2) Reporting submodule
The reporting sub-module is responsible for maintaining the global program identification table. Given the domain name provided in step 2002, it judges whether the website is an audio-video website, actively mines the program identification rules, and updates the information into the global program identification table. The specific functions and steps are as follows:
step 2011: the homepage URL of the domain name's website is extracted, the website homepage is accessed, and all links contained in <a> tags in the homepage source code (namely the sub-page URLs) are collected to form a candidate sub-page URL list. The list is filtered so that every remaining link has the same main domain name as the website homepage and is accessible. The result is a list of sub-page URLs that may contain audio-video programs;
step 2012: the sub-page URLs are actively accessed to judge whether each page contains an audio-video program. According to the W3C HTML5 standard, if a page contains an audio-video program or an embedded Flash animation, specific elements appear in the page. Whether a sub-page contains an audio-video program can therefore be determined by checking whether such an element exists in it. All sub-page URLs containing audio-video programs are output to step 2013;
step 2013: whether the website is an audio-video website is judged according to the number of candidate sub-URLs and the ratio of the number of sub-page URLs containing audio-video programs to that number. The specific judging method is: if the candidate sub-URL list has at least 20 elements and the sub-page URLs containing audio-video programs account for at least 50% of the candidates, the website is judged to be an audio-video website. If the website is not an audio-video website, go to step 2015; if it is, go to step 2014;
step 2014: the URL pattern is mined according to URL semantics and dynamic programming. All sub-page URLs containing audio-video programs are parsed and split into three parts: the protocol part (scheme), the server address (netloc) and the file path (path). Every two URLs are compared to see whether a URL rule can be generated: when the scheme and netloc are the same, the longest common subsequence of the two paths is calculated by dynamic programming, and the unmatched parts are replaced by a wildcard. The result is combined with the scheme and netloc to generate a matching rule, giving a regular expression. This calculation is performed for every pair of the sub-page URLs containing audio-video programs output by step 2012, yielding a series of regular expressions. The regular expressions are then reduced to remove redundant matching rules, giving the program entry page identification rules of the audio-video website; these rules are the program identification information. The specific reduction method is as follows: the regular expressions whose match counts account for 80% of the total number of matches are taken as the pattern strings A, and the remaining 20% are taken as the strings B to be matched; if a B can be represented by A, the regular expression in B is eliminated; if not, the matching rule in B is retained and the next round of matching and reduction is performed. After execution, jump to step 2015;
step 2015: the website information (website domain name, whether the website is an audio-video website) and the program identifier (for an audio-video website) are updated into the global program identification table, and the reporting process ends; the flow then jumps to step 2003.
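The reduction in step 2014 leaves some details open, so the following sketch is one possible reading rather than the patented procedure. Assumptions: rules are ranked by how many of the program URLs they match; the top-ranked rules that together account for the given share of all matches form set A; and a remaining rule B counts as "represented by A" when every URL it matches is already matched by a retained rule. The function name and parameters are hypothetical.

```python
import re


def reduce_rules(rules, urls, share=0.8):
    """Drop redundant rules under the assumptions stated in the lead-in."""
    # Number of sample URLs each distinct rule matches.
    counts = {r: sum(1 for u in urls if re.fullmatch(r, u)) for r in set(rules)}
    # Most frequently matching rules first; ties broken by rule text.
    ordered = sorted(counts, key=lambda r: (-counts[r], r))
    total = sum(counts.values())

    primary, remaining, covered = [], [], 0
    for rule in ordered:
        if covered < share * total:
            primary.append(rule)  # part of pattern set A
            covered += counts[rule]
        else:
            remaining.append(rule)  # part of set B

    kept = list(primary)
    for rule in remaining:
        matched = [u for u in urls if re.fullmatch(rule, u)]
        # Keep B only if some URL it matches is not covered by retained rules.
        if not all(any(re.fullmatch(k, u) for k in kept) for u in matched):
            kept.append(rule)
    return kept
```

Judging redundancy by coverage of the sample URLs avoids having to decide containment between two regular expressions in general, at the cost of depending on how representative the sample is.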
Two examples are given below:
Example 1: querying a domain name that exists in the global program identification table and returning the identification information.
The query information is "https://www.bilibili.com/".
The query interface finds that the domain name exists in the global program identification table and that the website is an audio-video website, and returns the program identification information: {' https:// www.bilibili.com/video/BV 1:// cm.bilibilili.com/cm/api/fets/pc/sync/v 2 }
The query result is returned to the user.
Example 2: querying a domain name that does not exist in the global program identification table, reporting it to the reporting sub-module, and returning the program identification information.
The query information is "https://www.iqiyi.com/".
The query sub-module finds that the domain name does not exist in the global program identification table, reports the information, and notifies the user to wait for the judgment result.
The homepage "https://www.iqiyi.com/" is accessed, and the sub-page URLs in its <a> tags are collected to obtain a candidate sub-page URL list. The candidate list is filtered so that every URL is accessible and has the main domain name "iqiyi", giving the filtered sub-page URL list.
The URLs in the filtered list are actively accessed, and each page is checked for specific elements to judge whether it contains an audio-video program; if so, the sub-page URL is added to the audio-video sub-URL list.
The website program identification is calculated: html, { ' https:// www.iqiyi.com/v _, html, } ' https:// www.iqiyi.com/kszt #, html, } ' https:// live iqiyi.com/w/.
The website information and the program identification are updated into the global program identification table.
The website information and the program identification are returned to the user.
The above embodiments merely illustrate the technical solution of the invention and do not limit it. Those skilled in the art may modify the technical solution or substitute equivalents for it, and the scope of protection of the invention is defined by the claims.

Claims (9)

1. An active-passive combined audio-video website user entry identification method, characterized by comprising the following steps:
1) Acquiring a domain name input by a user and querying whether the domain name exists in a global program identification table; if so, returning the query result to the user;
2) If the domain name does not exist in the global program identification table, accessing the homepage of the domain name's website to obtain the candidate sub-page URLs in the homepage;
3) Actively accessing the candidate sub-page URLs, judging whether each page contains an audio-video program, and extracting all sub-page URLs containing audio-video programs;
4) Judging whether the website is an audio-video website according to the number of candidate sub-page URLs and the ratio of the number of sub-page URLs containing audio-video programs to the number of candidate sub-page URLs; if the website is an audio-video website, acquiring its program identifier and updating the website information and the program identifier into the global program identification table; if it is not, updating only the website information into the global program identification table; the website information comprises the domain name and whether the website is an audio-video website; the step of obtaining the program identifier of the audio-video website comprises: parsing all sub-page URLs containing audio-video programs and splitting each into three parts, namely the protocol part (scheme), the server address (netloc) and the file path (path); comparing the URLs pairwise and, when the scheme and netloc of two URLs are the same, calculating the longest common subsequence of their paths by dynamic programming; combining the longest common subsequence with the scheme and netloc and replacing the unmatched parts with a wildcard to generate a matching rule, thereby obtaining a regular expression; reducing all the regular expressions obtained to remove redundant matching rules, thereby obtaining the program identifier;
5) Returning the query result to the user; the query result states whether the website is an audio-video website and, for an audio-video website, also includes the program identifier.
2. The method of claim 1, wherein the candidate sub-page URLs are all links contained in <a> tags in the homepage source code.
3. The method of claim 1, wherein the candidate sub-page URLs are first filtered, retaining only sub-page URLs that are accessible and share the main domain name of the website homepage; the filtered candidate sub-page URLs are then actively accessed.
4. The method of claim 1, wherein the candidate sub-page URLs and the sub-page URLs containing audio-video programs are stored as lists.
5. The method of claim 1, wherein all the regular expressions obtained are reduced as follows: the regular expressions whose match counts account for 80% of the total number of matches are taken as the pattern strings A, and the remaining 20% are taken as the strings B to be matched; if a B can be represented by A, the regular expression in B is eliminated; otherwise the matching rule in B is retained and the next round of matching and reduction is performed.
6. The method of claim 1, wherein whether an audio-video program exists in a page is judged by whether specific elements exist in the page, the specific elements appearing when the page contains an audio-video program or an embedded Flash animation.
7. The method of claim 1, wherein the website is judged to be an audio-video website if the number of candidate sub-page URLs is not less than 20 and the ratio of the number of sub-page URLs containing audio-video programs to the number of candidate sub-page URLs is not less than 50%.
8. An active and passive combined audio and video website user entry identification system, comprising:
a user query subsystem, configured to receive a domain name entered by a user and to send a query request to the query sub-module to obtain a query result, wherein the query result indicates whether the website is an audio and video website and, for an audio and video website, further includes the program identifier; and
an audio and video program identification subsystem, comprising a query sub-module, a reporting sub-module, and a global program identification table, wherein
the query sub-module is configured to look up the domain name entered by the user in the global program identification table and to send the query result to the user query subsystem;
the reporting sub-module is configured to: when the domain name entered by the user does not exist in the global program identification table, access the homepage of the website of the domain name and acquire the candidate sub-page URLs in the homepage; actively access the candidate sub-page URLs, determine whether an audio and video program exists in each page, and extract all sub-page URLs containing audio and video programs; determine whether the website is an audio and video website according to the ratio between the number of sub-page URLs containing audio and video programs and the number of candidate sub-page URLs; if the website is an audio and video website, obtain the program identifier of the audio and video website and update the website information and the program identifier into the global program identification table; if the website is not an audio and video website, update the website information into the global program identification table, the website information comprising the domain name and whether the website is an audio and video website; and return the query result to the query sub-module;
wherein obtaining the program identifier of the audio and video website comprises: parsing all sub-page URLs containing audio and video programs and dividing each URL into three parts, namely the protocol part (scheme), the server address (netloc), and the file path (path); comparing the URLs pairwise and, when two URLs have the same scheme and netloc, calculating the longest common subsequence of their paths by dynamic programming; combining the longest common subsequence with the scheme and netloc, and replacing the unmatched part with the base to generate a matching rule, thereby obtaining a regular expression; and reducing all the obtained regular expressions by removing redundant matching rules to obtain the program identifier.
9. The system of claim 8, wherein the reporting sub-module is further configured to filter the candidate sub-page URLs, retaining those sub-page URLs that share the main domain name of the website homepage and are accessible, and then to actively access the filtered candidate sub-page URLs.
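By way of non-limiting illustration only, the decision rule of claim 7 and the rule-generation step recited in claim 8 may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names `is_av_website`, `path_rule` and `make_rule` are hypothetical, the placeholder substituted for unmatched parts is assumed to be the regular-expression wildcard `.*`, any query string in a URL is ignored, and the `example.com` URLs in the usage note are invented for illustration.

```python
import re
from urllib.parse import urlsplit

WILDCARD = ".*"  # assumption: the claims do not name the placeholder pattern


def is_av_website(num_candidates, num_with_program):
    """Claim 7 thresholds: >= 20 candidate URLs and >= 50% of them with a program."""
    return num_candidates >= 20 and num_with_program / num_candidates >= 0.5


def path_rule(a, b):
    """Regex for two paths: LCS characters kept literally, gaps become WILDCARD."""
    m, n = len(a), len(b)
    # dp[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table once, emitting matched characters and collapsing gaps.
    out = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(re.escape(a[i]))
            i += 1
            j += 1
        else:
            if not out or out[-1] != WILDCARD:
                out.append(WILDCARD)
            if dp[i + 1][j] >= dp[i][j + 1]:
                i += 1
            else:
                j += 1
    # Leftover characters in either path form a trailing gap.
    if (i < m or j < n) and (not out or out[-1] != WILDCARD):
        out.append(WILDCARD)
    return "".join(out)


def make_rule(url_a, url_b):
    """Return a regex covering both URLs, or None if scheme or netloc differ."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    if (a.scheme, a.netloc) != (b.scheme, b.netloc):
        return None
    return re.escape(a.scheme) + "://" + re.escape(a.netloc) + path_rule(a.path, b.path)
```

For instance, `make_rule("http://tv.example.com/v/123.html", "http://tv.example.com/v/45678.html")` yields a rule that also matches other pages under `/v/` ending in `.html`. Applying `make_rule` to every pair of program-page URLs produces the candidate regular expressions that the reduction step then prunes.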
CN202011001392.5A 2020-09-22 2020-09-22 Active-passive combination-based audio and video website user entry identification method and system Active CN112347334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011001392.5A CN112347334B (en) 2020-09-22 2020-09-22 Active-passive combination-based audio and video website user entry identification method and system

Publications (2)

Publication Number Publication Date
CN112347334A (en) 2021-02-09
CN112347334B (en) 2023-05-23

Family

ID=74357713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011001392.5A Active CN112347334B (en) 2020-09-22 2020-09-22 Active-passive combination-based audio and video website user entry identification method and system

Country Status (1)

Country Link
CN (1) CN112347334B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101635826B (en) * 2008-07-21 2011-02-09 中国科学院计算技术研究所 Method for acquiring addresses of network video programs
US20130013583A1 (en) * 2011-05-30 2013-01-10 Lei Yu Online video tracking and identifying method and system
CN105635038B (en) * 2014-10-27 2018-08-21 任子行网络技术股份有限公司 A kind of method and system for screening audio and video website
CN106535006B (en) * 2016-12-12 2019-04-09 朝阳聚声泰(信丰)科技有限公司 A kind of audio-video identifying system Internet-based and its method
CN108959382B (en) * 2018-05-30 2021-06-18 维沃移动通信有限公司 Audio and video detection method and mobile terminal
CN110110252B (en) * 2019-05-17 2021-01-15 北京市博汇科技股份有限公司 Audio-visual program identification method, device and storage medium

Similar Documents

Publication Publication Date Title
CN104125209B (en) Malice website prompt method and router
CN109033115B (en) Dynamic webpage crawler system
CN109036417B (en) Method and apparatus for processing voice request
WO2016173200A1 (en) Malicious website detection method and system
US20040128285A1 (en) Dynamic-content web crawling through traffic monitoring
MXPA03004447A (en) A system and process for network site fragmented search.
CN109905288B (en) Application service classification method and device
CN102857369B (en) Website log saving system, method and apparatus
US20040030681A1 (en) System and process for network site fragmented search
CN102436564A (en) Method and device for identifying falsified webpage
CN107977678B (en) Method and apparatus for outputting information
CN105743730A (en) Method and system used for providing real-time monitoring for webpage service of mobile terminal
CN114024728B (en) Honeypot building method and application method
CN102185830B (en) A kind of method and system of security filtration of network television browser
CN101727471A (en) Website content retrieval system and method
KR101503268B1 (en) Symantic client, symantic information management server, method for generaing symantic information, method for searching symantic information and computer program recording medium for performing the methods
CN112347334B (en) Active-passive combination-based audio and video website user entry identification method and system
CN106959975B (en) Transcoding resource cache processing method, device and equipment
CN102571757B (en) Method and system for providing web services
KR20190072619A (en) Hash-based dynamic restrictions on information resources
RU2530671C1 (en) Checking method of web pages for content in them of target audio and/or video (av) content of real time
CN107066510B (en) Information processing method and device
KR102483004B1 (en) Method for detecting harmful url
CN105190598A (en) Resource reference classification
CN106156024B (en) Information processing method and server

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant