CN110413842B - Content auditing method, system, electronic equipment and medium based on public opinion situation perception - Google Patents

Info

Publication number
CN110413842B
Authority
CN
China
Prior art keywords
content
unit
examination
review
manual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910690883.6A
Other languages
Chinese (zh)
Other versions
CN110413842A (en
Inventor
李小松
李�浩
李元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaochuan Online Network Technology Co ltd
Original Assignee
Beijing Xiaochuan Online Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaochuan Online Network Technology Co ltd filed Critical Beijing Xiaochuan Online Network Technology Co ltd
Priority to CN201910690883.6A priority Critical patent/CN110413842B/en
Publication of CN110413842A publication Critical patent/CN110413842A/en
Application granted granted Critical
Publication of CN110413842B publication Critical patent/CN110413842B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/735 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a content auditing method and system based on public opinion situation perception, together with electronic equipment and a computer readable medium. The method provides and/or sets up a content auditing system and a content UGC platform. The review scheduling unit receives review requests from the product client/server unit and schedules machine review and manual review according to a certain logic to complete the initial review. The review scheduling unit also receives public opinion alarms from the public opinion perception real-time monitoring unit and adjusts the priority that the manual review unit gives to specific content, so that this content is reviewed with high priority. The public opinion perception real-time monitoring unit receives data from the streaming data processing unit and, according to a set public opinion alarm threshold, sends a notification to the review scheduling unit to initiate further review. The invention can therefore reduce the workload of manual auditing and reduce auditing delay, in particular the delay for hot UGC content and comments.

Description

Content auditing method, system, electronic equipment and medium based on public opinion situation perception
Technical Field
The invention relates to the field of computer information processing, in particular to a content auditing method, system and electronic equipment based on public opinion situation perception and a readable medium.
Background
Video content review generally combines machine review with manual review: the machine performs the initial review, and manual review then serves as the backstop. At present the accuracy of machine review is insufficient, so the backstop manual review accounts for a large share of the workload. Moreover, because of the limits of current text and image technology, the recognition precision and recall offered by technology and service providers in the industry can hardly reach a level that requires no human intervention at all. To ensure legal compliance, the industry therefore relies on a large amount of manual review to keep the miss rate below a certain value, for example 1%. Under the general principle of reviewing content before it is released, the delay between a user publishing content and that content becoming visible on the platform is long, which seriously affects the user experience.
To solve these problems in the prior art, the invention discloses a content auditing method, a content auditing system, electronic equipment and a computer readable medium based on public opinion situation perception. The invention can reduce the workload of manual auditing and reduce auditing delay, especially the delay for hot UGC content and comments.
Disclosure of Invention
The invention aims to solve the problem in the prior art that low machine review accuracy leads to a heavy manual review workload and long delays, which seriously affect the user experience. The invention provides a content auditing system and method based on public opinion situation perception, electronic equipment and a computer readable medium. It improves the efficiency of manual review, reduces the manual review workload and reduces review delay. Because machine review reaches its judgment with second-level delay, a large amount of low-risk content can pass review quickly, which meets users' need for timely publishing and interaction and achieves high interactivity while remaining compliant, thereby improving the user experience.
In order to solve the above technical problems, a first aspect of the present invention provides a content auditing method based on public opinion situation awareness, including the following steps:
S1, providing and/or setting the content auditing system 100 and the content UGC platform 200;
the content auditing system 100 comprises an auditing scheduling unit 101, a machine auditing unit 102, a manual auditing unit 103 and a public opinion perception real-time monitoring unit 104;
content UGC platform 200 includes product client/server unit 201 and streaming data processing unit 202;
wherein the low risk content is exposed to different users via the distribution system;
S2, the content UGC platform 200 provides the content review requirement, and disposes of the content according to the content violation risk and the business logic;
S3, the review scheduling unit 101 receives the review request from the product client/server unit 201, and schedules the machine review unit 102 and the manual review unit 103 according to a certain logic to complete machine review and manual review, respectively, so as to implement the initial review;
S4, the review scheduling unit 101 receives the public opinion alarm from the public opinion perception real-time monitoring unit 104, and adjusts the priority of the manual review unit 103 for the specific content, thereby implementing high-priority review;
S5, the public opinion perception real-time monitoring unit 104 receives the data of the streaming data processing unit 202, and sends a notification to the review scheduling unit 101 according to the set public opinion alarm threshold to initiate further review.
Optionally, the user publishes the content on the product client/server unit 201, and the platform is also responsible for personalized recommendation of the user content, attracting the user to consume and/or interact with the content, and further enhancing the generation of the UGC content.
Optionally, the streaming data processing unit 202 collects a log of the user's behavior at the product client and transmits it to the downstream system in real time.
Optionally, the machine review unit 102 completes the technical compliance review of the various forms of content of the user;
the manual review unit 103 performs review on the content that the machine cannot accurately discriminate.
In order to solve the above technical problem, a second aspect of the present invention provides a content auditing system based on public opinion situation awareness, including:
a content auditing system 100 and a content UGC platform 200;
the content auditing system 100 comprises an auditing scheduling unit 101, a machine auditing unit 102, a manual auditing unit 103 and a public opinion perception real-time monitoring unit 104;
content UGC platform 200 includes product client/server unit 201 and streaming data processing unit 202;
the review scheduling unit 101 receives a review request from the product client/server unit 201, and schedules the machine review unit 102 and the manual review unit 103 to respectively complete machine review and manual review according to a certain logic, so as to realize initial review;
the review scheduling unit 101 receives the public opinion alarm from the public opinion perception real-time monitoring unit 104, and adjusts the priority of the manual review unit 103 for specific content, thereby realizing high-priority review.
The public opinion perception real-time monitoring unit 104 receives the data of the streaming data processing unit 202 and sends a notification to the review scheduling unit 101 according to the set public opinion alarm threshold to initiate further review.
The content UGC platform 200 provides content review requirements and disposes the content according to business logic and content violation risks;
wherein low risk content is exposed to different users via the distribution system.
Optionally, the user publishes the content on the product client/server unit 201, and the platform is also responsible for personalized recommendation of the user content, attracting the user to consume and/or interact with the content, and further enhancing the generation of the UGC content.
Optionally, the streaming data processing unit 202 collects a log of the user's behavior at the product client and transmits it to the downstream system in real time.
Optionally, the machine review unit 102 completes the technical compliance review of the various forms of content of the user;
the manual review unit 103 performs review on the content that the machine cannot accurately discriminate.
In order to solve the above technical problem, a third aspect of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above method when executing the program.
In order to solve the above technical problem, a fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, the program being executable by a processor to implement the above method.
The invention provides and/or sets a content auditing system and a content UGC platform, and the review scheduling unit schedules machine review and manual review according to a certain logic; the public opinion perception real-time monitoring unit receives data from the streaming data processing unit of the content UGC platform and sends a notification to the review scheduling unit according to a set public opinion alarm threshold.
In this way, the system can reduce the manual review workload while also reducing review delay, particularly for hot UGC content and comments. The review scheduling unit can identify the small part of a large volume of UGC content that is really likely to have an adverse effect, which greatly reduces the dependence on manual review before publication. About 10-15% of UGC content may carry risk, but the proportion that actually gains exposure is less than 1%, and the proportion whose exposure count exceeds 1000 falls further, to within 0.5%. Because machine review reaches its judgment with second-level delay, a large amount of low-risk content can pass review quickly, which meets users' need for timely publishing and interaction and achieves high interactivity while remaining compliant. The workflow efficiency of manual review is also greatly optimized. Conventional manual review takes place before the content is actually exposed, so the value of a piece of content cannot be judged effectively and the review delay is essentially the same for content of different value. Through the evaluation of content risk level by the public opinion perception real-time monitoring unit and the review scheduling unit, the priority of specific content in the manual review queue can be adjusted in real time, which ensures high-quality review of important content.
Drawings
In order to make the technical problems solved by the present invention, the technical means adopted and the technical effects obtained more clear, the following will describe in detail the embodiments of the present invention with reference to the accompanying drawings. It should be noted, however, that the drawings described below are only illustrations of exemplary embodiments of the invention, from which other embodiments can be derived by those skilled in the art without inventive step.
Fig. 1 is a flowchart illustrating a content auditing method based on public opinion situation awareness according to an embodiment of the present invention.
Fig. 2 is a schematic diagram illustrating a content auditing system based on public opinion situation awareness according to an embodiment of the present invention.
Fig. 3 is a block diagram of an exemplary embodiment of an electronic device according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of one embodiment of a computer-readable medium according to the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments of the invention may be embodied in many specific forms, and should not be construed as limited to the embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the invention to those skilled in the art.
The structures, properties, effects or other characteristics described in a certain embodiment may be combined in any suitable manner in one or more other embodiments, while still complying with the technical idea of the invention.
In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.
The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.
The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and thus, a repetitive description thereof may be omitted hereinafter. It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms. That is, these terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include all combinations of any one or more of the listed items.
Fig. 1 is a flowchart illustrating a content auditing method based on public opinion situation awareness according to an embodiment of the present invention. A content auditing method based on public opinion situation perception comprises the following steps:
S1, providing and/or setting the content auditing system 100 and the content UGC platform 200;
the content auditing system 100 comprises an auditing scheduling unit 101, a machine auditing unit 102, a manual auditing unit 103 and a public opinion perception real-time monitoring unit 104;
content UGC platform 200 includes product client/server unit 201 and streaming data processing unit 202;
wherein low risk content is exposed to different users via the distribution system.
The following examples detail the UGC platform.
UGC (User Generated Content) means that users present their original content (text, video, voice, etc.) on an internet platform or provide it to other users. A UGC website mainly refers to an online social networking site, online forum or similar site built on user generated content, such as Twitter, Google+ and Facebook, which have become mainstream network platforms for information sharing. UGC data can supplement engine results while extending readability.
For example, about 10-15% of UGC content may carry risk, but the proportion that actually gains exposure is less than 1%, and the proportion whose exposure count exceeds 1000 falls further, to within 0.5%.
The content auditing system 100 receives content review requirements from the product side, for example posts and comments in a content community, which are mainly user UGC. It returns whether the content violates the rules together with a product handling strategy, for example deleting violating content, making content visible to users, or allowing large-scale exposure.
S2, the content UGC platform 200 provides the content examination requirement, and disposes the content according to the content violation risk and the business logic.
The content UGC platform 200 is the party that requests content review. It handles content according to business logic, based mainly on content violation risk, and low-risk content is exposed to varying numbers of users through the distribution system.
The content UGC platform obtains information and/or videos related to known keywords, including information, videos, pictures and/or sounds, from one or more predetermined user generated content (UGC) platforms. The known keywords may be set by the user, for example sensitive words to be reviewed, target words, features, pictures or video frames.
Product client/server unit 201: on a UGC platform similar to Zuiyou or Pipi, users publish content on the platform, and the platform is also responsible for personalized recommendation of user content and for attracting users to consume and interact with the content, which further increases the production of UGC content.
The streaming data processing unit 202 may collect a behavior log of the user at the product client, for example exposure, click, play, comment, like and share actions on content, and transmit the behavior log to a downstream system in real time.
In Web applications, data such as user visits and user clicks arrive as streams. The streaming data is processed in batches, for example batch 1, batch 2 and so on, and each batch is stored in a different level of the in-memory cache kernel; streaming data that has already been accessed or registered may be cleared. A tree index structure over the streaming data is built in the memory cache layer, and the data is identified by, for example, access status, registration status, click frequency and attribute features, and is then modeled, sorted, analyzed and filtered. The streaming data is also cleaned and integrated. Because the streaming data processing unit 202 embeds the in-memory cache kernel and processes data as it is received, it effectively relieves the access pressure of large-scale data, reduces the read/write pressure of streaming data, speeds up real-time processing and improves timeliness.
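The batch handling of behavior logs described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the event schema (`BehaviorEvent`), the class name `StreamBatchCache` and its methods are hypothetical, and a flat in-memory dictionary of counters stands in for the tree index structure and multi-level cache that the description mentions.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class BehaviorEvent:
    """One user-behavior log record (hypothetical schema)."""
    content_id: str
    action: str  # e.g. "exposure", "click", "play", "comment", "like", "share"


class StreamBatchCache:
    """Keeps per-content action counts in memory, updated one batch at a time."""

    def __init__(self) -> None:
        # content_id -> Counter of action name -> occurrences seen so far
        self._counts: Dict[str, Counter] = defaultdict(Counter)

    def ingest_batch(self, batch: Iterable[BehaviorEvent]) -> None:
        """Fold one batch of events into the in-memory counters."""
        for event in batch:
            self._counts[event.content_id][event.action] += 1

    def count(self, content_id: str, action: str) -> int:
        """Return how often `action` was seen for `content_id` (0 if unknown)."""
        counter = self._counts.get(content_id)
        return counter[action] if counter else 0

    def evict(self, content_id: str) -> None:
        """Clear cached data for content that no longer needs tracking."""
        self._counts.pop(content_id, None)
```

A downstream consumer, such as a monitoring component, could call `count(content_id, "exposure")` after each batch to read the current exposure total without touching persistent storage.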
S3, the review scheduling unit 101 receives the review request from the product client/server unit 201, and schedules the machine review unit 102 and the manual review unit 103 according to a certain logic to complete machine review and manual review, respectively, so as to implement the initial review.
The following example details how the initial review is implemented.
The review scheduling unit 101 receives a review request from the product client/server unit 201 and, according to a certain logic, schedules the machine review unit 102 and the manual review unit 103 to carry out machine review and manual review respectively, completing the initial review. For content that machine review judges to be lower risk, some risk may still actually exist, and the product client/server unit 201 may be notified to make the content visible to users in advance.
For example, the machine review unit 102 can reach a judgment with second-level delay, so that a large amount of low-risk content can pass review quickly, which meets users' need for timely publishing and interaction and achieves high interactivity while remaining compliant.
The machine review unit 102 completes the technical compliance review of the various forms of user content. By type of illegal or harmful content, the review can be divided into politically sensitive content, violence and terrorism, pornography, vulgarity, advertisements, other illegal or harmful content, and so on; by media type, the models are divided into text models, image models and video models.
The manual review unit 103 reviews the content that the machine cannot judge accurately. In terms of system function, it involves scheduling the content to be reviewed and rechecking the correctness of review results.
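As an illustration of how the three units might cooperate during the initial review, the following Python sketch routes each item through a machine verdict first and queues only the uncertain cases for manual review. The risk score, both thresholds and all names (`MachineVerdict`, `initial_review`, `pass_below`, `reject_at`) are assumptions made for this example; the description itself only says that scheduling follows "a certain logic".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Outcome(Enum):
    PASSED = "passed"
    REJECTED = "rejected"
    SENT_TO_MANUAL = "sent_to_manual"


@dataclass
class MachineVerdict:
    category: str  # e.g. "pornography", "advertisement", or "none"
    risk: float    # hypothetical risk score in [0, 1]


def initial_review(
    content: str,
    machine_review: Callable[[str], MachineVerdict],
    manual_queue: List[str],
    pass_below: float = 0.2,  # assumed threshold, not from the source
    reject_at: float = 0.9,   # assumed threshold, not from the source
) -> Outcome:
    """Run machine review, then queue the content for manual review if unsure."""
    verdict = machine_review(content)
    if verdict.risk >= reject_at:
        return Outcome.REJECTED
    if verdict.risk < pass_below:
        # Low-risk content can be made visible right away.
        return Outcome.PASSED
    # The machine cannot judge this content accurately: hand it to a human.
    manual_queue.append(content)
    return Outcome.SENT_TO_MANUAL
```

In this sketch the machine model is passed in as a callable, so text, image and video models could each be plugged in without changing the routing code.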
More specifically, a decomposition model is established to decompose the request. The requests are analyzed using a logical algorithm, and each request is decomposed into a three-dimensional operator (x, y, z). A logical relationship is established and applied to the three-dimensional operator to form its mapping (x', y', z'). The logical scheduling relationship of the request is
$$\log \alpha_n = \left(\frac{1}{x_n} + \frac{1}{y_n} + \frac{1}{z_n}\right) \times \left(x_n' + y_n' + z_n'\right),$$
where n is an integer denoting the nth request and $\alpha_n$ is the logical scheduling factor. A machine review/manual review factor $\beta$ is set according to machine learning results or the results of repeated detection. When $\alpha_n \ge \beta$, the corresponding part of the request content is scheduled for machine review; when $\alpha_n < \beta$, that part of the request content is scheduled for manual review.
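The scheduling rule above can be worked through numerically with a short Python sketch. Several points are assumptions rather than statements from the description: the logarithm is taken to be the natural logarithm, the operator components x, y, z are assumed to be nonzero, β is assumed to be positive, and the function names are hypothetical. The comparison is done in the log domain, which is equivalent to comparing α_n with β when β > 0.

```python
import math
from typing import Tuple

Operator = Tuple[float, float, float]


def log_scheduling_factor(op: Operator, mapped: Operator) -> float:
    """Return log(alpha_n) = (1/x + 1/y + 1/z) * (x' + y' + z')."""
    x, y, z = op            # assumed nonzero
    xm, ym, zm = mapped     # the mapping (x', y', z')
    return (1.0 / x + 1.0 / y + 1.0 / z) * (xm + ym + zm)


def route(op: Operator, mapped: Operator, beta: float) -> str:
    """Return "machine" when alpha_n >= beta, otherwise "manual"."""
    # alpha_n >= beta  <=>  log(alpha_n) >= log(beta) for beta > 0
    # (natural logarithm assumed).
    if log_scheduling_factor(op, mapped) >= math.log(beta):
        return "machine"
    return "manual"
```

For instance, with (x, y, z) = (1, 2, 4) and (x', y', z') = (-1, 0, 0), the right-hand side is 1.75 × (-1) = -1.75, so α_n = e^(-1.75) is below β = 1 and the request part would go to manual review under these assumptions.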
Optionally, the user publishes the content on the product client/server unit 201, and the platform is also responsible for personalized recommendation of the user content, attracting the user to consume and/or interact with the content, and further enhancing the generation of the UGC content.
S4, the review scheduling unit 101 receives the public opinion alarm from the public opinion perception real-time monitoring unit 104, and adjusts the priority of the manual review unit 103 for the specific content, thereby implementing high-priority review.
The following example details how the public opinion perception real-time monitoring unit works.
The review scheduling unit receives a public opinion alarm, for example for high-exposure content, from the public opinion perception real-time monitoring unit 104, and adjusts the priority of the manual review unit for the specific content to realize high-priority review.
A public opinion analysis model is used to perform public opinion analysis on the sample and obtain a corresponding public opinion score. The public opinion score then intervenes in the logical scheduling relationship of the request. The public opinion score is denoted $\Omega$.
The logical scheduling relationship after intervention by the public opinion score is:
$$\Omega \cap \log \alpha_n = \left(\frac{1}{x_n} + \frac{1}{y_n} + \frac{1}{z_n}\right) \times \left(x_n' + y_n' + z_n'\right).$$
The priority of the manual review unit for the specific content is then readjusted to realize high-priority review.
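One way to picture the priority adjustment is a manual review queue whose entries can be promoted when an alarm arrives. The sketch below is only an illustration under stated assumptions: it uses Python's standard `heapq` module with lazy invalidation of stale entries, lower numbers mean earlier review, and the class and method names (`ManualReviewQueue`, `on_alert`) are hypothetical.

```python
import heapq
import itertools
from typing import Dict, List, Optional, Tuple


class ManualReviewQueue:
    """Priority queue for manual review; a lower number is reviewed sooner."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # content_id -> the (priority, sequence) pair that is currently valid
        self._live: Dict[str, Tuple[int, int]] = {}
        self._seq = itertools.count()  # tie-breaker that keeps FIFO order

    def push(self, content_id: str, priority: int) -> None:
        """Queue content, superseding any earlier entry for the same id."""
        entry = (priority, next(self._seq))
        self._live[content_id] = entry
        heapq.heappush(self._heap, (entry[0], entry[1], content_id))

    def on_alert(self, content_id: str, priority: int = 0) -> None:
        """Promote queued content, or enqueue it if it was not queued yet."""
        live = self._live.get(content_id)
        if live is None or priority < live[0]:
            self.push(content_id, priority)

    def pop(self) -> Optional[str]:
        """Return the next content id to review, skipping stale heap entries."""
        while self._heap:
            priority, seq, content_id = heapq.heappop(self._heap)
            if self._live.get(content_id) == (priority, seq):
                del self._live[content_id]
                return content_id
        return None
```

Pushing a new entry instead of editing the heap in place keeps each operation at O(log n); the outdated entry is simply discarded when it reaches the top.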
S5, the public opinion perception real-time monitoring unit 104 receives the data of the streaming data processing unit 202, and sends a notification to the review scheduling unit 101 according to the set public opinion alarm threshold to initiate further review.
Specifically, a neural network model is trained on samples. The neural network model may be a CNN, RNN, DNN, LSTM or any other neural network. Training samples may be text, images, video, sound or other elements; a deep learning model is trained on them to obtain the public opinion analysis model, which is then used to score a sample and obtain its public opinion score.
A public opinion alarm threshold is set according to the content to be processed. The result of the logical scheduling relationship after intervention by the public opinion score is compared with the public opinion alarm threshold; if the result is greater than the threshold, the next review is initiated automatically. That review may be manual, and in particular may be a careful manual review. This step reduces delay because the next review is triggered automatically.
Optionally, the streaming data processing unit 202 collects a log of the user's behavior at the product client and transmits it to the downstream system in real time.
Optionally, the public opinion perception real-time monitoring unit 104 receives the data from the streaming data processing unit 202 and, according to a set public opinion alarm threshold, for example an exposure count exceeding 2000, sends a notification to the review scheduling unit 101 to initiate further review.
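The threshold-based notification can be sketched in Python as follows. The example uses the exposure-count threshold of 2000 mentioned above as its default; everything else is an assumption for illustration, including the `(content_id, action)` event shape, the `OpinionMonitor` class name, and the choice to alert only once per content item.

```python
from collections import Counter
from typing import Callable, Iterable, Set, Tuple


class OpinionMonitor:
    """Counts exposures per content item and notifies once past a threshold."""

    def __init__(self, notify: Callable[[str], None], threshold: int = 2000) -> None:
        self._notify = notify          # e.g. a callback into the review scheduler
        self._threshold = threshold
        self._exposures: Counter = Counter()
        self._alerted: Set[str] = set()

    def consume(self, events: Iterable[Tuple[str, str]]) -> None:
        """Process (content_id, action) pairs from the streaming unit."""
        for content_id, action in events:
            if action != "exposure":
                continue
            self._exposures[content_id] += 1
            # "Exceeds" is read as strictly greater than the threshold.
            if (self._exposures[content_id] > self._threshold
                    and content_id not in self._alerted):
                self._alerted.add(content_id)
                self._notify(content_id)
```

Wiring `notify` to a scheduler method that raises the manual review priority of the content would connect this monitor to the high-priority review described in S4.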
Those skilled in the art will appreciate that all or part of the steps for implementing the above-described embodiments are implemented as programs executed by data processing apparatuses (including computers), i.e., computer programs. When the computer program is executed, the method provided by the invention can be realized. Furthermore, the computer program may be stored in a computer readable storage medium, which may be a readable storage medium such as a magnetic disk, an optical disk, a ROM, a RAM, or a storage array composed of a plurality of storage media, such as a magnetic disk or a magnetic tape storage array. The storage medium is not limited to centralized storage, but may be distributed storage, such as cloud storage based on cloud computing.
Embodiments of the apparatus of the present invention are described below, which may be used to perform method embodiments of the present invention. The details described in the device embodiments of the invention should be regarded as complementary to the above-described method embodiments; reference is made to the above-described method embodiments for details not disclosed in the apparatus embodiments of the invention.
Fig. 2 is a schematic diagram illustrating a content auditing system based on public opinion situation awareness according to an embodiment of the present invention. The system comprises:
a content auditing system 100 and a content UGC platform 200;
the content auditing system 100 comprises an auditing scheduling unit 101, a machine auditing unit 102, a manual auditing unit 103 and a public opinion perception real-time monitoring unit 104;
content UGC platform 200 includes product client/server unit 201 and streaming data processing unit 202;
the review scheduling unit 101 receives the review request from the product client/server unit 201, and schedules the machine review unit 102 and the manual review unit 103 to complete machine review and manual review respectively according to a certain logic, so as to implement initial review.
The review scheduling unit 101 receives the public opinion alert from the public opinion perception real-time monitoring unit 104, and adjusts the priority with which the manual review unit 103 handles specific content, thereby realizing high-priority review.
The public opinion perception real-time monitoring unit 104 receives the data of the streaming data processing unit 202 and sends a notification to the review scheduling unit 101 according to the set public opinion alarm threshold to initiate further review.
The content UGC platform 200 provides content review requirements and disposes the content according to business logic and content violation risks;
wherein low risk content is exposed to different users via the distribution system.
Optionally, users publish content through the product client/server unit 201; the platform is also responsible for personalized recommendation of user content, attracting users to consume and/or interact with the content and thereby further promoting the generation of UGC.
Optionally, the streaming data processing unit 202 collects a log of the user's behavior at the product client and transmits it to the downstream system in real time.
Optionally, the machine review unit 102 completes the technical compliance review of the various forms of content of the user;
the manual review unit 103 performs review on the content that the machine cannot accurately discriminate.
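As a non-limiting illustration, the cooperation of the review scheduling unit 101, the machine review unit 102 and the manual review unit 103 may be sketched with a priority queue. The class `ReviewScheduler`, the verdict labels `"pass"`, `"reject"` and `"uncertain"`, and the two priority levels are assumptions made for this sketch; the description only states that scheduling follows "a certain logic" and that an alert adjusts the manual review priority of specific content.

```python
import heapq
import itertools
from typing import Callable, Optional

NORMAL, HIGH = 1, 0  # smaller number = reviewed earlier


class ReviewScheduler:
    """Hypothetical sketch of the review scheduling unit 101."""

    def __init__(self, machine_review: Callable[[str], str]) -> None:
        # machine_review returns "pass", "reject" or "uncertain" (assumed labels)
        self._machine_review = machine_review
        self._heap: list = []            # entries: [priority, seq, content_id]
        self._entries: dict = {}         # content_id -> live heap entry
        self._seq = itertools.count()    # tie-breaker keeps FIFO order

    def submit(self, content_id: str) -> str:
        """Initial review: machine first, manual queue only when uncertain."""
        verdict = self._machine_review(content_id)
        if verdict == "uncertain":
            self._push(content_id, NORMAL)
        return verdict

    def on_alert(self, content_id: str) -> None:
        """Public opinion alert: queue the content for high-priority manual review."""
        old = self._entries.pop(content_id, None)
        if old is not None:
            old[2] = None                # lazily invalidate the old entry
        self._push(content_id, HIGH)

    def next_for_manual_review(self) -> Optional[str]:
        """Return the next content id for a human reviewer, or None if idle."""
        while self._heap:
            _, _, content_id = heapq.heappop(self._heap)
            if content_id is not None:   # skip invalidated entries
                del self._entries[content_id]
                return content_id
        return None

    def _push(self, content_id: str, priority: int) -> None:
        entry = [priority, next(self._seq), content_id]
        self._entries[content_id] = entry
        heapq.heappush(self._heap, entry)


verdicts = {"a": "uncertain", "b": "uncertain", "c": "pass"}
scheduler = ReviewScheduler(lambda cid: verdicts[cid])
for cid in ("a", "b", "c"):
    scheduler.submit(cid)
scheduler.on_alert("b")  # already queued, moved ahead of "a"
scheduler.on_alert("c")  # passed machine review, now needs further review
order = [scheduler.next_for_manual_review() for _ in range(4)]
```

In this sketch, content that passed machine review is only sent to manual review if an alert arrives for it, which mirrors the idea that manual effort is spent where exposure makes the risk material.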
In the content auditing system based on public opinion situation awareness of this embodiment, the above modules implement the content auditing method based on public opinion situation awareness; the implementation principle and technical effect are the same as those of the corresponding method embodiments, to whose detailed description reference may be made, and they are not repeated here.
Those skilled in the art will appreciate that the modules of the above apparatus embodiments may be distributed in the apparatus as described, or may be modified accordingly and distributed in one or more apparatuses different from those of the above embodiments. The modules of the above embodiments may be combined into one module or further split into multiple sub-modules.
In the following, embodiments of the electronic device of the present invention are described, which may be regarded as an implementation in physical form for the above-described embodiments of the method and apparatus of the present invention. Details described in the embodiments of the electronic device of the invention should be considered supplementary to the embodiments of the method or apparatus described above; for details which are not disclosed in embodiments of the electronic device of the invention, reference may be made to the above-described embodiments of the method or the apparatus.
Fig. 3 is a block diagram of an exemplary embodiment of an electronic device according to an embodiment of the present invention. The electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
Referring to fig. 3, the electronic device 310 includes a processor 311, an internal bus 312, a network interface 313, a memory 314, and a non-volatile memory 315, and may also include hardware required for other services. The processor 311 reads the corresponding computer program from the non-volatile memory 315 into the memory 314 and runs it, forming the content auditing system 100 and the content UGC platform 200 at the logical level. Of course, besides the software implementation, the present invention does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the processing flow is not limited to logic units and may also be hardware or logic devices. For example, the processor 311 may perform the steps shown in fig. 1.
Internal bus 312 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
It should be appreciated that although not shown, other hardware and/or software modules may be used in the electronic device 310, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Fig. 4 is a schematic diagram of a computer-readable medium according to an embodiment of the invention. As shown in fig. 4, the computer program may be stored on one or more computer-readable media. The computer-readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. The computer program, when executed by one or more data processing devices, causes the above-described method of the invention to be carried out, namely the content auditing method based on public opinion situation perception, in which the review scheduling unit 101 schedules machine review and manual review to realize initial review, and the public opinion perception real-time monitoring unit 104 notifies the review scheduling unit 101 according to the set public opinion alert threshold to initiate further, high-priority review.
Compared with the traditional content auditing method, the method has the following advantages:
1. The small portion of UGC content that is actually likely to cause adverse effects is identified, which greatly reduces the up-front dependence on manual review in the UGC process; about 10-15% of UGC content may carry some risk, but the proportion that actually obtains exposure is less than 1%, and the proportion that obtains more than 1000 exposures falls further to within 0.5%.
2. Because machine review reaches a decision with a delay on the order of seconds, a large amount of low-risk content passes review quickly, which meets users' needs for timely publishing and interaction and achieves high interactivity while remaining compliant.
3. The efficiency of the manual review flow is greatly optimized. Conventional manual review is placed before the content is actually exposed, so the value of a piece of content cannot be judged effectively and the review delay is essentially the same for content of different value. Through the public opinion perception real-time monitoring unit 104 and the evaluation of the content risk level by the review scheduling unit 101, the priority of specific content in the manual review queue can be adjusted in real time, ensuring high-quality review of important content.
From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments of the present invention described herein may be implemented by software, or by software in combination with the necessary hardware. Therefore, the technical solution according to the embodiments of the present invention can be embodied in the form of a software product, which can be stored in a computer-readable storage medium (such as a CD-ROM, a USB disk, or a removable hard disk) or on a network, and which includes several instructions that cause a data processing device (such as a personal computer, a server, or a network device) to execute the above-mentioned method according to the present invention.
In summary, the present invention can be implemented as a method, an apparatus, an electronic device, or a computer-readable medium executing a computer program. Some or all of the functions of the present invention may be implemented in practice using a general purpose data processing device such as a microprocessor or a Digital Signal Processor (DSP).
While the foregoing embodiments describe the objects, aspects and advantages of the present invention in further detail, it should be understood that the present invention is not inherently tied to any particular computer, virtual machine or electronic device, and various general-purpose machines may be used to implement it. The invention is not limited to the specific embodiments described; all modifications, changes and equivalents that come within the spirit and scope of the invention are intended to be covered.

Claims (4)

1. A content auditing method based on public opinion situation perception comprises the following steps:
s1, providing and/or setting a content auditing system (100) and a content UGC platform (200);
the content auditing system (100) comprises an auditing scheduling unit (101), a machine auditing unit (102), a manual auditing unit (103) and a public opinion perception real-time monitoring unit (104);
the content UGC platform (200) comprises a product client/server unit (201) and a streaming data processing unit (202);
wherein the low risk content is exposed to different users via the distribution system;
s2, the content UGC platform (200) provides the content examination requirement, and disposes the content according to the content violation risk and the business logic;
s3, the examination scheduling unit (101) receives the examination request from the product client/server unit (201), and schedules the machine examination unit (102) and the manual examination unit (103) to respectively complete machine examination and manual examination according to a certain logic so as to realize initial examination;
s4, the review scheduling unit (101) receives the public sentiment alarm from the public sentiment perception real-time monitoring unit (104), and adjusts the priority of the manual review unit (103) to specific content to realize high-priority review;
s5, the public opinion perception real-time monitoring unit (104) receives the data of the streaming data processing unit (202), and sends a notice to the examination scheduling unit (101) according to the set public opinion alarm threshold value to initiate further examination;
the user publishes the content on the product client/server unit (201), and the platform is simultaneously responsible for personalized recommendation of the user content, attracting the user to consume and/or interact with the content, and further promoting the generation of UGC content;
the streaming data processing unit (202) collects behavior logs of users at product clients and transmits the behavior logs to a downstream system in real time;
the machine examination unit (102) completes technical compliance examination of various forms of content of the user;
the manual review unit (103) is used for reviewing the contents which cannot be accurately judged by the machine;
establishing a decomposition model, decomposing the requests, analyzing the requests by using a logical algorithm, decomposing each request into three-dimensional operators $(x, y, z)$, establishing a logical relationship, operating the three-dimensional operators to form a mapping $(x', y', z')$ of the three-dimensional operators, and obtaining a logical scheduling relationship of the requests
$$\log \alpha_n = \left(\frac{1}{x_n} + \frac{1}{y_n} + \frac{1}{z_n}\right) \times \left(x_n' + y_n' + z_n'\right),$$
where $n$ is an integer representing the $n$-th request and $\alpha_n$ represents a logic scheduling factor; a machine examination/manual examination factor $\beta$ is set according to the machine learning result or the result of detecting for multiple times; when $\alpha_n \geq \beta$, the part of the content of the request is scheduled for machine review; when $\alpha_n < \beta$, the part of the content of the request is scheduled for manual review.
2. A content auditing system based on public opinion situation awareness, comprising:
a content auditing system (100) and a content UGC platform (200);
the content auditing system (100) comprises an auditing scheduling unit (101), a machine auditing unit (102), a manual auditing unit (103) and a public opinion perception real-time monitoring unit (104);
the content UGC platform (200) comprises a product client/server unit (201) and a streaming data processing unit (202);
the examination scheduling unit (101) receives an examination request from the product client/server unit (201), and schedules the machine examination unit (102) and the manual examination unit (103) to respectively complete machine examination and manual examination according to a certain logic so as to realize initial examination;
the review scheduling unit (101) receives the public sentiment alarm from the public sentiment perception real-time monitoring unit (104), and adjusts the priority of the manual review unit (103) on specific content to realize high-priority review;
the public opinion perception real-time monitoring unit (104) receives data of the streaming data processing unit (202), and sends a notice to the examination scheduling unit (101) according to a set public opinion alarm threshold value to initiate further examination;
the content UGC platform (200) provides content review requirements, and disposes the content according to business logic and content violation risks;
the platform is simultaneously responsible for personalized recommendation of user content, attracts the user to consume and/or interact with the content, further promotes generation of UGC content, and the streaming data processing unit (202) collects behavior logs of the user at the product client and transmits the behavior logs to a downstream system in real time;
the machine examination unit (102) completes technical compliance examination of various forms of content of the user;
the manual review unit (103) is used for reviewing the contents which cannot be accurately judged by the machine;
establishing a decomposition model, decomposing the requests, analyzing the requests by using a logical algorithm, decomposing each request into three-dimensional operators $(x, y, z)$, establishing a logical relationship, operating the three-dimensional operators to form a mapping $(x', y', z')$ of the three-dimensional operators, and obtaining a logical scheduling relationship of the requests
$$\log \alpha_n = \left(\frac{1}{x_n} + \frac{1}{y_n} + \frac{1}{z_n}\right) \times \left(x_n' + y_n' + z_n'\right),$$
where $n$ is an integer representing the $n$-th request and $\alpha_n$ represents a logic scheduling factor; a machine examination/manual examination factor $\beta$ is set according to the machine learning result or the result of detecting for multiple times; when $\alpha_n \geq \beta$, the part of the content of the request is scheduled for machine review; when $\alpha_n < \beta$, the part of the content of the request is scheduled for manual review.
3. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of claim 1.
4. A computer-readable storage medium, on which a computer program is stored which is executable by a processor to implement the method as claimed in claim 1.
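As a non-limiting illustration, the logical scheduling relationship recited in claims 1 and 2, log α_n = (1/x_n + 1/y_n + 1/z_n) × (x_n' + y_n' + z_n'), may be evaluated as sketched below. The claims do not state the base of the logarithm, how the operators (x, y, z) are derived from a request, or how the mapping (x', y', z') is computed; the natural logarithm and the numeric values used here are assumptions chosen purely for illustration, as are the function names.

```python
import math
from typing import Tuple

Triple = Tuple[float, float, float]


def scheduling_factor(ops: Triple, mapped: Triple) -> float:
    """Return alpha_n from log(alpha_n) = (1/x + 1/y + 1/z) * (x' + y' + z').

    The natural logarithm is an assumption; the claims do not give the base.
    """
    x, y, z = ops
    if 0 in (x, y, z):
        raise ValueError("operators x, y, z must be non-zero")
    log_alpha = (1 / x + 1 / y + 1 / z) * sum(mapped)
    return math.exp(log_alpha)


def route(ops: Triple, mapped: Triple, beta: float) -> str:
    """Machine review when alpha_n >= beta, manual review otherwise."""
    return "machine" if scheduling_factor(ops, mapped) >= beta else "manual"


# Hypothetical operators and mappings for two requests, with beta = 5.
beta = 5.0
first = route((1, 2, 4), (0.5, 0.25, 0.25), beta)   # log(alpha) = 1.75 * 1.0
second = route((1, 2, 4), (0.2, 0.2, 0.1), beta)    # log(alpha) = 1.75 * 0.5
```

With these illustrative values, the first request has α ≈ 5.75 and is routed to machine review, while the second has α ≈ 2.40 and is routed to manual review.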
CN201910690883.6A 2019-07-29 2019-07-29 Content auditing method, system, electronic equipment and medium based on public opinion situation perception Active CN110413842B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910690883.6A CN110413842B (en) 2019-07-29 2019-07-29 Content auditing method, system, electronic equipment and medium based on public opinion situation perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910690883.6A CN110413842B (en) 2019-07-29 2019-07-29 Content auditing method, system, electronic equipment and medium based on public opinion situation perception

Publications (2)

Publication Number Publication Date
CN110413842A (en) 2019-11-05
CN110413842B (en) 2021-07-27

Family

ID=68363932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910690883.6A Active CN110413842B (en) 2019-07-29 2019-07-29 Content auditing method, system, electronic equipment and medium based on public opinion situation perception

Country Status (1)

Country Link
CN (1) CN110413842B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10944805B1 (en) * 2020-08-05 2021-03-09 Agora Lab, Inc. Scalable multi-level collaborative content moderation
CN112182502A (en) * 2020-09-07 2021-01-05 支付宝(杭州)信息技术有限公司 Compliance auditing method, device and equipment
CN114374857A (en) * 2020-10-15 2022-04-19 腾讯科技(深圳)有限公司 Content distribution method, device, server and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1870750A (en) * 2006-04-29 2006-11-29 北京北大方正电子有限公司 Alarm processing system and method
CN106447239A (en) * 2016-11-21 2017-02-22 北京字节跳动科技有限公司 Auditing method and device for data release
CN107506454A (en) * 2017-08-29 2017-12-22 央视国际网络无锡有限公司 A kind of computer version and multi-media information security automatic early-warning system
CN107943864A (en) * 2017-11-10 2018-04-20 阿基米德(上海)传媒有限公司 Safely controllable intelligent recommendation system under a kind of content of multimedia media
CN108055289A (en) * 2018-01-30 2018-05-18 深圳市富途网络科技有限公司 A kind of method and system audited to user-generated content based on internet

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8108429B2 (en) * 2004-05-07 2012-01-31 Quest Software, Inc. System for moving real-time data events across a plurality of devices in a network for simultaneous data protection, replication, and access services

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Log auditing through model-checking;M. Roger;《 Proceedings. 14th IEEE Computer Security Foundations Workshop, 2001》;IEEE;20020807;220-234 *
Survey on quality detection and control of user reviews (用户评论的质量检测与控制研究综述);Lin Yuming (林煜明);Journal of Software (《软件学报》);20131128(No. 3);506-527 *

Also Published As

Publication number Publication date
CN110413842A (en) 2019-11-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PP01 Preservation of patent right (effective date of registration: 20211129; granted publication date: 20210727)
PD01 Discharge of preservation of patent (date of cancellation: 20220602; granted publication date: 20210727)