
CN114282541A - Live broadcast platform information security detection method and device, equipment, medium and product thereof

Info

Publication number: CN114282541A
Application number: CN202111591498.XA
Authority: CN
Inventors: 朱旺南; 陈海峰; 李锦春
Original assignee: Guangzhou Jinhong Network Media Co ltd; Guangzhou Cubesili Information Technology Co Ltd
Current assignee: Guangzhou Jinhong Network Media Co ltd; Guangzhou Cubesili Information Technology Co Ltd
Priority date: 2021-12-23
Filing date: 2021-12-23
Publication date: 2022-04-05

Abstract

The application discloses a live broadcast platform information security detection method and a corresponding device, equipment, medium and product. The method comprises the following steps: receiving a public release message submitted by an application program pre-agreed with a live broadcast platform, wherein the public release message comprises characteristic information of the application program, a release user and a public release text; calling a filtering rule corresponding to the characteristic information of the application program from a preset rule base, wherein the filtering rule comprises the characteristic information of the application program, designation information of the classification dictionary adopted by the application program, and the penalty type associated with that classification dictionary; filtering the public release text according to the filtering rule, and determining one or more corresponding penalty types; and calling the penalty processing interface corresponding to the highest-level penalty type to process the public release message, and returning a corresponding notification message to the release user of the application program. The method and the device decouple the configured resources from the service logic in the information security detection process while still letting them cooperate, and can improve the information security of the live broadcast platform.

Description

Live broadcast platform information security detection method and device, equipment, medium and product thereof

Technical Field

The present application relates to the field of network live broadcast technologies, and in particular, to a live broadcast platform information security detection method, and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product.

Background

In the live streaming ecosystem there is a large amount of text service information, such as data, titles, barrage (bullet-screen comments) and comments. As the number of users and the volume of information grow sharply, spam and content subject to network supervision also proliferate, which hinders communication between users and harms the ecological health of the platform.

Filtering spam, as a primary way to improve the usability of network information, has become an important task in the internet field. Spam filtering techniques for live streaming include regular-expression filtering and machine learning filtering, but the most basic technique is keyword-based filtering.

In traditional business keyword filtering, one business is attached to several dictionaries (for example A1, B1 and C1), and different decision logic is executed depending on which dictionary is hit. Another business is attached to a further set of dictionaries (for example A1, B2 and C2, where B2 and B1, and likewise C2 and C1, are dictionaries of the same type, such as pornography dictionaries). As services increase, each service may end up with its own pornography dictionary, with much duplicated content or with keywords entered according to different criteria.

For example, in a comment service, business operations divide keywords by scenario into a penalty keyword library, a delete-comment keyword library, a hide-comment keyword library, and so on, and keyword filtering matches against each of these libraries in turn. The business then adds words to each library according to the intended handling means. The advantage is that the scheme is intuitive and entries can be added as needed. The disadvantages are more obvious: 1. words of many types can coexist in a single library, for example the penalty library may contain words with seriously pornographic attributes alongside words with malicious abuse attributes; 2. a single word may be added repeatedly to multiple libraries; 3. when a new service is added, the libraries are not universal and cannot be reused (for example, the penalty keywords for comments and for barrage overlap but are not identical), so the new service must maintain a new dictionary; 4. when a new penalty means is added, the libraries may have to be split and recombined.

The other traditional keyword filtering approach, again taking a comment business scenario as an example, directly monitors content types such as pornography and vulgarity: a word library is established for each supervised type, and each library corresponds to a different handling means. When filtering keywords, the corresponding handling means is executed according to the type that was matched. The advantage is that division by supervision scope is likewise intuitive and easy to understand. The disadvantages are more obvious: 1. a supervision word library established for one business scenario is used only by that business and cannot be shared across scenarios (for example, abusive words must not appear in comments, whereas they may be allowed in barrage), so different services must maintain different abuse word libraries; 2. word libraries and handling means are not shared (for example, abuse in comments is handled by deletion, whereas abuse in private chat is not penalized).

In the live streaming ecosystem, a live broadcast platform operates multiple application products with a variety of application scenarios, and a great many service scenarios involve text information; text filtering in different scenarios has different classification criteria and penalty modes. In addition, as entries accumulate day by day, maintaining the word libraries becomes a complex task for content operations: entries must be classified and associated with businesses manually, and configuring the various filtering rules proficiently requires considerable hands-on experience. Furthermore, business operations and content security operations have become specialized and belong to two different teams: the business team does not know the scales and standards for penalizing words, and the content security team does not know which keyword dictionaries a business needs to attach or what penalty to execute on a hit. These practical circumstances require a live broadcast platform to provide a more efficient management system suited to its own characteristics, so an improvement of the related art is expected in the industry.

Disclosure of Invention

A primary object of the present application is to solve at least one of the above problems and provide a live broadcast platform information security detection method and a corresponding apparatus, computer device, computer readable storage medium, and computer program product.

In order to meet various purposes of the application, the following technical scheme is adopted in the application:

the method for detecting the information security of the live broadcast platform, which is suitable for one of the purposes of the application, comprises the following steps:

receiving a public release message submitted by an application program pre-agreed with a live broadcast platform, wherein the public release message comprises characteristic information of the application program, a release user and a public release text;

calling a filtering rule corresponding to the characteristic information of the application program from a preset rule base, wherein the filtering rule comprises the characteristic information of the application program, classification dictionary designation information adopted by the application program and a classification dictionary penalty type;

filtering the public release text according to the filtering rule, and determining one or more corresponding penalty types when keywords in the public release text hit entries of one or more classification dictionaries specified by the designation information;

and calling the penalty processing interface corresponding to the highest-level penalty type among the determined penalty types to process the public release message, and returning a corresponding notification message to the release user of the application program.

In an embodiment, the step of receiving the public release message submitted by the application program pre-agreed with the live broadcast platform includes the following steps:

concurrently receiving message publishing requests of application programs pre-agreed with a live broadcast platform, and orderly adding the message publishing requests to a request queue;

controlling the request queue to flow out the message publishing request according to a preset dequeuing rule, and starting a consuming thread corresponding to the message publishing request;

and analyzing the message publishing request by the consuming thread to obtain the public publishing message in the message publishing request.
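The queueing steps above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes a FIFO order as the "preset dequeuing rule", a JSON request body with a hypothetical `public_release_message` field, and a `None` sentinel that ends the demo.

```python
import json
import queue
import threading


def consume(raw_request: str, sink: list, lock: threading.Lock) -> None:
    """Consuming thread body: parse one request and keep its public release message."""
    request = json.loads(raw_request)
    with lock:
        sink.append(request["public_release_message"])


def drain(request_queue: queue.Queue) -> list:
    """Dequeue requests in FIFO order and start one consuming thread per request."""
    messages, lock, threads = [], threading.Lock(), []
    while True:
        raw_request = request_queue.get()
        if raw_request is None:  # hypothetical sentinel marking the end of the demo
            break
        worker = threading.Thread(target=consume, args=(raw_request, messages, lock))
        worker.start()
        threads.append(worker)
    for worker in threads:
        worker.join()
    return messages


def demo() -> list:
    """Enqueue three illustrative requests and run them through the queue."""
    request_queue = queue.Queue()
    for i in range(3):
        body = {"public_release_message": {"app_id": "live_chat", "user_id": f"u{i}", "text": f"hello {i}"}}
        request_queue.put(json.dumps(body))
    request_queue.put(None)
    return drain(request_queue)
```

Because the consuming threads run concurrently, the order in which parsed messages are collected is not guaranteed; only the dequeue order is FIFO.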

In an embodiment, the filtering the publication text according to the filtering rule includes the following steps:

calling a named entity recognition model which is trained to be in a convergence state in advance to extract a plurality of named entities from the public release text;

matching the named entities with all preset classification dictionaries, determining the named entities hitting any entry of any one of the classification dictionaries as hit keywords obtained by publicly releasing text word segmentation, and obtaining mapping relation data between the hit keywords and the classification dictionaries;

and determining the corresponding penalty type according to the filtering rule corresponding to the classification dictionary in the mapping relation data between the hit keyword and the classification dictionary.
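The matching and penalty-determination steps above might look like the sketch below. The dictionary names, entries, penalty names and the shape of the filtering rule are all illustrative assumptions, and the named entities are passed in directly instead of being produced by a trained model.

```python
# Hypothetical classification dictionaries shared by every application program.
CLASSIFICATION_DICTIONARIES = {
    "abuse": {"idiot", "moron"},
    "advertising": {"cheap followers"},
    "violence": {"beat you up"},
}

# Hypothetical filtering rule for one application program:
# classification dictionary name -> penalty type.
FILTER_RULE = {"abuse": "mask_keyword", "advertising": "delete_text"}


def match_entities(entities, dictionaries):
    """Return mapping relation data: hit keyword -> names of the dictionaries it hits."""
    mapping = {}
    for entity in entities:
        hits = sorted(name for name, entries in dictionaries.items() if entity in entries)
        if hits:
            mapping[entity] = hits
    return mapping


def penalty_types(mapping, filter_rule):
    """Collect the penalty types the rule assigns to the dictionaries that were hit."""
    return {
        filter_rule[name]
        for names in mapping.values()
        for name in names
        if name in filter_rule  # dictionaries the rule does not designate are ignored
    }
```

For instance, `match_entities(["idiot", "cheap followers", "hello"], CLASSIFICATION_DICTIONARIES)` yields a mapping for the two hit keywords only, and `penalty_types` then resolves it to `{"mask_keyword", "delete_text"}` under the rule shown.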

In a further embodiment, the named entity recognition model performs the following steps:

vectorizing the public release text to obtain corresponding coding information;

extracting the features of the coded information to obtain a text feature vector representing deep semantic information of the coded information;

and carrying out sequence marking on the publicly issued text according to the text feature vector, and extracting each named entity in the publicly issued text.
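The last of these steps, turning a labelled sequence into named entities, can be illustrated in isolation. The sketch assumes a BIO tagging scheme (B = beginning of an entity, I = inside, O = outside), which the text does not specify; the vectorization and feature-extraction stages that would produce the tags are not modelled here.

```python
def decode_bio(tokens, tags):
    """Group tokens into entities from per-token BIO tags (assumed tagging scheme)."""
    entities, current = [], []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B"):
            if current:  # close the entity that was open
                entities.append(" ".join(current))
            current = [token]
        elif tag.startswith("I") and current:
            current.append(token)
        else:  # "O", or a stray "I" with no open entity
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities
```

With tokens `["buy", "cheap", "followers", "now", "idiot"]` and tags `["O", "B", "I", "O", "B"]`, the function returns `["cheap followers", "idiot"]`.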

In an embodiment, invoking a penalty processing interface corresponding to a highest-level penalty type of the determined penalty types to process the public release message and return a corresponding notification message to the releasing user of the application program, includes the following steps:

comparing a plurality of penalty types, and determining the penalty type with the highest level;

calling the penalty processing interface corresponding to the highest-level penalty type;

the penalty processing interface processes the publicly issued text according to preset service logic, wherein the process comprises shielding keywords of the hit entry or deleting the full text of the publicly issued text and returning a processing result;

and generating a corresponding notification message according to the processing result, and sending the notification message to the instant messaging interface of the publishing user.
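A minimal sketch of these four steps follows. The penalty names, their numeric levels, the handler functions and the notification wording are hypothetical; the text only states that penalty types have levels and that processing includes masking hit keywords or deleting the full text.

```python
# Hypothetical penalty levels: a larger number means a more severe penalty.
PENALTY_LEVELS = {"mask_keyword": 1, "delete_text": 2}


def mask_keywords(text, keywords):
    """Penalty interface: replace each hit keyword with asterisks."""
    for keyword in keywords:
        text = text.replace(keyword, "*" * len(keyword))
    return {"action": "mask_keyword", "text": text}


def delete_text(text, keywords):
    """Penalty interface: drop the full text of the message."""
    return {"action": "delete_text", "text": ""}


PENALTY_INTERFACES = {"mask_keyword": mask_keywords, "delete_text": delete_text}


def apply_highest_penalty(text, keywords, penalty_types):
    """Pick the highest-level penalty type, call its interface, and build a notification."""
    highest = max(penalty_types, key=lambda p: PENALTY_LEVELS[p])
    result = PENALTY_INTERFACES[highest](text, keywords)
    notification = f"Your message was handled with action '{result['action']}'."
    return result, notification
```

Keeping the level table and the interface table separate means a new penalty type can be added by registering one handler and one level, without touching the selection logic.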

In an extended embodiment, the method further comprises the following steps:

responding to remote calling of the first service interface, pushing a rule base editing page to corresponding terminal equipment, wherein the page comprises classification dictionary list information and penalty type list information;

acquiring one or more filter rules submitted based on the rule base editing page, and storing the filter rules in the rule base;

responding to the remote call of the second service interface, pushing a classification dictionary editing page to corresponding terminal equipment, wherein the page contains classification dictionary list information;

acquiring one or more entries submitted based on the classification dictionary editing page and a classification dictionary specified from the classification dictionary list information, and storing the entries in the classification dictionary.
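The entry-submission step handled behind the second service interface could be backed by a store like the one sketched here. The class, its method names and the in-memory storage are assumptions for illustration; the page rendering and remote-call layers are omitted.

```python
class DictionaryStore:
    """In-memory stand-in for the shared classification dictionaries."""

    def __init__(self, names):
        self._dictionaries = {name: set() for name in names}

    def list_dictionaries(self):
        """Classification dictionary list information shown on the editing page."""
        return sorted(self._dictionaries)

    def add_entries(self, dictionary, entries):
        """Store submitted entries in the specified dictionary; return how many were new."""
        if dictionary not in self._dictionaries:
            raise KeyError(f"unknown classification dictionary: {dictionary}")
        cleaned = {entry.strip() for entry in entries if entry.strip()}
        new_entries = cleaned - self._dictionaries[dictionary]
        self._dictionaries[dictionary] |= new_entries
        return len(new_entries)

    def entries(self, dictionary):
        return sorted(self._dictionaries[dictionary])
```

Because every application program shares the same store, an entry submitted twice is kept only once, which avoids the duplicated word libraries described in the Background.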

A live broadcast platform information security detection device, provided to suit one of the purposes of the application, includes a message acquisition module, a rule matching module, a security detection module and a penalty processing module. The message acquisition module is used for receiving a public release message submitted by an application program pre-agreed with a live broadcast platform, and the public release message comprises characteristic information of the application program, a release user and a public release text; the rule matching module is used for calling a filtering rule corresponding to the characteristic information of the application program from a preset rule base, wherein the filtering rule comprises the characteristic information of the application program, the classification dictionary designation information adopted by the application program and the classification dictionary penalty type; the security detection module is used for filtering the public release text according to the filtering rule, and determining one or more corresponding penalty types when keywords in the public release text hit entries of one or more classification dictionaries specified by the designation information; and the penalty processing module is used for calling the penalty processing interface corresponding to the highest-level penalty type among the determined penalty types to process the public release message and return a corresponding notification message to the release user of the application program.

In an embodiment, the message obtaining module includes: the concurrent receiving submodule is used for concurrently receiving the message publishing request of the application program pre-agreed with the live broadcast platform and orderly adding the message publishing request to the request queue; the request dequeuing submodule is used for controlling the request queue to flow out the message publishing request according to a preset dequeuing rule and starting a consumption thread corresponding to the message publishing request; and the message consumption sub-module is used for analyzing the message publishing request by the consumption thread to obtain the public publishing message in the message publishing request.

In an embodiment, the security detection module includes: the entity recognition submodule is used for calling a named entity recognition model which is trained to be in a convergence state in advance to extract a plurality of named entities from the public release text; the dictionary matching submodule is used for matching the named entities with each preset classification dictionary, determining each named entity which hits any entry of any one of the classification dictionaries as a hit keyword obtained by word segmentation of the public release text, and obtaining mapping relation data between the hit keywords and the classification dictionaries; and the penalty determination submodule is used for determining the corresponding penalty type according to the filtering rule corresponding to the classification dictionary in the mapping relation data between the hit keyword and the classification dictionary.

In a further embodiment, the named entity recognition model is configured to operate as a control device, the control device comprising: the coding processing unit is used for vectorizing the public release text to obtain corresponding coding information; the characteristic extraction unit is used for extracting the characteristics of the coded information to obtain a text characteristic vector representing deep semantic information of the coded information; and the entity extraction unit is used for carrying out sequence marking on the publicly issued text according to the text characteristic vector and extracting each named entity in the publicly issued text.

In an embodiment, the penalty processing module includes: the penalty preference submodule is used for comparing a plurality of penalty types and determining the penalty type with the highest level; the interface calling submodule is used for calling and running the penalty processing interface corresponding to the highest-level penalty type; the interface response submodule is used for processing the public release text by the penalty processing interface according to preset service logic, wherein the processing comprises masking the keywords of the hit entries or deleting the full text of the public release text, and returning a processing result; and the notification delivery submodule is used for generating a corresponding notification message according to the processing result and sending the notification message to the instant messaging interface of the release user.

In an extended embodiment, the live broadcast platform information security detection device of the application further includes: the first service module is used for responding to the remote call of the first service interface and pushing a rule base editing page to the corresponding terminal equipment, and the page contains the classification dictionary list information and the penalty type list information; the first updating module is used for acquiring one or more filtering rules submitted based on the rule base editing page and storing the filtering rules in the rule base; the second service module is used for responding to the remote call of the second service interface and pushing a classification dictionary editing page to the corresponding terminal equipment, and the page contains classification dictionary list information; and the second updating module is used for acquiring one or more entries submitted based on the classification dictionary editing page, together with a classification dictionary specified from the classification dictionary list information, and storing the entries in that classification dictionary.

The computer device comprises a central processing unit and a memory, wherein the central processing unit is used for calling and running a computer program stored in the memory to execute the steps of the live broadcast platform information security detection method.

A computer-readable storage medium is provided, which stores, in the form of computer-readable instructions, a computer program implemented according to the method for detecting information security of a live broadcast platform, where the computer program is invoked by a computer to execute the steps included in the method.

A computer program product, provided to adapt to another object of the present application, comprises computer programs/instructions which, when executed by a processor, implement the steps of the method described in any of the embodiments of the present application.

Compared with the prior art, the application has the following advantages: messages issued by users in the application programs of the live broadcast platform are processed through multiple cooperating steps with a clear division of labor. User-issued messages from the various application programs pre-agreed with the live broadcast platform are received uniformly; the filtering rules corresponding to each application program are called from a unified rule base; the unified classification dictionaries are matched according to those filtering rules to determine the corresponding penalty types; and the penalty types are then collected and handled centrally by a unified mechanism. The processing steps, filtering rules, classification dictionaries, penalty processing mechanisms and the like are independent of one another yet work together, achieving logical decoupling and cooperation. This allows the live broadcast platform to offer a flexible, centralized information security detection service, improves its information security detection capability, and safeguards information security in the platform ecosystem.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a schematic flowchart of an exemplary embodiment of a live broadcast platform information security detection method according to the present application;

FIG. 2 is a flowchart illustrating a process of processing concurrent message publishing requests through a request queue according to an embodiment of the present application;

FIG. 3 is a flowchart illustrating a process for determining a penalty type using a named entity recognition model in an embodiment of the present application;

FIG. 4 is a schematic diagram of a network architecture of an exemplary named entity recognition model of the present application;

FIG. 5 is a flow diagram illustrating an exemplary named entity recognition model recognition process according to the present application;

FIG. 6 is a flowchart illustrating a process of performing a penalty process according to a penalty type in an embodiment of the present application;

FIG. 7 is a flowchart illustrating a process of opening filtering rules and classification dictionary maintenance functions through two service interfaces in an embodiment of the present application;

FIG. 8 is a schematic block diagram of a live broadcast platform information security detection apparatus according to the present application;

FIG. 9 is a schematic structural diagram of a computer device used in the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.

It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

As will be appreciated by those skilled in the art, "client," "terminal," and "terminal device" as used herein include both devices that are wireless signal receivers, which are devices having only wireless signal receivers without transmit capability, and devices that are receive and transmit hardware, which have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: cellular or other communication devices such as personal computers, tablets, etc. having single or multi-line displays or cellular or other communication devices without multi-line displays; PCS (Personal Communications Service), which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client," "terminal device" can be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. The "client", "terminal Device" used herein may also be a communication terminal, a web terminal, a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a Mobile phone with music/video playing function, and may also be a smart tv, a set-top box, and the like.

The hardware referred to by the names "server", "client", "service node", etc. is essentially an electronic device with the performance of a personal computer, and is a hardware device having necessary components disclosed by the von neumann principle such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, an output device, etc., a computer program is stored in the memory, and the central processing unit calls a program stored in an external memory into the internal memory to run, executes instructions in the program, and interacts with the input and output devices, thereby completing a specific function.

It should be noted that the concept of "server" as referred to in this application can be extended to the case of a server cluster. According to the network deployment principle understood by those skilled in the art, the servers should be logically divided, and in physical space, the servers may be independent from each other but can be called through an interface, or may be integrated into one physical computer or a set of computer clusters. Those skilled in the art will appreciate this variation and should not be so limited as to restrict the implementation of the network deployment of the present application.

One or more technical features of the present application, unless expressly specified otherwise, may be deployed on a server and accessed by a client remotely invoking an online service interface provided by the server, or may be deployed and run directly on the client.

Unless expressly specified otherwise, a neural network model referred to or possibly referred to in the application can be deployed on a remote server and called remotely from a client, or can be deployed on a client with sufficient device capability and called directly.

Unless expressly specified otherwise, the various data referred to in the present application may be stored remotely on a server or locally on a terminal device, as long as the data can be called by the technical solution of the present application.

The person skilled in the art will know this: although the various methods of the present application are described based on the same concept so as to be common to each other, they may be independently performed unless otherwise specified. In the same way, for each embodiment disclosed in the present application, it is proposed based on the same inventive concept, and therefore, concepts of the same expression and concepts of which expressions are different but are appropriately changed only for convenience should be equally understood.

The embodiments to be disclosed herein can be flexibly constructed by cross-linking related technical features of the embodiments unless the mutual exclusion relationship between the related technical features is stated in the clear text, as long as the combination does not depart from the inventive spirit of the present application and can meet the needs of the prior art or solve the deficiencies of the prior art. Those skilled in the art will appreciate variations therefrom.

The live broadcast platform information security detection method of the application can be programmed as a computer program product and deployed on a server to run, so that the method can be executed by accessing the interface opened by the running computer program product and interacting with its process through a graphical user interface.

Referring to fig. 1, in an exemplary embodiment of the method for detecting information security of a live broadcast platform, the method includes the following steps:

step S1100, receiving a public release message submitted by an application program pre-agreed with a live broadcast platform, wherein the public release message comprises characteristic information of the application program, a release user and a public release text:

in an exemplary application scenario, a computer program product implemented according to the technical solution of the present application is deployed on a server in the server cluster of a live streaming platform and provides open services, under a pre-agreed protocol, to a large number of terminal application programs operated by the platform. These application programs serve different businesses, but each can pass the public release messages submitted from its terminal devices through to the server in accordance with the protocol. In short, any application program that can submit public release messages to the server of the present application and receive the corresponding feedback according to the specification of the present application can be regarded as an application program pre-agreed with the live broadcast platform.

For example, in a live broadcast application program, text entered by the user of a client device in the chat interface of a live broadcast room serves as the public release text; after processing by the client's service logic it is transmitted to the server, where the service logic implemented by the technical solution of the present application performs the subsequent processing. An application program pre-agreed with the live broadcast platform and running on a client device encapsulates the user's input as a public release message comprising the characteristic information of the current application program (AppID), the release user (which may be represented by the current user's user ID), and the public release text containing the user's input. The message can then be submitted directly to the server of the present application, or forwarded to it through other background servers that support the business logic of the current application program.

For another example, in an application program that is pre-agreed with the live broadcast platform and provides a short video service, a user of the client device enters a comment in the comment page of a short video played by the application program. The comment text can likewise be used as the publicly published text and encapsulated in a public release message, which is transmitted directly or indirectly to the server of the present application for processing.

The server of the present application can respond concurrently to the public release messages submitted by the various pre-agreed application programs, and can thus handle the massive volume of public release messages that these application programs generate.

Step S1200, invoking a filtering rule corresponding to the feature information of the application program from a preset rule base, where the filtering rule includes the feature information of the application program, the classification dictionary designation information adopted by the application program, and a classification dictionary penalty type:

in order to serve for information security detection of mass publicly-published messages generated by a large number of application programs, a rule base is pre-constructed for storing filtering rules which can be edited by a background, and the filtering rules can be used for executing keyword filtering of publicly-published texts in the publicly-published messages so as to ensure information security.

To work with the filtering rules, the present application also pre-constructs a plurality of classification dictionaries, each of which stores the keywords corresponding to one security attribute. The security attributes include 'pornography', 'violence', 'abuse', 'vulgarity', and the like, all of which are preset to ensure the healthy development of the information ecology of the live broadcast platform.

The filtering rules and the classification dictionaries are configured uniformly and serve, in a centralized manner, the information security detection of all application programs pre-agreed with the live broadcast platform. For each application program, the classification dictionaries it uses and the corresponding penalty types can be configured in advance, and one or more filtering rules associated with the application program can be constructed. Each filtering rule may accordingly include the feature information of an application program, the classification dictionary designation information that identifies a classification dictionary used by that application program, and the penalty type that applies when a published text submitted by the application program hits that classification dictionary. The penalty type defines the processing action to be performed on the corresponding published message after the hit. These elements are organized as mapping relationship data and stored in the rule base for later retrieval.

Those skilled in the art can construct the rule base according to the above-disclosed construction principle, and provide a background management page accessing the filter rules in the rule base, so as to edit and update the filter rules in the rule base through the background management page, or delete one or more of the filter rules, or add a new filter rule to the rule base. For example, when a pre-agreed application program is newly added to the live broadcast platform, the corresponding filtering rule for the application program can be newly added to the rule base.

When information security detection is to be performed on the publicly released text in any public release message, the filtering rules containing the characteristic information of the application program carried in that message are first looked up in the rule base. One application program may correspond to a plurality of filtering rules, each designating a different classification dictionary; these filtering rules are called out one by one and processed together, so that keyword-based information filtering is performed on the publicly released text according to them.
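As a minimal sketch of such a rule base, the mapping relationship data can be held in memory and indexed by the application's characteristic information. The class and field names (`FilterRule`, `RuleBase`, `rules_for`) and the sample values are hypothetical; the description does not prescribe a storage format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FilterRule:
    app_id: str        # characteristic information of the application program
    dictionary: str    # classification dictionary designation information
    penalty_type: str  # penalty applied when that dictionary is hit


class RuleBase:
    """Hypothetical in-memory rule base indexed by application ID."""

    def __init__(self) -> None:
        self._rules: Dict[str, List[FilterRule]] = {}

    def add(self, rule: FilterRule) -> None:
        self._rules.setdefault(rule.app_id, []).append(rule)

    def rules_for(self, app_id: str) -> List[FilterRule]:
        # An application may have several rules, one per designated dictionary.
        return list(self._rules.get(app_id, []))


rule_base = RuleBase()
rule_base.add(FilterRule("live_chat", "abuse", "mask"))
rule_base.add(FilterRule("live_chat", "violence", "intercept"))
rule_base.add(FilterRule("short_video", "abuse", "intercept"))
```

Looking up `rule_base.rules_for("live_chat")` returns both rules configured for that application, while an unknown application ID yields an empty list.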

Step S1300, filtering the publicly released text according to the filtering rule, and, when keywords in the publicly released text hit entries of one or more classification dictionaries specified by the designation information, determining the one or more penalty types corresponding to those hits:

as described above, each filtering rule includes classification dictionary designation information, which identifies the classification dictionary the rule uses. After the corresponding classification dictionary is determined from the designation information, the publicly released text is checked against that dictionary to detect whether it contains any one or more of the dictionary's entries, that is, whether one or more keywords in the text hit the classification dictionary identified by the designation information. If so, the publicly released text contains violation information of the security attribute category corresponding to that classification dictionary, and a penalty must be applied. Since the penalty is represented in the filtering rule as a penalty type, the corresponding interface is subsequently called according to the penalty type to execute the penalty action on the public release message.

Detecting, for each filtering rule, whether the publicly released text hits an entry of the designated classification dictionary can follow the principle of exact matching, fuzzy matching, semantic matching, or the like. Technically, it can be implemented with regular expression matching or with a neural-network-based keyword extraction technique from natural language processing, and can therefore be adapted flexibly by those skilled in the art.

For a plurality of filtering rules corresponding to the same application program, after filtering is performed according to each filtering rule, if at least some of the filtering rules yield a penalty type, these penalty types are passed together to the next step for centralized processing, which decouples the filtering logic from the processing logic while allowing the two to cooperate.
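The filtering step can be illustrated with the simplest of the matching principles mentioned above, exact matching. In this sketch the dictionaries, the rules, and the function name `filter_text` are all hypothetical, and matching is done by case-insensitive substring search, which is an assumption of the example rather than something the description requires.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical classification dictionaries: security attribute -> entries.
DICTIONARIES: Dict[str, Set[str]] = {
    "abuse": {"idiot", "moron"},
    "violence": {"stab"},
}

# Hypothetical filtering rules of one application: (dictionary name, penalty type).
RULES: List[Tuple[str, str]] = [("abuse", "mask"), ("violence", "intercept")]


def filter_text(text: str, rules: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return penalty type -> hit keywords, using exact substring matching."""
    lowered = text.lower()
    hits: Dict[str, List[str]] = {}
    for dictionary_name, penalty_type in rules:
        entries = DICTIONARIES.get(dictionary_name, set())
        matched = sorted(entry for entry in entries if entry in lowered)
        if matched:
            # Penalty types from all rules are collected for the next step.
            hits.setdefault(penalty_type, []).extend(matched)
    return hits
```

Plain substring search also fires inside longer words (for example "stab" inside "stable"), which is one reason the description leaves room for fuzzy or semantic matching instead.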

Step S1400, invoking a penalty processing interface corresponding to the highest level penalty type in the determined penalty types to process the public release message, and returning a corresponding notification message to the release user of the application program:

a plurality of penalty types are preset, each corresponding to a different penalty level. In the present application the same service logic is implemented uniformly: of the penalty types output by the previous step, only the one with the highest level is applied. That is, even if a public release message triggers several penalties, only the penalty with the highest level is executed. From low to high, the levels can be, for example, 'mask', 'intercept', 'deduct', and 'penalty'. For the 'mask' item, the keywords that triggered the penalty can be directly replaced with mask characters; for the 'intercept' item, the public release message is filtered out directly and is not pushed to the corresponding release address or page; for the 'deduct' item, a preset number of points is deducted from the personal points account of the publishing user; for the 'penalty' item, the publishing user's permission to publish information can be modified. Other variations may be implemented flexibly by those skilled in the art.

The penalty service logic corresponding to each penalty type is implemented in advance as a corresponding penalty processing interface. When a specific penalty type is to be applied, the penalty processing interface corresponding to that penalty type is called directly, and it executes the corresponding penalty service logic against the initiator of the public release message, namely the publishing user. When the called penalty processing interface finishes processing, a corresponding notification message can be constructed and returned to the publishing user as a warning.
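Selecting the highest-level penalty and dispatching it to its processing interface could look like the sketch below. The numeric levels, the handler functions, and the registry `PENALTY_INTERFACES` are assumptions for illustration; the handlers only return a description of the action instead of performing it.

```python
from typing import Callable, Dict, Iterable

# Assumed ordering from lowest to highest level, following the examples above.
PENALTY_LEVELS: Dict[str, int] = {"mask": 1, "intercept": 2, "deduct": 3, "penalty": 4}


def handle_mask(user_id: str) -> str:
    return f"{user_id}: keywords masked"


def handle_intercept(user_id: str) -> str:
    return f"{user_id}: message intercepted"


def handle_deduct(user_id: str) -> str:
    return f"{user_id}: points deducted"


def handle_penalty(user_id: str) -> str:
    return f"{user_id}: publishing permission modified"


# Hypothetical registry mapping each penalty type to its processing interface.
PENALTY_INTERFACES: Dict[str, Callable[[str], str]] = {
    "mask": handle_mask,
    "intercept": handle_intercept,
    "deduct": handle_deduct,
    "penalty": handle_penalty,
}


def highest_penalty(penalty_types: Iterable[str]) -> str:
    """Pick the penalty type with the highest level; raises ValueError if empty."""
    return max(penalty_types, key=PENALTY_LEVELS.__getitem__)


def apply_penalty(user_id: str, penalty_types: Iterable[str]) -> str:
    """Call only the interface of the highest-level penalty type."""
    return PENALTY_INTERFACES[highest_penalty(penalty_types)](user_id)
```

Because the filtering step only hands over penalty type names, new penalty types can be added by registering another handler without touching the filtering code.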

In the present application, the step S1300 of performing filtering may be implemented as a filtering engine of unified service, and each penalty processing interface of performing processing may also be implemented as each corresponding standardized service interface, so that not only the two processing logics are decoupled, but also the two processing logics are allowed to cooperate with each other through interface calling, thereby satisfying the purpose of the present application.

The exemplary embodiment shows that the method has a number of advantages. Specifically, the method processes user release messages from the application programs of the live broadcast platform through several cooperating steps: it uniformly receives the user release messages of the various application programs pre-agreed with the live broadcast platform, calls the filtering rules corresponding to each application program from a unified rule base, matches against the unified classification dictionaries designated by those filtering rules to determine the corresponding penalty types, and processes the penalty types centrally under a unified mechanism. The processing steps, the filtering rules, the classification dictionaries, and the penalty processing mechanism are independent of one another yet cooperate, which achieves logical decoupling with cooperation, allows the live broadcast platform to provide a flexible and centralized information security detection service, improves the information security detection capability of the live broadcast platform, and ensures information security within the ecology of the live broadcast platform.

Referring to fig. 2, in an embodiment, the step S1100 of receiving the public announcement message submitted by the application program pre-agreed with the live broadcast platform includes the following steps:

step S1110, concurrently receiving a message publishing request of an application pre-agreed with the live broadcast platform, and orderly adding the message publishing request to a request queue:

the server of the present application can construct a standardized receiving and response module that uniformly receives the message publishing requests submitted by the various application programs pre-agreed with the live broadcast platform, where each message publishing request encapsulates a public release message. To resolve the conflicts caused by concurrent requests, a request queue is also constructed so that a large number of message publishing requests from multiple sources are processed in order.

Step S1120, controlling the request queue to flow out the message publishing request according to a preset dequeuing rule, and starting a consuming thread corresponding to the message publishing request:

the request queue can apply a first-in first-out rule, so that the message publishing request that was enqueued first is dequeued first. Each message publishing request in the queue thus flows out of the queue in turn, and a consumption thread corresponding to the message publishing request is started to consume it.

Step S1130, the consuming thread analyzes the message publishing request to obtain the public publishing message therein:

the consumption thread is firstly responsible for analyzing the corresponding message publishing request to obtain the public publishing message therein and further obtain the public publishing text in the public publishing message.

This embodiment implements a standardized concurrent response mechanism that serves the message publishing requests submitted by the massive number of publishing users of the many pre-agreed application programs in the live broadcast platform, so that multi-source information is processed in an orderly, parallel manner, which helps achieve unified and centralized detection of information across the whole live broadcast platform. The application programs do not need to build their own information security detection logic, their own dedicated dictionaries, or their own penalty handling; the platform only needs to respond to the requests uniformly and perform information security detection according to the filtering rules in order to provide a centralized, unified, platform-wide information security detection service.
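Steps S1110 to S1130 can be sketched with Python's standard `queue` and `threading` modules. Note one deliberate substitution: the description starts a consumption thread corresponding to each dequeued request, whereas this sketch uses a small fixed pool of consumer threads reading from the same first-in first-out queue. The function name `process_requests`, the JSON request format, and the `text` field are assumptions for the example.

```python
import json
import queue
import threading
from typing import Iterable, List


def process_requests(requests: Iterable[str], workers: int = 2) -> List[str]:
    """Enqueue requests in order and let consumer threads parse them."""
    request_queue: queue.Queue = queue.Queue()  # queue.Queue is first-in first-out
    texts: List[str] = []
    lock = threading.Lock()

    def consume() -> None:
        while True:
            request = request_queue.get()
            if request is None:  # sentinel: no more requests for this thread
                break
            message = json.loads(request)  # parse the message publishing request
            with lock:
                texts.append(message["text"])  # extract the publicly published text

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for request in requests:
        request_queue.put(request)
    for _ in threads:
        request_queue.put(None)  # one sentinel per consumer thread
    for thread in threads:
        thread.join()
    return texts
```

Requests leave the queue in arrival order, but with several consumer threads the order in which results are appended is not guaranteed, so callers should not rely on it.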

Referring to fig. 3, in an embodiment, the step S1300 of filtering the publication text according to the filtering rule includes the following steps:

step S1310, invoking a named entity recognition model trained to a convergence state in advance to extract a plurality of named entities from the published text:

in the embodiment, considering the existence of the full-platform mass public release message, in order to improve the detection accuracy and the detection efficiency, a named entity recognition model based on a neural network architecture is adopted to implement the filtering of the public release text according to the filtering rule.

Specifically, the named entity recognition model can adopt the network architecture shown in fig. 4, which consists of a text feature extraction network and a conditional random field network. The text feature extraction network can use a base model such as LSTM (long short-term memory), Lattice LSTM, Transformer, or BERT, and extracts features from the embedded vector of the publicly released text that is input to it, so as to obtain the corresponding deep semantic information. The conditional random field network, namely the CRF network, performs part-of-speech tagging on the publicly released text based on the deep semantic information, so that a plurality of named entities, that is, the keywords to be matched, are obtained from the tagging results.

The named entity recognition model is trained to be in a convergence state in advance, so that the named entity recognition model has the capability of extracting a plurality of named entities from the input public published text, the extracted named entities serve as keywords to be matched, keyword matching is carried out on the extracted named entities and a classification dictionary determined according to the filtering rule, a matching result is obtained, and a penalty rule is applied according to the matching result.

Step S1320, matching the named entity with each preset classification dictionary, determining the named entity hitting any entry of any one of the classification dictionaries as a hit keyword obtained by publicly releasing text segmentation, and obtaining mapping relationship data between the hit keyword and the classification dictionary:

referring to fig. 4, the plurality of named entities identified by the conditional random field model are matched one by one against the classification dictionaries designated by the filtering rules that correspond to the application program specified in the public release message; exact matching is generally used here.

Step S1330, determining a penalty type corresponding to the filtering rule corresponding to the classification dictionary in the mapping relationship data between the hit keyword and the classification dictionary:

since the application program submitting the publicly released text may be configured with a plurality of filtering rules, each of which may specify a different classification dictionary and a different penalty type, different keywords in the previous step may hit different classification dictionaries corresponding to different filtering rules. A plurality of pieces of mapping relationship data may therefore be obtained, corresponding to a plurality of different classification dictionaries and, likewise, to different penalty types. The set of these penalty types is obtained so that one of them can be selected in subsequent processing.

This embodiment is based on the principle of semantic matching. A named entity recognition model based on a neural network architecture segments the publicly released text to obtain its named entities, which are then used as candidate keywords and matched against the preset classification dictionaries; the penalty types corresponding to the classification dictionaries whose entries match the candidate keywords are then extracted. Because whether a keyword constitutes a violation is recognized on the basis of deep semantic information, the degree of intelligence of information security detection can be raised and its efficiency improved.

Referring to fig. 5, in a further embodiment, the named entity recognition model performs the following steps:

step S2100, vectorizing the publication text to obtain corresponding encoding information:

this step encodes the publicly released text to vectorize it and obtain a corresponding embedded vector. The prior art offers many techniques for encoding text, which those skilled in the art can apply flexibly. After encoding, the embedded vector contains encoding information for each character of the publicly released text, where the encoding information of a character contains the vector of the character itself and the word vectors of all candidate word segmentations that involve the character.

Step S2200, extracting the characteristics of the coded information to obtain a text characteristic vector representing the deep semantic information:

the text feature extraction module is preferably implemented by using Lattice LSTM, and the module refers to context to perform representation learning on the embedded vector of the public release text obtained by pre-coding to obtain a corresponding text feature vector.

Step S2300, carrying out sequence annotation on the publicly released text according to the text feature vector, and extracting each named entity:

and inputting the text feature vector into a conditional random field module (CRF) for part-of-speech tagging, predicting by combining a probability matrix output by Lattice LSTM and a state transition matrix of CRF under the action of the conditional random field module to finish part-of-speech tagging, and extracting a plurality of named entities in the public release text according to part-of-speech tagging results.
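The prediction that combines the per-position probability matrix with the CRF state transition matrix is commonly done with Viterbi decoding, sketched below in plain Python. The description does not name the decoding algorithm, so Viterbi is an assumption here, as are the three-tag scheme (`O`, `B`, `I`), the hand-written scores standing in for the output of the feature extractor, and the helper names.

```python
from typing import List

TAGS = ["O", "B", "I"]  # assumed tag scheme: outside, entity begin, entity inside


def viterbi(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at position t (from the feature extractor).
    transitions[i][j]: score of moving from tag i to tag j (from the CRF).
    """
    num_tags = len(transitions)
    scores = list(emissions[0])
    backpointers: List[List[int]] = []
    for step in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(num_tags):
            best_i = max(range(num_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + step[j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    best = max(range(num_tags), key=lambda j: scores[j])
    path = [best]
    for pointers in reversed(backpointers):  # walk back from the last position
        best = pointers[best]
        path.append(best)
    return path[::-1]


def extract_entities(text: str, tag_ids: List[int]) -> List[str]:
    """Collect the character spans tagged B followed by zero or more I."""
    entities: List[str] = []
    current = ""
    for char, tag_id in zip(text, tag_ids):
        tag = TAGS[tag_id]
        if tag == "B":
            if current:
                entities.append(current)
            current = char
        elif tag == "I" and current:
            current += char
        else:
            if current:
                entities.append(current)
            current = ""
    if current:
        entities.append(current)
    return entities


# Toy scores for the four characters of "abcd"; moving from O to I is penalized.
emissions = [[1.0, 0.0, 0.0], [0.0, 0.5, 1.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
transitions = [[0.0, 0.0, -100.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
tags = viterbi(emissions, transitions)
entities = extract_entities("abcd", tags)
```

In the toy data the second character's emission score alone favors `I`, but the transition penalty from `O` to `I` makes `B` the better choice there, which illustrates how the transition matrix corrects per-position predictions.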

In this embodiment, the LSTM can also be replaced by a Transformer-based model such as BERT. In addition, although these models can perform part-of-speech tagging on their own, combining them with a conditional random field can significantly improve the accuracy of named entity extraction, so the combination is recommended.

In the embodiment, with the help of a specific neural network architecture, on the basis of vectorizing the open publication text to obtain the effectively-represented embedded vector, the accuracy of deep semantic information extraction and named entity identification is improved, so that the total amount of data samples required by the training process of the corresponding named entity identification model can be reduced, the model is easier to be trained to a convergence state, the model training efficiency is improved, and the model training cost is saved.

Referring to fig. 6, in a specific embodiment, the step S1400 of invoking a penalty processing interface corresponding to the highest-level penalty type among the determined penalty types to process the public release message and return a corresponding notification message to the publishing user of the application program includes the following steps:

step S1410, comparing a plurality of penalty types, and determining the penalty type with the highest grade:

each embodiment of step S1300 finally outputs a set of penalty types. The penalty types are assigned priorities, that is, penalty levels, in advance. According to a default rule, when a plurality of penalty types compete with one another, the penalty type with the highest level is determined to be the one that is finally executed, so that the same public release message is not penalized more than once.

Step S1420, invoking and running the penalty processing interface corresponding to the highest-level penalty type:

as described above, each penalty type is preset with a corresponding penalty processing interface, so that a processing service logic corresponding to the penalty type is realized.

Step S1430, the penalty processing interface processes the publicly issued text according to the preset service logic, including shielding the keywords of the hit entry or deleting the full text of the publicly issued text, and returning the processing result:

the penalty processing interface processes the publicly released text according to the preset service logic; for example, it may mask the keywords that hit dictionary entries or delete the full text of the publicly released text, and then return the corresponding processing result. In this example, unlike in other embodiments, the information publishing permission of the publishing user is not restricted, which simplifies the service logic of penalty processing and improves the efficiency of information security detection.

Step S1440, generating a corresponding notification message according to the processing result, and sending the notification message to the instant messaging interface of the publishing user:

according to the processing result returned by the penalty processing interface, a notification message can be generated and sent to the instant messaging interface of the publishing user who submitted the publicly released text. In a live broadcast room, for example, this can be the message notification interface corresponding to that user, so that the notification message is displayed in the message notification panel of the publishing user's application program interface, where it can be read.

This embodiment simplifies the service logic of penalty processing: when a plurality of keywords hit a plurality of classification dictionaries, only one penalty is selected. This simplifies the implementation logic of the whole system, eases development and implementation, and can improve the efficiency of software engineering development.
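Steps S1430 and S1440 can be sketched as a small processing function that either masks the hit keywords or deletes the full text, followed by a function that turns the processing result into a notification. The result format, the action names, and the notification wording are hypothetical and chosen only for the example.

```python
from typing import Dict, Iterable, Optional


def mask_keywords(text: str, keywords: Iterable[str], mask_char: str = "*") -> str:
    """Replace each occurrence of a hit keyword with mask characters (case-sensitive)."""
    # Longer keywords go first so that a keyword containing another is masked whole.
    for keyword in sorted(keywords, key=len, reverse=True):
        text = text.replace(keyword, mask_char * len(keyword))
    return text


def process_text(text: str, keywords: Iterable[str], action: str) -> Dict[str, Optional[str]]:
    """Hypothetical penalty processing interface returning a processing result."""
    if action == "mask":
        return {"action": "mask", "text": mask_keywords(text, keywords)}
    if action == "delete":
        return {"action": "delete", "text": None}
    raise ValueError(f"unsupported action: {action}")


def build_notification(result: Dict[str, Optional[str]]) -> str:
    """Turn a processing result into a notification for the publishing user."""
    if result["action"] == "mask":
        return "Some words in your message were masked: " + str(result["text"])
    return "Your message was removed because it violated platform rules."
```

How the notification is delivered to the user's instant messaging interface depends on the platform's messaging service and is left out of the sketch.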

Referring to fig. 7, in an extended embodiment, in order to meet the need for specialized division of labor and to strengthen the decoupling of the service logics implemented in the present application, the live broadcast platform information security detection method of the present application further includes the following steps:

step S3100, responding to the remote call of the first service interface, pushing a rule base editing page to the corresponding terminal equipment, wherein the page comprises classification dictionary list information and penalty type list information:

to make it easy for a first management user, who is dedicated to maintaining the rule base, to maintain the filtering rules in the rule base centrally, a first service interface is constructed. After the first management user remotely calls the first service interface by accessing a page, the rule base editing page pushed by the server to the terminal device is obtained. In the editing page, the data items of each filtering rule stored in the rule base can be displayed in list form, that is, the application program characteristic information, the classification dictionary designation information, the classification dictionary penalty type, and so on of each filtering rule are displayed; at the same time, the first management user is allowed to add, modify, and delete the filtering rules in the list. A drop-down list can be provided for the classification dictionary designation information, listing the classification dictionary list information of each preset classification dictionary, so that the management user can quickly select a classification dictionary. Similarly, a drop-down list can be provided for the penalty type data item to display the various penalty types, thereby providing the penalty type list information.

Step S3200, obtaining one or more filtering rules submitted based on the rule base editing page, and storing the filtering rules in the rule base:

after the first management user edits one or more filtering rules in the rule base editing page, including adding, deleting, or modifying them, a submission instruction can be triggered to submit the edits to the server of the present application, which stores the submitted filtering rules in the rule base.

Step S3300, responding to the remote call to the second service interface, pushing a classification dictionary editing page to the corresponding terminal equipment, wherein the page contains the classification dictionary list information:

to make it easy for a second management user, who is dedicated to maintaining the violation entries, to maintain the entries in each classification dictionary centrally, a second service interface is constructed. After the second management user remotely calls the second service interface by accessing a page, the classification dictionary editing page pushed by the server to the terminal device is obtained. Because there are a plurality of classification dictionaries, a drop-down list can be displayed in the editing page to provide the classification dictionary list information. After the user selects one of the classification dictionaries, the entries in the selected classification dictionary are listed, and the second management user is allowed to add, modify, and delete entries in order to maintain them.

Step S3400, acquiring one or more entries submitted based on the classification dictionary editing page and a classification dictionary specified from the classification dictionary list information, and storing the entries in the classification dictionary:

after the second management user edits one or more entries of a classification dictionary in the classification dictionary editing page, including adding, deleting, or modifying them, a submission instruction can be triggered to submit the edits to the server of the present application, which stores the submitted entries in the corresponding classification dictionary to complete the update of that classification dictionary.

It should be noted that steps S3100 and S3200 on the one hand, and steps S3300 and S3400 on the other, may be executed concurrently and are not limited by the exemplary ordering of this embodiment, so that the two service interfaces can serve management users with different responsibilities in parallel.
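The separation between the two maintenance entry points can be sketched as two independent services, one for filtering rules and one for dictionary entries. The class names, the tuple layout of a rule, and the `submit` signatures are hypothetical; page rendering and remote invocation are omitted.

```python
from typing import Dict, Iterable, List, Set, Tuple

Rule = Tuple[str, str, str]  # (app_id, dictionary name, penalty type)


class RuleBaseService:
    """Stand-in for the first service interface: maintains filtering rules."""

    def __init__(self) -> None:
        self.rules: Set[Rule] = set()

    def submit(self, added: Iterable[Rule] = (), removed: Iterable[Rule] = ()) -> None:
        self.rules.difference_update(removed)
        self.rules.update(added)


class DictionaryService:
    """Stand-in for the second service interface: maintains dictionary entries."""

    def __init__(self, names: Iterable[str]) -> None:
        self.dictionaries: Dict[str, Set[str]] = {name: set() for name in names}

    def list_dictionaries(self) -> List[str]:
        # Backs the drop-down list of classification dictionaries.
        return sorted(self.dictionaries)

    def submit(self, name: str, added: Iterable[str] = (), removed: Iterable[str] = ()) -> None:
        entries = self.dictionaries[name]  # KeyError if the dictionary is unknown
        entries.difference_update(removed)
        entries.update(added)


rules = RuleBaseService()
dictionaries = DictionaryService(["abuse", "violence"])
rules.submit(added=[("live_chat", "abuse", "mask")])
dictionaries.submit("abuse", added=["idiot", "moron"])
dictionaries.submit("abuse", removed=["moron"])
```

Since the two services share no state, a rule maintainer and a dictionary maintainer can work in parallel; a real deployment would still need to guard each store against concurrent writes.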

In the embodiment, different inlets are provided for different data through two service interfaces with different division of labor, so as to respectively maintain the filtering rules and the classification dictionary in a specialized manner, and for a large-scale platform, the business logic of division of labor processing is necessary.

Referring to fig. 8, a live broadcast platform information security detection apparatus provided in accordance with the purpose of the present application includes a message acquisition module 1100, a rule matching module 1200, a security detection module 1300, and a penalty processing module 1400. The message acquisition module 1100 is configured to receive a public release message submitted by an application program pre-agreed with a live broadcast platform, where the public release message includes feature information of the application program, a release user, and a public release text; the rule matching module 1200 is configured to invoke a filtering rule corresponding to the feature information of the application program from a preset rule base, where the filtering rule includes the feature information of the application program, the classification dictionary designation information used by the application program, and a classification dictionary penalty type; the security detection module 1300 is configured to filter the publicly released text according to the filtering rule and, when keywords in the publicly released text hit entries of one or more classification dictionaries specified by the designation information, determine the one or more corresponding penalty types; the penalty processing module 1400 is configured to invoke a penalty processing interface corresponding to the highest-level penalty type among the determined penalty types to process the public release message, and return a corresponding notification message to the release user of the application program.

In an embodiment, the message obtaining module 1100 includes: the concurrent receiving submodule is used for concurrently receiving the message publishing request of the application program pre-agreed with the live broadcast platform and orderly adding the message publishing request to the request queue; the request dequeuing submodule is used for controlling the request queue to flow out the message publishing request according to a preset dequeuing rule and starting a consumption thread corresponding to the message publishing request; and the message consumption sub-module is used for analyzing the message publishing request by the consumption thread to obtain the public publishing message in the message publishing request.

In a specific embodiment, the security detection module 1300 includes: the entity recognition submodule, which is used for calling a named entity recognition model trained to a convergence state in advance to extract a plurality of named entities from the public release text; the dictionary matching submodule, which is used for matching the named entities with each preset classification dictionary, determining a named entity that hits any entry of any one of the classification dictionaries as a hit keyword obtained by segmenting the public release text, and obtaining mapping relation data between the hit keyword and the classification dictionary; and the penalty determination submodule, which is used for determining the corresponding penalty type according to the filtering rule corresponding to the classification dictionary in the mapping relation data between the hit keyword and the classification dictionary.

In an embodiment, the penalty processing module 1400 includes: the penalty selection submodule, which is used for comparing a plurality of penalty types and determining the penalty type with the highest level; the interface calling submodule, which is used for invoking and running the penalty processing interface corresponding to the highest-level penalty type; the interface response submodule, which is used for having the penalty processing interface process the publicly released text according to preset service logic, including masking the keywords that hit entries or deleting the full text of the publicly released text, and return a processing result; and the notification delivery submodule, which is used for generating a corresponding notification message according to the processing result and sending the notification message to the instant messaging interface of the publishing user.

To solve the above technical problem, an embodiment of the present application further provides a computer device. Fig. 9 schematically illustrates the internal structure of the computer device. The computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer-readable storage medium of the computer device stores an operating system, a database, and computer-readable instructions; the database can store control information sequences, and the computer-readable instructions, when executed by the processor, enable the processor to implement a live broadcast platform information security detection method. The processor of the computer device provides computing and control capability and supports the operation of the whole computer device. The memory of the computer device may store computer-readable instructions which, when executed by the processor, cause the processor to execute the live broadcast platform information security detection method of the present application. The network interface of the computer device is used for connecting to and communicating with a terminal. Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the present solution and does not limit the computer devices to which the present solution applies; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In this embodiment, the processor is configured to execute the specific functions of each module and its sub-modules in fig. 8, and the memory stores the program codes and various data required for executing the modules or sub-modules. The network interface is used for data transmission to and from a user terminal or a server. The memory in this embodiment stores the program codes and data required for executing all modules and sub-modules of the live broadcast platform information security detection apparatus of the present application, and the server can call these program codes and data to execute the functions of all sub-modules.

The present application further provides a storage medium storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the steps of the live platform information security detection method according to any embodiment of the present application.

The present application also provides a computer program product comprising computer programs/instructions which, when executed by one or more processors, implement the steps of the method as described in any of the embodiments of the present application.

Those skilled in the art will understand that all or part of the processes of the methods of the embodiments of the present application can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed, may carry out the processes of the method embodiments described above. The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).

In summary, the present application can construct an information security detection mechanism that serves the multiple application products of a live broadcast platform and performs information security detection on the user release messages submitted by each application. It allows flexible configuration and management of the resource data and processing rules involved in the information security detection process. In particular, it decouples the classification dictionaries from the filtering rules required for information security detection, which allows the two to be maintained by separate specialists, improves the flexibility of system maintenance, and helps maintain the information ecology of the live broadcast platform.

Those skilled in the art will appreciate that the various operations, methods, and steps in the processes, measures, or solutions discussed in this application can be interchanged, modified, combined, or deleted. Further, other steps, measures, or schemes in the various operations, methods, or flows discussed in this application can be interchanged, modified, rearranged, decomposed, combined, or deleted. Further, prior-art steps, measures, or schemes that involve the operations, methods, or procedures disclosed in the present application may also be interchanged, modified, rearranged, decomposed, combined, or deleted.

The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims

1. A live broadcast platform information security detection method is characterized by comprising the following steps:

2. The live platform information security detection method of claim 1, wherein the step of receiving the publicly-published message submitted by the application program pre-agreed with the live platform comprises the steps of:

3. The live broadcast platform information security detection method according to claim 1, wherein filtering the publicly released text according to the filtering rule includes the steps of:

4. The live broadcast platform information security detection method of claim 3, wherein the named entity recognition model performs the following steps:

vectorizing the public release text to obtain corresponding coding information;

5. The live broadcast platform information security detection method according to claim 1, wherein a penalty rule processing interface corresponding to a highest level penalty rule type among the determined penalty rule types is invoked to process the public release message, and a corresponding notification message is returned to the release user of the application program, including the steps of:

6. The live broadcast platform information security detection method according to any one of claims 1 to 5, characterized by further comprising the steps of:

7. A live broadcast platform information security detection device, characterized by comprising:

a message acquisition module, used for receiving a public release message submitted by an application program pre-agreed with a live broadcast platform, wherein the public release message comprises characteristic information of the application program, a release user and a public release text;

a rule matching module, used for calling a filtering rule corresponding to the characteristic information of the application program from a preset rule base, wherein the filtering rule comprises the characteristic information of the application program, classification dictionary designation information adopted by the application program and a classification dictionary penalty type;

a security detection module, used for filtering the public release text according to the filtering rule, and determining one or more corresponding penalty types when keywords in the public release text hit entries of one or more classification dictionaries specified by the designation information;

and a penalty processing module, used for calling a penalty processing interface corresponding to the penalty type of the highest grade among the determined penalty types to process the public release message, and returning a corresponding notification message to the release user of the application program.

8. A computer device comprising a central processor and a memory, characterized in that the central processor is configured to invoke and run a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 6.

9. A computer-readable storage medium, characterized in that it stores, in the form of computer-readable instructions, a computer program implemented according to the method of any one of claims 1 to 6, which, when invoked by a computer, performs the steps comprised by the corresponding method.

10. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the method as claimed in any one of claims 1 to 6.