CN113377956A - Method, device, electronic equipment and medium for predicting black product attack trend - Google Patents

Method, device, electronic equipment and medium for predicting black product attack trend

Info

Publication number
CN113377956A
Authority
CN
China
Prior art keywords
category
black
black product
analyzed
product information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110658165.8A
Other languages
Chinese (zh)
Inventor
吕博良
程佩哲
张�诚
金驰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110658165.8A
Publication of CN113377956A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a method for predicting black product attack trends, which can be applied to the technical field of information security. The method comprises: obtaining information published on at least one black product trading platform to obtain M pieces of raw black product intelligence; processing the raw black product intelligence into structured black product intelligence to be analyzed, wherein the M pieces of raw black product intelligence yield M corresponding pieces of black product intelligence to be analyzed; classifying the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category; and ranking the at least one black product intelligence category by the number of pieces of black product intelligence to be analyzed in each category, so as to predict the black product attack trend from the ranking result. The present disclosure also provides an apparatus, device, storage medium, and program product for predicting black product attack trends.

Description

Method, device, electronic equipment and medium for predicting black product attack trend
Technical Field
The present disclosure relates to the field of information security, and more particularly, to a method, apparatus, electronic device, medium, and program product for predicting black product attack tendency.
Background
Black products, i.e., the online underground (black) industry, include activities such as hacking, account theft, and phishing websites. With the rapid development of mobile internet technology, black products pose a serious threat to the security and business stability of information systems in every industry. At present, most threat perception and risk prevention measures focus on identifying and intercepting attack behavior. However, an attack or external threat can then be discovered and identified only once it has been carried out, usually after data or economic loss has already occurred.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a method, apparatus, device, medium, and program product for predicting black product attack trends, so that attacks can be guarded against before they are carried out.
According to a first aspect of the present disclosure, a method for predicting black product attack trends is provided. The method comprises: acquiring information published on at least one black product trading platform to obtain M pieces of raw black product intelligence, wherein M is an integer greater than 1; processing the raw black product intelligence into structured black product intelligence to be analyzed, wherein the M pieces of raw black product intelligence yield M corresponding pieces of black product intelligence to be analyzed; classifying the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category; and ranking the at least one black product intelligence category by the number of pieces of black product intelligence to be analyzed in each category, so as to predict the black product attack trend from the ranking result.
According to the embodiment of the disclosure, the classifying the M black production intelligence to be analyzed based on the predetermined rule to obtain at least one black production intelligence class includes: clustering M black product information to be analyzed to obtain S second categories, wherein S is an integer greater than or equal to 1; and obtaining the at least one black production intelligence category based on the S second categories.
According to an embodiment of the present disclosure, obtaining the at least one black product intelligence category based on the S second categories includes: classifying the M pieces of black product intelligence to be analyzed based on keywords set from prior experience to obtain N first categories, wherein N is an integer greater than 1; merging a first category and a second category whose similarity satisfies a threshold condition into a third category; and obtaining the at least one black product intelligence category based on the third category obtained by merging and on the remaining first and second categories that could not be merged.
According to an embodiment of the present disclosure, merging a first category and a second category whose similarity satisfies a threshold condition into a third category includes: acquiring the first keyword that was set for the first category; acquiring at least one second keyword corresponding to the second category, wherein the at least one second keyword is the at least one word that occurs most frequently in the second category; characterizing the similarity of the first category and the second category by the similarity between the first keyword and the at least one second keyword; and, when the similarity satisfies the threshold condition, merging the first category and the second category to obtain the third category.
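The merging step can be sketched in Python as follows. This is only an illustration: the disclosure does not name a similarity measure or a threshold value, so the use of `difflib.SequenceMatcher`, the threshold of 0.6, the function names, and the sample keywords are all assumptions.

```python
from difflib import SequenceMatcher


def keyword_similarity(first_keyword, second_keywords):
    """Best string similarity between the first keyword and any second keyword.

    SequenceMatcher.ratio() is an assumed stand-in; the source does not
    specify how the similarity is computed.
    """
    return max(
        SequenceMatcher(None, first_keyword, kw).ratio() for kw in second_keywords
    )


def merge_categories(first, second, threshold=0.6):
    """Merge first/second categories whose keyword similarity meets the threshold.

    first:  {first_keyword: set of item ids}
    second: {tuple of second keywords: set of item ids}
    Returns (third, remaining_first, remaining_second).
    """
    third = {}
    remaining_first = dict(first)
    remaining_second = dict(second)
    for f_kw, f_items in first.items():
        for s_kws, s_items in second.items():
            if s_kws not in remaining_second:
                continue  # this second category was already merged
            if keyword_similarity(f_kw, s_kws) >= threshold:
                # Keep both keyword sources so they can be output together later.
                third[(f_kw, s_kws)] = f_items | s_items
                del remaining_first[f_kw]
                del remaining_second[s_kws]
                break
    return third, remaining_first, remaining_second


# Hypothetical sample data: item ids grouped by keyword(s).
first = {"phishing": {1, 2}, "account theft": {3}}
second = {("phishing site", "kit"): {2, 4}, ("money mule",): {5}}
third, rest_first, rest_second = merge_categories(first, second)
```

In this sketch each category is merged at most once, and the categories left in `rest_first` and `rest_second` correspond to the remaining first and second categories that could not be merged.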
According to an embodiment of the present disclosure, before the merging the first category and the second category, whose similarity satisfies a threshold condition, into a third category, the method further includes: performing word frequency statistics on an information set formed by all the black product information to be analyzed in each second category to obtain a word frequency statistical result; and selecting at least one word with the highest word frequency in the word frequency statistical result to obtain the at least one second keyword.
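A minimal sketch of this word frequency step, using Python's `collections.Counter`. Whitespace tokenization, the function name `top_keywords`, and the sample texts are assumptions made for illustration; Chinese-language intelligence would need a word segmenter instead.

```python
from collections import Counter


def top_keywords(category_texts, k=2):
    """Return the k most frequent words across all texts of one second category."""
    counts = Counter()
    for text in category_texts:
        # Assumed tokenization: lowercase and split on whitespace.
        counts.update(text.lower().split())
    return [word for word, _ in counts.most_common(k)]


# Hypothetical intelligence texts belonging to one second category.
cluster = [
    "bank card data for sale",
    "selling bank card dumps",
    "fresh card",
]
keywords = top_keywords(cluster)
```

A real deployment would likely also drop stop words before counting, so that function words do not become second keywords.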
According to an embodiment of the present disclosure, after ranking the at least one black product intelligence category by the number of pieces of black product intelligence to be analyzed in each category, the method further comprises: outputting information on the group of black product intelligence categories that rank highest by the number of pieces of black product intelligence to be analyzed.
According to an embodiment of the present disclosure, outputting information on the group of top-ranked black product intelligence categories includes: when the output black product intelligence category is a first category, outputting the first keyword corresponding to that first category; when the output black product intelligence category is a second category, outputting the at least one second keyword corresponding to that second category; and when the output black product intelligence category is a third category, outputting both the first keyword of the first category and the at least one second keyword of the second category from which the third category was merged.
According to an embodiment of the present disclosure, clustering the M pieces of black product intelligence to be analyzed to obtain S second categories includes: converting each word in the black product intelligence to be analyzed into a word vector; accumulating the word vectors of all words in each piece of black product intelligence to be analyzed to obtain a text vector for that piece; and clustering the M pieces of black product intelligence to be analyzed based on their M text vectors.
According to an embodiment of the present disclosure, classifying the M pieces of black product intelligence to be analyzed based on the predetermined rule to obtain at least one black product intelligence category includes: classifying the M pieces of black product intelligence to be analyzed based on keywords set from prior experience to obtain N first categories, wherein each first category serves as one black product intelligence category.
According to an embodiment of the present disclosure, classifying the M pieces of black product intelligence to be analyzed based on keywords set from prior experience includes: acquiring N preset first keywords, wherein N is an integer greater than 1; and assigning the black product intelligence to be analyzed that matches each first keyword to the category corresponding to that first keyword.
According to an embodiment of the present disclosure, processing the raw black product intelligence into structured black product intelligence to be analyzed includes: identifying the data type of the raw black product intelligence; extracting the natural language text in the raw black product intelligence using a processing mode corresponding to the data type; and structuring the text to obtain the black product intelligence to be analyzed.
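The three output cases can be illustrated with a small Python sketch. The `Category` record and its field names are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    kind: str                      # "first", "second", or "third"
    first_keyword: str = ""        # preset keyword (first/third categories)
    second_keywords: list = field(default_factory=list)  # frequent words
    items: list = field(default_factory=list)


def output_keywords(category):
    """Keywords to report for a category, depending on how it was formed."""
    if category.kind == "first":
        return [category.first_keyword]
    if category.kind == "second":
        return list(category.second_keywords)
    if category.kind == "third":
        # A merged category reports the keywords of both of its sources.
        return [category.first_keyword] + list(category.second_keywords)
    raise ValueError(f"unknown category kind: {category.kind}")
```

Keeping both keyword sources on a merged category is what allows the third case to report the first keyword and the second keywords together.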
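The sketch below illustrates the text vector construction and one possible clustering over it. The disclosure does not name a clustering algorithm, so the greedy cosine-similarity grouping, the threshold of 0.8, and the toy two-dimensional word vectors are assumptions; in practice the word vectors would come from a trained embedding model.

```python
import math

# Toy 2-dimensional word vectors (hypothetical values for illustration).
WORD_VECTORS = {
    "phishing": (1.0, 0.0),
    "site": (0.9, 0.1),
    "kit": (0.8, 0.2),
    "mule": (0.0, 1.0),
    "account": (0.1, 0.9),
}


def text_vector(words, dim=2):
    """Accumulate (sum) the word vectors of all known words in one item."""
    total = [0.0] * dim
    for word in words:
        for i, value in enumerate(WORD_VECTORS.get(word, (0.0,) * dim)):
            total[i] += value
    return total


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    norm = math.hypot(*u) * math.hypot(*v)
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def cluster(vectors, threshold=0.8):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each entry: (seed vector, list of item indices)
    for index, vec in enumerate(vectors):
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((vec, [index]))
    return [members for _, members in clusters]


items = [["phishing", "site"], ["mule", "account"], ["phishing", "kit"]]
groups = cluster([text_vector(words) for words in items])
```

Each returned group holds the indices of the items in one second category. Summing word vectors makes longer texts produce longer vectors, which is why the sketch compares directions with cosine similarity rather than raw distances.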
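As an illustration, keyword-based classification might look like the following Python sketch. Plain substring matching is an assumption, since the disclosure only says the intelligence is "matched" with each first keyword; the function name and sample texts are likewise hypothetical.

```python
def classify_by_keywords(items, first_keywords):
    """Assign each item to every first category whose keyword it contains."""
    categories = {keyword: [] for keyword in first_keywords}
    for item in items:
        for keyword in first_keywords:
            if keyword in item:  # assumed matching rule: substring match
                categories[keyword].append(item)
    return categories


# Hypothetical structured intelligence texts and preset first keywords.
items = [
    "selling phishing kit for online banking",
    "money mule accounts wanted",
    "phishing pages, money mule cash-out included",
]
first_categories = classify_by_keywords(items, ["phishing", "money mule"])
```

In this sketch an item that matches several first keywords is counted in each matching category, and an item that matches none is left unclassified.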
In another aspect of the disclosed embodiments, an apparatus for predicting black product attack trends is provided. The apparatus comprises an acquisition module, a structured processing module, a category division module, and a ranking module. The acquisition module is used for acquiring information published on at least one black product trading platform to obtain M pieces of raw black product intelligence, wherein M is an integer greater than 1. The structured processing module is used for processing the raw black product intelligence into structured black product intelligence to be analyzed, wherein the M pieces of raw black product intelligence yield M corresponding pieces of black product intelligence to be analyzed. The category division module is used for classifying the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category. The ranking module is used for ranking the at least one black product intelligence category by the number of pieces of black product intelligence to be analyzed in each category, so as to predict the black product attack trend from the ranking result.
According to an embodiment of the present disclosure, the category classification module includes a clustering submodule. The clustering submodule is used for clustering the M black production intelligence to be analyzed to obtain S second categories, and S is an integer greater than or equal to 1.
According to the embodiment of the disclosure, the category classification module further comprises a matching sub-module and a category integration sub-module. The matching submodule is used for carrying out category division on the M black product information to be analyzed based on keywords set by prior experience to obtain N first categories, wherein N is an integer larger than 1. The category integration sub-module is configured to merge the first category and the second category, of which the similarity satisfies a threshold condition, into a third category, and obtain the at least one black production intelligence category based on the third category obtained through the merging and the remaining first category and the second category that cannot be merged.
According to an embodiment of the present disclosure, the category classification module further includes an indexing sub-module. The indexing submodule is used for carrying out word frequency statistics on an information set formed by all the black production intelligence to be analyzed in each second category to obtain a word frequency statistical result, and selecting at least one word with the highest word frequency in the word frequency statistical result to obtain at least one second keyword.
According to an embodiment of the present disclosure, the apparatus further comprises an output module. The output module is used for outputting the information of a group of black production information categories with the quantity of the black production information to be analyzed ranked at the top in the at least one black production information category.
A third aspect of the present disclosure provides an electronic device. The electronic device includes: one or more processors, and a memory. The memory is used to store one or more programs. Wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above-described method.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described method.
A fifth aspect of the disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above method.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a method, apparatus, electronic device, medium, and program product for predicting black product attack trends according to embodiments of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a method for predicting black product attack trends in accordance with an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart for structured processing of raw black production intelligence in a method according to an embodiment of the disclosure;
FIG. 4 schematically illustrates various rules for categorizing the M pieces of black product intelligence to be analyzed in a method according to an embodiment of the disclosure;
FIG. 5 schematically illustrates a flow chart for categorizing the M pieces of black product intelligence to be analyzed based on prior experience according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart for clustering the M pieces of black product intelligence to be analyzed according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a flow chart for categorizing the M pieces of black product intelligence to be analyzed according to an embodiment of the present disclosure;
FIG. 8 schematically illustrates the output of black product intelligence categories according to an embodiment of the present disclosure;
FIG. 9 schematically illustrates a block diagram of an apparatus for predicting black product attack trends according to an embodiment of the present disclosure;
FIG. 10 schematically illustrates a block diagram of the category division module in an apparatus for predicting black product attack trends according to an embodiment of the present disclosure;
FIG. 11 schematically illustrates a block diagram of an apparatus for predicting black product attack trends according to another embodiment of the present disclosure;
FIG. 12 schematically illustrates a block diagram of the black product intelligence extraction module in the apparatus shown in FIG. 11;
FIG. 13 schematically illustrates a block diagram of the black product intelligence analysis module in the apparatus shown in FIG. 11;
FIG. 14 schematically illustrates a flow chart for predicting black product attack trends using the apparatus shown in FIG. 11; and
FIG. 15 schematically illustrates a block diagram of an electronic device adapted to implement a method for predicting black product attack trends according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
At present, threat perception and risk prevention in financial enterprises focus mainly on the account and transaction dimensions. Attack behaviors or external threats can therefore be discovered and identified only after they have occurred, which generally means that data or economic losses have already been incurred.
In the course of long-term information security work in the financial field, the inventors found that black product activities have formed a black product trading industry chain in the real world. For example, some lawbreakers develop black product tools, some purchase those tools to carry out attacks, and some specialize in stealing data and then buying and selling it. These lawbreakers often publish buy and sell offers on black product trading platforms (e.g., forums or social network chat groups) by covert means (e.g., agreed-upon black product jargon). Based on this finding, the inventors reasoned that if a system can connect to these black product trading platforms, obtain the information published there in time, and analyze it to derive the characteristics of the current black product attack trend, then the subsequent behavior of black products can be anticipated and the attack purposes and attack methods that lawbreakers are likely to adopt can be identified. This strengthens the user's perception of potential attacks, allows possible vulnerabilities in the system to be remedied in time, and enables control measures to be taken before an attack is carried out.
In view of this, embodiments of the present disclosure provide a method, apparatus, system, medium, and program product for predicting black product attack trends. The method first obtains information published on at least one black product trading platform to obtain M pieces of raw black product intelligence. The raw black product intelligence is then processed into structured black product intelligence to be analyzed, wherein the M pieces of raw black product intelligence yield M corresponding pieces of black product intelligence to be analyzed. Next, the M pieces of black product intelligence to be analyzed are classified based on a predetermined rule to obtain at least one black product intelligence category. Finally, the at least one black product intelligence category is ranked by the number of pieces of black product intelligence to be analyzed in each category, so as to predict the black product attack trend from the ranking result. In this way, threat perception along the black product intelligence dimension is introduced: attacks are anticipated from black product intelligence, and possible attack targets and/or attack methods are identified in advance, so that prevention and response can take place before an attack is carried out.
Fig. 1 schematically shows an application scenario diagram of a method, an apparatus, an electronic device, a medium, and a program product for predicting black product attack tendency according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a terminal device 101, an external website server 102, a background server 103, and a user terminal 104. The background server 103 may be in communication connection with the terminal device 101 and/or the external website server 102 via a network (e.g., the internet). Further, the background server 103 may be communicatively connected to the user terminal 104 via a network (e.g., an intranet).
The background server 103 may execute the method of the present disclosure: it obtains raw black product intelligence from the corresponding black product trading platform through a communication connection with the terminal device 101 and/or the external website server 102, and then outputs the result of processing and analyzing that intelligence to the user terminal 104 for display to the user 105. The terminal device 101 and the user terminal 104 may be the same terminal or different terminals.
The background server 103 can obtain raw black product transaction intelligence through communication with the terminal device 101. For example, the terminal device 101 logs in to a forum, social network chat group, or website page used for black product transactions, and then crawls information from it. Login credentials for such a forum, chat group, or website page may be obtained, for example, by identity masquerading, after which the login is performed on the terminal device 101. After login, whenever messages are published on these black product trading platforms, they can be crawled and sent to the background server 103 for processing.
The background server 103 may also obtain raw black product transaction intelligence through communication with the external website server 102. For example, the external website server 102 may be the server of a social website on which information security personnel, through long-term research and tracking, have found forums or social groups carrying black product information. Alternatively, the external website server 102 may be, for example, a website dedicated to black product transactions that information security personnel have discovered through long-term tracking. The background server 103 may connect to the external website server 102 through a bot or an API, and crawl the information published on the black product trading platform from specific services or programs on the external website server 102.
The method for predicting the black product attack tendency provided by the embodiment of the disclosure can be generally executed by the background server 103. Accordingly, the apparatus, electronic device, medium, and program product for predicting black product attack tendency provided by the embodiments of the present disclosure may be generally disposed in the backend server 103.
It is understood that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be used in other devices, systems, environments or scenarios.
It should be noted that the method, the apparatus, the electronic device, the medium, and the program product for predicting the black product attack trend determined in the embodiments of the present disclosure may be used in the financial field, and may also be used in any field other than the financial field, and the present disclosure does not limit the application field.
In addition, in the technical solution of the present disclosure, the acquisition, storage, use, and other handling of users' personal information comply with the relevant laws and regulations, the necessary security measures are taken, and public order and good customs are not violated.
The method for predicting the black production attack tendency of the disclosed embodiment will be described in detail below through fig. 2 to 8 based on the scenario described in fig. 1.
Fig. 2 schematically shows a flow chart of a method for predicting black product attack trends according to an embodiment of the disclosure.
As shown in fig. 2, the method for predicting black product attack trends of this embodiment may include operations S210 to S240, and the method may be performed by the background server 103.
First, in operation S210, information published on at least one black product trading platform is obtained to obtain M pieces of raw black product intelligence, where M is an integer greater than 1. Information sources generally include black product trading venues (e.g., posts or group discussions on specific subjects) on social media platforms such as forums, QQ groups, WeChat groups, and Telegram. The published information mainly includes, but is not limited to, black product attack tools, trading of sensitive information, trading of money mule accounts, money laundering, and traffic diversion ("drainage") and fake-volume inflation ("water injection"). The information published on black product trading platforms can be acquired using crawlers, bots, APIs, and similar methods; tools such as QQ group bots, Telegram bots, and forum crawlers are used to collect raw black product intelligence for threat analysis.
Then, in operation S220, the raw black product intelligence is processed into structured black product intelligence to be analyzed, wherein the M pieces of raw black product intelligence yield M corresponding pieces of black product intelligence to be analyzed.
Raw black product intelligence usually contains a large amount of content such as pictures, icons, symbols, or XML structures. According to an embodiment of the disclosure, interfering data such as symbols and pictures can be filtered out by means such as regular-expression matching, key identifier replacement, OCR text recognition, and XML tag extraction, so that valid natural language text is obtained. The natural language text is then structured to obtain the black product intelligence to be analyzed.
Next, in operation S230, the M pieces of black product intelligence to be analyzed are classified based on a predetermined rule to obtain at least one black product intelligence category. For example, classification may be performed by attack target, attack means, attack channel, and/or the type of business involved. Alternatively, natural language processing techniques can be used to automatically classify or cluster the M pieces of black product intelligence to be analyzed.
Then, in operation S240, at least one black product information category is ranked according to the quantity of the black product information to be analyzed contained in each black product information category, so as to predict the attack trend of the black product according to the ranking result.
By classifying the collected black product intelligence into categories and ranking the categories by the quantity of intelligence in each, the top-ranked categories are identified as those currently most active in the black product industry chain. The attack targets, attack means, attack channels, and so on of the top-ranked categories reflect the current hotspot attack trend of black products, so that a user can be guided to check for vulnerabilities in the system in time and to take preventive and responsive measures before an attack is carried out.
According to the embodiment of the disclosure, the potential threat of the black product is discovered by actively detecting the information issued in the black product transaction platform, so that the user can be helped to discover the risk vulnerability before the black product attack, and accordingly, the processing actions such as adjustment, investigation and the like are performed, and the threat perception and the coping and protection capability to the black product attack are improved.
Fig. 3 schematically shows a flowchart for structuring the raw black intelligence in operation S220 in a method according to an embodiment of the present disclosure.
As shown in fig. 3, the process of structuring the raw black production intelligence in operation S220 according to the embodiment may include operations S301 to S303.
First, in operation S301, the data type of the original black product intelligence is identified. For example, the data type is distinguished by file extension, and includes but is not limited to picture, text, and tag languages such as XML.
Then, in operation S302, the text information described in natural language in the original black product intelligence is obtained by using a processing method corresponding to the data type. For example, if the data type is a picture, the text in the picture can be extracted through OCR text recognition. If the data type is text, special symbols and icons may be removed by regular-expression matching. If the data type is a tag language such as XML (e.g., crawled web page data), the tags may be stripped and the text information retained.
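The dispatch described in operations S301 and S302 can be sketched roughly as follows. This is only an illustration under stated assumptions: the extension sets, the function names, and the `ocr` callable are hypothetical and do not come from the disclosure, and the markup branch assumes well-formed XML (crawled HTML would usually need a more tolerant parser).

```python
import re
from pathlib import Path
from xml.etree import ElementTree

# Hypothetical extension sets; the disclosure only says the type is
# distinguished by suffix name.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif"}
MARKUP_EXTS = {".xml", ".html", ".htm"}


def identify_type(filename: str) -> str:
    """S301: label a raw item as 'picture', 'markup', or 'text' by extension."""
    ext = Path(filename).suffix.lower()
    if ext in IMAGE_EXTS:
        return "picture"
    if ext in MARKUP_EXTS:
        return "markup"
    return "text"


def strip_symbols(text: str) -> str:
    """Replace icons and special symbols with spaces, then collapse whitespace."""
    cleaned = re.sub(r"[^\w\s.,:;!?/@-]", " ", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def extract_markup_text(markup: str) -> str:
    """Drop the tags of a well-formed XML string and keep only its text nodes."""
    root = ElementTree.fromstring(markup)
    return " ".join(t.strip() for t in root.itertext() if t.strip())


def extract_text(filename: str, payload: str, ocr=None) -> str:
    """S302: apply the processing method that matches the data type."""
    kind = identify_type(filename)
    if kind == "picture":
        if ocr is None:
            raise ValueError("an OCR callable is required for pictures")
        # 'ocr' is a caller-supplied placeholder for any OCR engine.
        return strip_symbols(ocr(payload))
    if kind == "markup":
        return strip_symbols(extract_markup_text(payload))
    return strip_symbols(payload)
```

Passing the OCR engine in as a callable keeps the sketch independent of any particular OCR library.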
Then, in operation S303, the text message is structured to obtain the black production information to be analyzed.
In one embodiment, the structured features of the black product intelligence to be analyzed may include fields such as message sending time, sending account name, message details, and links contained in the message. For example, when the black product intelligence to be analyzed is recorded in tabular form, each piece of intelligence forms one record in the table, and the fields of the record correspond to the fields of the structured features. Each piece of original black product intelligence yields one record. For example, when information is crawled from a group chat, each utterance is one piece of original black product intelligence and is processed into one record. When a large amount of original black product intelligence is crawled, the processing of operation S220 produces a large number of records, that is, a series of black product intelligence to be analyzed.
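As a minimal sketch of such a record, one utterance can be mapped to one row with the four fields named above. The class name, field names, and the URL pattern are illustrative assumptions rather than part of the disclosure.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Simplified URL pattern, assumed here for illustration only.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


@dataclass
class IntelRecord:
    """One piece of black product intelligence to be analyzed (one table row)."""
    sent_at: str                                     # message sending time
    sender: str                                      # sending account name
    content: str                                     # message details
    links: List[str] = field(default_factory=list)   # links contained in the message


def to_record(sent_at: str, sender: str, text: str) -> IntelRecord:
    """Build one record from one cleaned utterance, extracting any links."""
    return IntelRecord(sent_at=sent_at, sender=sender, content=text,
                       links=URL_PATTERN.findall(text))
```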
Fig. 4 schematically shows various rule schematics for class classification of M black production intelligence to be analyzed in operation S230 in a method according to an embodiment of the present disclosure.
As shown in fig. 4, M black production intelligence to be analyzed may be classified by one of operation S231, operation S232, or operation S233 in operation S230 according to an embodiment of the present disclosure, wherein rules for classifying the categories in operation S231, operation S232, or operation S233 are different from each other.
Specifically, in operation S231, the M black product intelligence to be analyzed may be classified based on the keywords set by the prior experience, and the intelligence to be analyzed matching the keywords are put together to obtain N first classes, where N is an integer greater than 1. In one embodiment, each of the first categories may be used as one of the categories of black production intelligence obtained in operation S230.
In operation S232, S second categories may be obtained by clustering M black production intelligence to be analyzed, where S is an integer greater than or equal to 1. In one embodiment, each of the second categories may be used as one category of black production intelligence obtained in operation S230.
The different classification rules in operation S231 and operation S232 may be selected based on the consideration of whether the classification result is controllable and helpful to find new risks in practical applications.
Specifically, when the classification is performed based on the keywords set based on the a priori experience in operation S231, the classification result may be artificially guided by the setting of the keywords. For example, a potential attacked object (for example, the name of the organization where the user is located and its substitute word) concerned by the user can be used as a keyword, so that the attack tendency for the potential attacked object can be known after classification.
The categories formed by clustering in operation S232 help to discover potential, unknown attack risks. Because the categories emerge from clustering, attack trend information is not lost through manual intervention, and the current overall attack trend of the whole black product industry chain can be observed. However, if the current hotspot attack target in the black product industry chain is not a potential attacked object that the user cares about, clustering tends to surface a large amount of black product intelligence that is of no interest to the user.
Therefore, classification based on prior experience in operation S231 and classification by clustering in operation S232 each have advantages and disadvantages, and either can be selected according to the requirements of actual use. Alternatively, in other embodiments, the two approaches may be combined, as in operation S233.
In operation S233, on the one hand, the M pieces of black product intelligence to be analyzed are classified based on keywords set from prior experience to obtain N first categories; on the other hand, S second categories are obtained by clustering the M pieces of black product intelligence to be analyzed. Next, the N first categories and the S second categories are de-duplicated and merged: similar categories produced by the two different rules are merged together, and dissimilar categories are retained, so as to obtain integrated categories, which serve as the at least one black product intelligence category of operation S230. In this way, operation S233 allows the classification result to be guided by prior experience while new risks are still discovered through clustering.
Fig. 5 schematically shows a flowchart for categorizing M pieces of black product intelligence to be analyzed according to prior experience according to an embodiment of the present disclosure.
As shown in fig. 5, the process of performing category classification based on a priori experience in operation S231 according to the embodiment may include operations S501 to S502.
In operation S501, N preset first keywords are obtained, where N is an integer greater than 1. A first keyword can be, for example, an attack-specific term, a target object of a black product attack, the name of a black product attack channel, or a type of black product scheme. If it is necessary to detect whether black products are attacking a certain bank system, the target object of the black product attack may be the name, alias, or code of the bank; the attack channel name may be, for example, mobile banking, an e-commerce platform, or online banking; and the type of black product scheme may be, for example, fraudulent card use, unauthorized deduction, or trading of facial information.
In operation S502, the black product intelligence to be analyzed that matches each first keyword is assigned to the category corresponding to that first keyword. When a first keyword matches one or more pieces of black product intelligence to be analyzed, those pieces are regarded as belonging to one category, and the first keyword reflects the attack trend of the category, for example by serving as the index of the category.
Since different first keywords reflect different attack trends, the matching against the black product intelligence to be analyzed is repeated for each first keyword. That is, during category classification, one first keyword at a time is matched against all M pieces of black product intelligence to be analyzed.
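Operations S501 and S502 can be sketched as below. The sketch assumes plain case-insensitive substring matching, which the disclosure does not specify; the function name and sample data are hypothetical.

```python
from typing import Dict, List


def classify_by_keywords(texts: List[str], keywords: List[str]) -> Dict[str, List[int]]:
    """Match each first keyword against all M texts, one keyword at a time.

    Returns a mapping from keyword (the category index) to the positions of
    the matching texts. Keywords that match nothing produce no category.
    """
    categories: Dict[str, List[int]] = {}
    for kw in keywords:
        matched = [i for i, text in enumerate(texts) if kw.lower() in text.lower()]
        if matched:
            categories[kw] = matched
    return categories


texts = [
    "mobile banking phishing kit",
    "face data for sale",
    "Mobile Banking sms interception",
]
first_categories = classify_by_keywords(texts, ["mobile banking", "face data", "e-commerce"])
```

Because every keyword is matched independently, one piece of intelligence may fall into several first categories.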
FIG. 6 schematically illustrates a flow chart for clustering M pieces of black product intelligence to be analyzed according to an embodiment of the disclosure.
As shown in fig. 6, the process of clustering in operation S232 according to this embodiment may include operations S601 to S603.
First, in operation S601, each word in the black information to be analyzed is converted into a word vector. For example, the text in the black information to be analyzed can be segmented, and then sent to the word2vec model to obtain the vector space representation of each word.
Then, in operation S602, word vectors corresponding to all words in each black product information to be analyzed are accumulated to obtain a text vector corresponding to the black product information to be analyzed. For example, the segmented blackout intelligence n1 to be analyzed contains k words, which are denoted as w1, w2, … and wk, and the vector representations of these words are denoted as v1, v2, … and vk, respectively, so the text vector of the blackout intelligence n1 to be analyzed can be denoted as v1+ v2+ … + vk.
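The accumulation v1 + v2 + … + vk can be illustrated with a small pure-Python sketch. The toy two-dimensional vectors stand in for word2vec output, and skipping out-of-vocabulary words is an assumption made here, not something the disclosure states.

```python
from typing import Dict, List, Sequence


def text_vector(words: Sequence[str],
                word_vectors: Dict[str, List[float]],
                dim: int) -> List[float]:
    """Sum the vectors of all words in one segmented piece of intelligence."""
    total = [0.0] * dim
    for w in words:
        vec = word_vectors.get(w)
        if vec is None:
            # Assumption: words without a vector are skipped.
            continue
        for j in range(dim):
            total[j] += vec[j]
    return total


# Toy vectors standing in for a trained word2vec model.
toy_vectors = {"bank": [1.0, 0.0], "card": [0.5, 2.0]}
n1_vector = text_vector(["bank", "card", "unknown"], toy_vectors, dim=2)
```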
Next, in operation S603, the M pieces of black product intelligence to be analyzed are clustered based on their corresponding text vectors. A clustering algorithm, such as the affinity propagation clustering algorithm, is used to divide the M pieces of black product intelligence to be analyzed into different attack trend clusters (i.e., the second categories).
After operation S603, word frequency statistics may be performed on the information set formed by all black product intelligence to be analyzed in each second category to obtain a word frequency statistical result, and at least one word with the highest word frequency in that result is selected as the at least one second keyword. The at least one second keyword reflects the attack trend of the second category and serves as the attack trend keyword, and index, of the cluster. In one embodiment, the 3 words with the highest frequency of occurrence in each cluster are selected as the attack trend keywords of the cluster and used as its index.
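The word frequency step can be sketched with the standard library as follows. The function name and sample tokens are hypothetical, and the sketch assumes the intelligence in a cluster has already been segmented into words.

```python
from collections import Counter
from typing import List


def cluster_keywords(tokenized_docs: List[List[str]], top_n: int = 3) -> List[str]:
    """Return the top_n most frequent words across all documents of one cluster."""
    counts = Counter(word for doc in tokenized_docs for word in doc)
    return [word for word, _ in counts.most_common(top_n)]


cluster = [["bank", "card", "sale"], ["bank", "card"], ["bank", "tool"]]
second_keywords = cluster_keywords(cluster, top_n=2)
```

In practice, common function words would probably need to be filtered out before counting, otherwise they may dominate the result; the disclosure does not discuss this.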
Fig. 7 schematically shows a flowchart for categorizing M pieces of black product intelligence to be analyzed according to an embodiment of the present disclosure.
As shown in fig. 7, the process of performing category division in operation S233 according to the embodiment may include operation S231, operation S232, and operations S703 to S704.
Specifically, in operation S233, the classes are first classified according to different rules through operations S231 and S232 to obtain N first classes and S second classes, and then the classes obtained by the two different rules are de-duplicated and integrated through operations S703 to S704. For operations S231 and S232, reference may be made to the related descriptions of fig. 5 and fig. 6 above.
In operation S703, the first category and the second category, of which the similarity satisfies the threshold condition, are merged into a third category. Only one of the identical black product intelligence to be analyzed in the first category and the second category can be retained in the merging process, so that the influence of the repeated black product intelligence to be analyzed in the merging on the amount of intelligence in the merged category can be avoided, and the negative influence on the sorting in operation S240 is reduced.
Each first category may be compared with all other second categories one by one during merging, and when the similarity satisfies a threshold condition, the corresponding first category and second category are merged. Then, the remaining first category is compared with all other remaining second categories one by one. And so on until all the first categories are traversed.
In the specific comparison, for example, the first keyword set for the first category and the at least one second keyword corresponding to the second category (i.e., the at least one word with the highest frequency of occurrence in the second category) may be obtained. Then, the similarity between the first category and the second category is characterized by the similarity between the first keyword and the at least one second keyword. For example, assuming that the second category corresponds to 3 second keywords, when calculating the similarity, the first keyword and the 3 second keywords may each be vectorized (for example, by a word2vec model), and the similarity (for example, cosine similarity) between the first keyword and each of the 3 second keywords is then calculated to obtain $sim_1$, $sim_2$, and $sim_3$. The mean $sim$ is then calculated as follows:
$$sim = \frac{sim_1 + sim_2 + sim_3}{3} \tag{1}$$
The mean $sim$ may be used to measure the similarity of the first category and the second category. When the mean $sim$ is larger than a threshold (for example, 0.9), the similarity between the compared first category and second category is considered to satisfy the threshold condition and the two are treated as the same category, so the intelligence to be analyzed in them can be merged to obtain a third category, and the quantity of intelligence in the third category is updated accordingly.
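A small sketch of this comparison, assuming cosine similarity and the example threshold of 0.9, is given below. The function names are hypothetical, and the toy vectors stand in for word2vec representations of the keywords.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_similarity(first_kw_vec: Sequence[float],
                    second_kw_vecs: List[Sequence[float]]) -> float:
    """Average the similarity between the first keyword and each second keyword."""
    sims = [cosine(first_kw_vec, v) for v in second_kw_vecs]
    return sum(sims) / len(sims)


def should_merge(first_kw_vec: Sequence[float],
                 second_kw_vecs: List[Sequence[float]],
                 threshold: float = 0.9) -> bool:
    """Threshold condition: merge when the mean similarity exceeds the threshold."""
    return mean_similarity(first_kw_vec, second_kw_vecs) > threshold
```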
In operation S704, at least one black production intelligence category is obtained based on the combined third category and the remaining first and second categories that cannot be combined. As described above, the obtained at least one black-product intelligence category can guide the classification result according to prior experience classification and can find new risks through clustering.
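The integration in operations S703 and S704 can be sketched as follows. The sketch represents each category as a set of record identifiers so that intelligence appearing in both merged categories is counted once; the `similar` callable, the category labels, and the naming of the merged category are all assumptions made for illustration.

```python
from typing import Callable, Dict, Set


def integrate(first: Dict[str, Set[int]],
              second: Dict[str, Set[int]],
              similar: Callable[[str, str], bool]) -> Dict[str, Set[int]]:
    """Merge similar first/second categories and keep the rest unchanged."""
    result: Dict[str, Set[int]] = {}
    remaining_second = dict(second)
    for f_name, f_ids in first.items():
        # Compare this first category with the remaining second categories.
        match = next((s for s in remaining_second if similar(f_name, s)), None)
        if match is None:
            result[f_name] = set(f_ids)
        else:
            s_ids = remaining_second.pop(match)
            # Set union keeps only one copy of identical intelligence.
            result[f"{f_name}+{match}"] = set(f_ids) | s_ids
    # Second categories that could not be merged are retained as they are.
    for s_name, s_ids in remaining_second.items():
        result[s_name] = set(s_ids)
    return result


first = {"mobile banking": {0, 2}, "face data": {1}}
second = {"c0": {2, 3}, "c1": {4}}
merged = integrate(first, second,
                   lambda f, s: (f, s) == ("mobile banking", "c0"))
```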
FIG. 8 schematically shows a schematic diagram of outputting black product intelligence categories according to an embodiment of the disclosure.
Referring to fig. 8 in combination with fig. 1, according to the embodiment of the disclosure, after operation S240 ranks the black product intelligence categories by the quantity of intelligence each contains, information on the group of top-ranked categories may be output, for example the 10 black product intelligence categories containing the largest quantity of intelligence.
When the black product information category is output, the attack trend keyword reflecting each black product information category can be correspondingly output as an index. For example, referring to fig. 8, when the output category of the black production intelligence is the first category, the first keyword corresponding to the first category may be output accordingly. When the output black production intelligence category is the second category, at least one second keyword corresponding to the second category can be correspondingly output. When the output black production intelligence category is a third category, a first keyword corresponding to the first category before the third category is obtained by combination and at least one second keyword corresponding to the second category before the third category is obtained by combination can be correspondingly output.
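The ranking of operation S240 and the top-10 output can be sketched as below, with each category keyed by its index keyword(s). The function name and sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def rank_categories(categories: Dict[str, Set[int]],
                    top_k: int = 10) -> List[Tuple[str, int]]:
    """Sort categories by the quantity of intelligence, largest first."""
    counted = [(name, len(ids)) for name, ids in categories.items()]
    ranked = sorted(counted, key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


categories = {"face data": {1}, "mobile banking": {0, 2, 3}, "sms gateway": {4, 5}}
top_two = rank_categories(categories, top_k=2)
```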
Therefore, after information on the group of black product intelligence categories with the largest quantity of intelligence is output according to the ranking result, a user can quickly judge the current black product attack trend from the result, for example by learning the potential hotspot attack targets, attack means, or attack channels of black products. This can also help the user discover previously unrecognized vulnerabilities in the system and fix them in time, or disrupt attacks that lawbreakers are about to carry out as early as possible.
Based on the method for predicting the black product attack trend introduced above, the embodiment of the disclosure also provides a device for predicting the black product attack trend. The apparatus will be described in detail below with reference to fig. 8 and 9.
Fig. 9 schematically shows a block diagram of the structure of an apparatus for predicting a black product attack trend according to an embodiment of the present disclosure.
As shown in fig. 9, the apparatus 900 for predicting black production attack tendency of this embodiment may include an obtaining module 910, a structuring processing module 920, a category dividing module 930, and a sorting module 940. According to other embodiments of the present disclosure, the apparatus 900 may further include an output module 950. The apparatus 900 may be used to implement the methods described with reference to fig. 2-8.
The obtaining module 910 is configured to obtain information published in at least one blackout trading platform to obtain M pieces of original blackout information; wherein M is an integer greater than 1. In an embodiment, the obtaining module 910 may be configured to perform the operation S210 described above.
The structuralization processing module 920 is configured to process the raw black production information into structuralized black production information to be analyzed, where M raw black production information correspondingly obtains M black production information to be analyzed. In an embodiment, the structuring processing module 920 may be configured to perform the operation S220 described above.
The category classification module 930 is configured to classify the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category. In an embodiment, the category classification module 930 may be configured to perform operation S230 described above.
The ranking module 940 is configured to rank the at least one black product intelligence category according to the quantity of the black product intelligence to be analyzed contained in each black product intelligence category, so as to predict the attack trend of the black products according to the ranking result. In an embodiment, the sorting module 940 may be configured to perform the operation S240 described above.
The output module 950 is used for outputting the information of a group of black product information categories with the highest rank of the number of black product information to be analyzed in at least one black product information category. In one embodiment, the output module 950 can output the black production intelligence type information in the manner illustrated in fig. 8.
Fig. 10 schematically shows a block diagram of the structure of the category classification module 930 in the apparatus 900 for predicting black product attack tendency according to an embodiment of the present disclosure.
As shown in FIG. 10, the category classification module 930 may include a clustering sub-module 931, a matching sub-module 932, a category integration sub-module 933, and an indexing sub-module 934.
The clustering submodule 931 is configured to cluster the M black production intelligence to be analyzed to obtain S second categories, where S is an integer greater than or equal to 1. In an embodiment, the clustering sub-module 931 may be configured to perform the operation S232 described above.
The matching submodule 932 is configured to perform category division on the M black product information to be analyzed based on a keyword set by prior experience to obtain N first categories, where N is an integer greater than 1. In an embodiment, the matching sub-module 932 may be configured to perform the operation S231 described above.
The category integrating submodule 933 is configured to merge the first category and the second category, of which the similarity satisfies the threshold condition, into a third category, and obtain at least one category of black production intelligence based on the merged third category and the remaining first category and second category that cannot be merged. In an embodiment, the category integration submodule 933 may be configured to perform the operations S703 to S704 described above.
The indexing sub-module 934 is configured to perform word frequency statistics on an information set formed by all black product information to be analyzed in each second category to obtain a word frequency statistical result, and select at least one word with the highest word frequency in the word frequency statistical result to obtain at least one second keyword. Wherein, for the second category obtained by clustering, the at least one second keyword can be used for representing the attack trend of the second category. The index sub-module 934 may further be configured to, for each first category, use the first keyword set for the first category as an index to characterize the attack trend of the first category.
Based on the method for predicting the black product attack trend described above, another embodiment of the present disclosure also provides an apparatus for predicting the black product attack trend. The apparatus will be described in detail below with reference to fig. 11 to 14. It should be noted that the device and the process of using the device described in fig. 11 to 14 are only an exemplary application and do not limit the disclosure.
Fig. 11 schematically shows a block diagram of an apparatus 1100 for predicting black product attack tendency according to another embodiment of the present disclosure.
As shown in fig. 11, the apparatus 1100 for predicting black production attack tendency according to the embodiment may include a black production intelligence extraction module 1 and a black production intelligence analysis module 2.
The black product information extraction module 1 is used for cleaning and extracting the information data after acquiring the original black product information to acquire black product information characteristics and information elements.
And the black product information analysis module 2 is used for carrying out big data distribution and rule matching on the cleaned black product information data and outputting a checking result.
Fig. 12 is a block diagram schematically showing the configuration of the black production intelligence extraction module 1 in the apparatus shown in fig. 11.
As shown in fig. 12, the black production intelligence extraction module 1 may include a black production intelligence crawling unit 101 and a black production intelligence cleaning unit 102. The main function of the black product information crawling unit 101 is to collect raw black product information. The black product information cleaning unit 102 is mainly used for preprocessing and cleaning original black product information, removing non-key information in the black product information, obtaining cleaned data, and transmitting the data to the black product information analysis module 2 for analysis.
Black product intelligence crawling unit 101: its main purpose is to obtain original black product intelligence. The information sources generally include information published by black products on social media platforms such as forums, QQ groups, WeChat groups, and Telegram, and mainly include, but are not limited to, upstream black product attack tools, trading of sensitive information, trading of money mule accounts, money laundering, and traffic diversion and spam flooding. Black product intelligence is acquired from these channels through crawlers, bots, API interfaces, and similar methods, and tools such as QQ group bots, Telegram bots, and forum crawlers collect the intelligence for threat analysis.
Black product intelligence cleaning unit 102: it is mainly responsible for cleaning the collected original black product intelligence data and removing interfering and invalid information so that threat information in the data can be mined more effectively. In most cases, the obtained original black product intelligence contains a large amount of content such as pictures, icons, symbols, and XML structures; interfering data such as symbols and pictures are filtered out by means such as regular-expression matching, key identifier replacement, OCR text recognition, and XML tag extraction, so that valid natural-language text is obtained. The unit first determines the data type of the black product intelligence by file extension, including but not limited to picture, text, and tag languages such as XML. If the data type is a picture, OCR text recognition is performed to extract the text in the picture; if the data type is text, the data is read in and special symbols and icons are removed; and if the data type is a tag language such as XML (Extensible Markup Language), the tags are stripped and the text information is retained.
Fig. 13 is a block diagram schematically showing the configuration of the black production intelligence analysis module 2 in the apparatus shown in fig. 11.
As shown in fig. 13, the blackout intelligence analysis module 2 may include an intelligence feature structuring unit 201, an intelligence information matching unit 202, an algorithm analysis unit 203, and a risk output unit 204.
The intelligence feature structuring unit 201 is used for receiving the data extracted by the black product intelligence cleaning unit 102 and structuring it to obtain the black product intelligence to be analyzed. For example, an XML data structure received by the intelligence feature structuring unit 201 cannot be processed directly by an algorithm and needs to be converted into structured data. According to the requirements of upstream black product intelligence analysis, the XML semi-structured data is converted into structured data whose structured features include message sending time, sending account names (such as the first-level sending account name, and the second-level sending account name when a message is forwarded), detailed message content, links contained in the message, and so on, for use in subsequent modeling.
Intelligence information matching unit 202: the information processing system is responsible for receiving and processing the black product information to be analyzed output after the information feature structuring unit 201 performs structuring processing, and performing matching screening on the black product information to be analyzed according to keywords preset by prior experience (for example, a white list formed by setting first keywords). The black product intelligence categories such as specific attack specific terms, target enterprises of the black product attack, channel names of the black product attack, types of black product plans and the like can be screened through the setting of the first keyword. This can be used to quickly identify risk targets and attack trends from the mass of data. The obtained black production intelligence categories are then passed to the risk output unit 204.
The algorithm analysis unit 203 is responsible for converting the black product intelligence to be analyzed generated by the intelligence feature structuring unit 201 into text vector representations and grouping black product attack messages with high similarity through a clustering algorithm. Specifically, the intelligence segmented into words by the intelligence feature structuring unit 201 is sent to a word2vec model to obtain a vector space representation of each word, and the vector representations of the words in a piece of intelligence are added to obtain the vector representation of that intelligence. If intelligence n1 contains k words after word segmentation, denoted w1, w2, … and wk, and the vector representations of these words are denoted v1, v2, … and vk respectively, then the vector representation of n1 is v1 + v2 + … + vk. Then, the intelligence is divided into different attack trend clusters by a clustering algorithm, such as the affinity propagation clustering algorithm, word frequency statistics are performed on the intelligence in each attack cluster, and the 3 words with the highest frequency in each cluster (namely 3 second keywords) are selected as the attack trend keywords of the cluster and used as its index. The attack trend keywords and the total quantity of intelligence in each attack cluster are transmitted to the risk output unit 204 for analysis.
The risk output unit 204 is responsible for analyzing and integrating the first keywords such as the special attack term, the target enterprise of the black product attack, the channel name of the black product attack, the type of the black product plan and the like obtained by the information matching unit 202, the attack trend keyword (i.e., the second keyword) and the total amount of the information obtained by the algorithm analysis unit 203, outputting the high-frequency black product words, and judging the attack trend.
In the integration process, the similarity sim between the first keyword and the three attack tendency keywords in each cluster may be first calculated (for example, as shown in formula (1)). If the similarity sim is larger than 0.9, the two categories are classified into the same category, and the information in the categories is merged and integrated to update the total information. And then sorting according to the intelligence quantity of each category, and taking the black production intelligence category with the quantity of top 10 for output and display.
Fig. 14 schematically shows a flow chart for predicting a black product attack trend using the apparatus shown in fig. 11.
As shown in fig. 14, the process of predicting the black production attack tendency by using the apparatus 1100 may be roughly divided into the following steps S01 to S04.
Step S01: collect the original intelligence data published on black product trading platforms by using a bot or a web crawler.
Step S02: preprocess the original intelligence, remove irregular characters, emoticons, and pictures, and extract the specified text content.
Step S03: perform rule matching and natural language analysis on the cleaned intelligence data, carry out word frequency and part-of-speech analysis, and screen out the highest-frequency attack topics.
Step S04: output and analyze the high-frequency words and the keyword matching results, and judge the attack trend.
According to the embodiment of the present disclosure, any plurality of the acquisition module 910, the structural processing module 920, the category division module 930, the ranking module 940, the output module 950, the blackout intelligence extraction module 1, and the blackout intelligence analysis module 2 may be combined into one module to be implemented, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to the embodiment of the present disclosure, at least one of the obtaining module 910, the structuring processing module 920, the category dividing module 930, the sorting module 940, the outputting module 950, the blackout intelligence extracting module 1, and the blackout intelligence analyzing module 2 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementation manners of software, hardware, and firmware, or by a suitable combination of any of them. Alternatively, at least one of the acquisition module 910, the structuring processing module 920, the category dividing module 930, the ranking module 940, the output module 950, the blackout intelligence extraction module 1, and the blackout intelligence analysis module 2 may be at least partially implemented as a computer program module that, when executed, may perform a corresponding function.
FIG. 15 schematically illustrates a block diagram of an electronic device 1500 suitable for implementing a method for predicting black product attack trends in accordance with an embodiment of the present disclosure.
As shown in FIG. 15, an electronic device 1500 according to an embodiment of the present disclosure includes a processor 1501 which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1502 or a program loaded from a storage section 1508 into a Random Access Memory (RAM) 1503. The processor 1501 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset(s), and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), and/or the like. The processor 1501 may also include on-board memory for caching purposes. The processor 1501 may include a single processing unit or multiple processing units for performing different acts of a method flow in accordance with embodiments of the present disclosure.
In the RAM 1503, various programs and data necessary for the operation of the electronic apparatus 1500 are stored. The processor 1501, the ROM 1502, and the RAM 1503 are connected to each other by a bus 1504. The processor 1501 executes various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1502 and/or RAM 1503. Note that the programs may also be stored in one or more memories other than the ROM 1502 and RAM 1503. The processor 1501 may also execute various operations of the method flows according to the embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 1500 may also include an input/output (I/O) interface 1505, which is also connected to the bus 1504. The electronic device 1500 may also include one or more of the following components connected to the I/O interface 1505: an input section 1506 including a keyboard, a mouse, and the like; an output section 1507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 1508 including a hard disk and the like; and a communication section 1509 including a network interface card such as a LAN card, a modem, or the like. The communication section 1509 performs communication processing via a network such as the Internet. A drive 1510 is also connected to the I/O interface 1505 as needed. A removable medium 1511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1510 as necessary, so that a computer program read from it is installed into the storage section 1508 as necessary.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1502 and/or RAM 1503 described above and/or one or more memories other than the ROM 1502 and RAM 1503.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flowchart. When the computer program product runs in a computer system, the program code causes the computer system to implement the method for predicting black product attack trends provided by the embodiments of the present disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 1501. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, downloaded and installed through the communication section 1509, and/or installed from the removable medium 1511. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1509, and/or installed from the removable medium 1511. The computer program, when executed by the processor 1501, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In accordance with embodiments of the present disclosure, program code for executing the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or associated in various ways, even if such combinations or associations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or associated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (14)

1. A method for predicting black product attack trends, comprising:
acquiring information published on at least one black product trading platform to obtain M pieces of original black product intelligence, wherein M is an integer greater than 1;
processing the original black product intelligence into structured black product intelligence to be analyzed, wherein the M pieces of original black product intelligence correspondingly yield M pieces of black product intelligence to be analyzed;
classifying the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category; and
ranking the at least one black product intelligence category according to the quantity of black product intelligence to be analyzed contained in each black product intelligence category, so as to predict the black product attack trend according to the ranking result.
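By way of illustration only, and without limiting the claim, the classification and ranking steps of claim 1 might be sketched as follows in Python. The sketch assumes the predetermined rule is keyword matching in the manner of claim 10; the keyword list, the record layout, and the function names are hypothetical.

```python
# Hypothetical preset first keywords, one per category.
FIRST_KEYWORDS = ["phishing", "card", "sms"]


def classify_by_keywords(records, keywords):
    """Assign each record to the category of every keyword its text contains."""
    categories = {kw: [] for kw in keywords}
    for rec in records:
        text = rec["text"].lower()
        for kw in keywords:
            if kw in text:
                categories[kw].append(rec)
    return categories


def rank_categories(categories):
    """Sort non-empty categories by descending size, then by name."""
    sizes = [(name, len(members)) for name, members in categories.items() if members]
    return sorted(sizes, key=lambda item: (-item[1], item[0]))


# Structured intelligence to be analyzed (illustrative data only).
records = [
    {"source": "forum-a", "text": "Phishing kit for mobile banking"},
    {"source": "forum-a", "text": "Stolen card data, fresh batch"},
    {"source": "chat-b", "text": "Phishing page templates"},
    {"source": "chat-b", "text": "SMS bombing service"},
    {"source": "chat-b", "text": "New phishing domains list"},
]
ranking = rank_categories(classify_by_keywords(records, FIRST_KEYWORDS))
```

With this sample data, `ranking` is `[("phishing", 3), ("card", 1), ("sms", 1)]`, so the category at the head of the list would be read as the most active attack direction.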
2. The method of claim 1, wherein the classifying the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category comprises:
clustering the M pieces of black product intelligence to be analyzed to obtain S second categories, wherein S is an integer greater than or equal to 1; and
obtaining the at least one black product intelligence category based on the S second categories.
3. The method of claim 2, wherein the obtaining the at least one black product intelligence category based on the S second categories comprises:
classifying the M pieces of black product intelligence to be analyzed based on keywords set from prior experience to obtain N first categories, wherein N is an integer greater than 1;
merging a first category and a second category whose similarity satisfies a threshold condition into a third category; and
obtaining the at least one black product intelligence category based on the third category obtained by merging and on the remaining first categories and second categories that cannot be merged.
4. The method of claim 3, wherein the merging a first category and a second category whose similarity satisfies a threshold condition into a third category comprises:
acquiring the first keyword that is set for the first category;
acquiring at least one second keyword corresponding to the second category, wherein the at least one second keyword is the at least one word with the highest occurrence frequency in the second category;
characterizing the similarity between the first category and the second category by the similarity between the first keyword and the at least one second keyword; and
when the similarity satisfies the threshold condition, merging the first category and the second category to obtain the third category.
5. The method of claim 4, wherein before the merging a first category and a second category whose similarity satisfies a threshold condition into a third category, the method further comprises:
performing word frequency statistics on the intelligence set formed by all the black product intelligence to be analyzed in each second category to obtain a word frequency statistical result; and
selecting the at least one word with the highest word frequency in the word frequency statistical result to obtain the at least one second keyword.
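By way of illustration only, the keyword extraction of claim 5 and the merging of claims 3 and 4 might be sketched as follows. The claims do not fix a similarity measure or a threshold; this sketch substitutes the string similarity ratio from Python's standard `difflib` module and a threshold of 0.8, both of which are assumptions, as are the function names and sample data.

```python
from collections import Counter
from difflib import SequenceMatcher


def top_keywords(cluster_texts, k=2):
    """Second keywords: the k most frequent words across a cluster's texts."""
    freq = Counter(word for text in cluster_texts for word in text.lower().split())
    return [word for word, _ in freq.most_common(k)]


def similarity(first_keyword, second_keywords):
    """Best string similarity between the first keyword and any second keyword.

    SequenceMatcher is only a stand-in; the measure is an assumption.
    """
    return max(
        (SequenceMatcher(None, first_keyword, kw).ratio() for kw in second_keywords),
        default=0.0,
    )


def merge_categories(first, second, threshold=0.8):
    """first: {first keyword: texts}; second: {cluster id: texts}.

    Returns (third, remaining_first, remaining_second).
    """
    third = {}
    remaining_first = dict(first)
    remaining_second = dict(second)
    for cluster_id, texts in second.items():
        keywords = top_keywords(texts)
        for first_kw in list(remaining_first):
            if similarity(first_kw, keywords) >= threshold:
                # Merge the two categories, dropping duplicate texts.
                merged = list(dict.fromkeys(remaining_first.pop(first_kw) + texts))
                third[(first_kw, cluster_id)] = merged
                del remaining_second[cluster_id]
                break
    return third, remaining_first, remaining_second


first = {
    "phishing": ["phishing kit sale", "phishing page templates"],
    "card": ["stolen card data"],
}
second = {
    0: ["phishing kit sale", "phishing page templates", "bank phishing sms"],
    1: ["proxy pool rental", "proxy ip pool"],
}
third, remaining_first, remaining_second = merge_categories(first, second)
```

In this example the first category keyed by "phishing" and cluster 0 merge into a third category, while the "card" category and cluster 1 remain unmerged and would each count as a category in their own right.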
6. The method of claim 5, wherein after the ranking the at least one black product intelligence category according to the quantity of black product intelligence to be analyzed contained in each black product intelligence category, the method further comprises:
outputting information on the group of black product intelligence categories that rank highest by quantity of black product intelligence to be analyzed among the at least one black product intelligence category.
7. The method of claim 6, wherein the outputting information on the group of black product intelligence categories that rank highest by quantity of black product intelligence to be analyzed among the at least one black product intelligence category comprises:
when an output black product intelligence category is a first category, outputting the first keyword corresponding to that first category;
when an output black product intelligence category is a second category, outputting the at least one second keyword corresponding to that second category; and
when an output black product intelligence category is a third category, outputting the first keyword corresponding to the first category and the at least one second keyword corresponding to the second category from which the third category was merged.
8. The method according to any one of claims 2 to 7, wherein the clustering the M pieces of black product intelligence to be analyzed to obtain S second categories comprises:
converting each word in each piece of black product intelligence to be analyzed into a word vector;
accumulating the word vectors corresponding to all the words in each piece of black product intelligence to be analyzed to obtain a text vector corresponding to that piece of black product intelligence to be analyzed; and
clustering the M pieces of black product intelligence to be analyzed based on the text vectors corresponding to the M pieces of black product intelligence to be analyzed.
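By way of illustration only, the steps of claim 8 might be sketched as follows. The claim does not name a word embedding model or a clustering algorithm; this sketch assumes a toy two-dimensional lookup table in place of trained word vectors and a plain k-means loop as the clustering step. All names and values are hypothetical.

```python
# Toy 2-dimensional word vectors; a real system would load trained embeddings.
WORD_VECTORS = {
    "phishing": (1.0, 0.0),
    "page": (0.8, 0.1),
    "kit": (0.9, 0.2),
    "card": (0.0, 1.0),
    "stolen": (0.1, 0.9),
    "dump": (0.2, 0.8),
}


def text_vector(text, dim=2):
    """Sum the vectors of all known words in the text; unknown words are skipped."""
    total = [0.0] * dim
    for word in text.lower().split():
        vec = WORD_VECTORS.get(word)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    return total


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain k-means; centroids start at the first k vectors for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


texts = ["phishing page kit", "stolen card dump", "phishing kit", "card dump"]
vectors = [text_vector(t) for t in texts]
labels = kmeans(vectors, k=2)
```

The sketch sums word vectors because the claim speaks of accumulating them; since a sum grows with text length, an implementation might also average or normalize the text vectors, which is a design choice the claim does not address.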
9. The method of claim 1, wherein the classifying the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category comprises:
classifying the M pieces of black product intelligence to be analyzed into N first categories based on keywords set from prior experience,
wherein each of the first categories serves as one of the black product intelligence categories.
10. The method according to claim 3 or 9, wherein the classifying the M pieces of black product intelligence to be analyzed based on keywords set from prior experience comprises:
acquiring N preset first keywords, wherein N is an integer greater than 1; and
assigning the black product intelligence to be analyzed that matches each first keyword to the category corresponding to that first keyword.
11. The method of claim 1, wherein the processing the original black product intelligence into structured black product intelligence to be analyzed comprises:
identifying the data type of the original black product intelligence;
acquiring, by a processing mode corresponding to the data type, the text information described in natural language in the original black product intelligence; and
structuring the text information to obtain the black product intelligence to be analyzed.
12. An apparatus for predicting black product attack trends, comprising:
an acquisition module, configured to acquire information published on at least one black product trading platform to obtain M pieces of original black product intelligence, wherein M is an integer greater than 1;
a structural processing module, configured to process the original black product intelligence into structured black product intelligence to be analyzed, wherein the M pieces of original black product intelligence correspondingly yield M pieces of black product intelligence to be analyzed;
a category division module, configured to classify the M pieces of black product intelligence to be analyzed based on a predetermined rule to obtain at least one black product intelligence category; and
a ranking module, configured to rank the at least one black product intelligence category according to the quantity of black product intelligence to be analyzed contained in each black product intelligence category, so as to predict the black product attack trend according to the ranking result.
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-11.
14. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 11.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110658165.8A CN113377956A (en) 2021-06-11 2021-06-11 Method, device, electronic equipment and medium for predicting black product attack trend

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110658165.8A CN113377956A (en) 2021-06-11 2021-06-11 Method, device, electronic equipment and medium for predicting black product attack trend

Publications (1)

Publication Number Publication Date
CN113377956A true CN113377956A (en) 2021-09-10

Family

ID=77574157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110658165.8A Pending CN113377956A (en) 2021-06-11 2021-06-11 Method, device, electronic equipment and medium for predicting black product attack trend

Country Status (1)

Country Link
CN (1) CN113377956A (en)

Similar Documents

Publication Publication Date Title
CN108874777B (en) Text anti-spam method and device
US20200151392A1 (en) System and method automated analysis of legal documents within and across specific fields
US9817810B2 (en) SVO-based taxonomy-driven text analytics
CN111343161B (en) Abnormal information processing node analysis method, abnormal information processing node analysis device, abnormal information processing node analysis medium and electronic equipment
CN110929145B (en) Public opinion analysis method, public opinion analysis device, computer device and storage medium
CN110795568A (en) Risk assessment method and device based on user information knowledge graph and electronic equipment
US20220200959A1 (en) Data collection system for effectively processing big data
CN112422574A (en) Risk account identification method, device, medium and electronic equipment
CN114244611B (en) Abnormal attack detection method, device, equipment and storage medium
CN108509561B (en) Post recruitment data screening method and system based on machine learning and storage medium
CN113450075A (en) Work order processing method and device based on natural language technology
CN113657773B (en) Method and device for voice operation quality inspection, electronic equipment and storage medium
US20190171774A1 (en) Data filtering based on historical data analysis
CN116821903A (en) Detection rule determination and malicious binary file detection method, device and medium
CN113888760B (en) Method, device, equipment and medium for monitoring violation information based on software application
CN113377956A (en) Method, device, electronic equipment and medium for predicting black product attack trend
CN115080744A (en) Data processing method and device
CN111429110B (en) Store standardized auditing method, store standardized auditing device, store standardized auditing equipment and store medium
US20220179908A1 (en) Information security device and method thereof
CN114493853A (en) Credit rating evaluation method, credit rating evaluation device, electronic device and storage medium
Chen et al. Retrieving potential cybersecurity information from hacker forums
Canelón et al. Unstructured data for cybersecurity and internal control
CN113037555A (en) Risk event marking method, risk event marking device and electronic equipment
CN113869904A (en) Suspicious data identification method, device, electronic equipment, medium and computer program
CN116775889B (en) Threat information automatic extraction method, system, equipment and storage medium based on natural language processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination