CN116455623A - Computer information security sharing system and method based on big data identification technology - Google Patents

Info

Publication number
CN116455623A
Authority
CN
China
Prior art keywords
data
information
user
webpage
web page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310357987.1A
Other languages
Chinese (zh)
Inventor
王舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Zhenshan Technology Co ltd
Original Assignee
Harbin Zhenshan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Zhenshan Technology Co ltd
Priority to CN202310357987.1A
Publication of CN116455623A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1433 Saving, restoring, recovering or retrying at system level during software upgrading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment; monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6263 Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/1396 Protocols specially adapted for monitoring users' activity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a computer information security sharing system based on big data identification technology and relates to the field of computer information security. The system comprises a data acquisition module, a behavior recognition module, a data replacement module and a data cleaning module. The data acquisition module acquires webpage data and judges webpage security; the behavior recognition module recognizes the user's behavior pattern; the data replacement module performs data replacement; and the data cleaning module checks and cleans data downloaded and cached from the webpage. By comparing the correlation of the sensitive words in a webpage, the invention provides a weighted judging method that more accurately determines whether a paragraph containing several sensitive words conveys a complete meaning. An eye tracker captures the user's gaze focus to determine whether the user has read valid information containing sensitive words. A replacement uploading mechanism for operation data is also provided, so that the webpage's capture of user data is obscured and the security of the computer is protected.

Description

Computer information security sharing system and method based on big data identification technology
Technical Field
The invention relates to the field of computer information security, in particular to a computer information security sharing system and method based on big data identification technology.
Background
Big data refers to massive data resources, characterized by volume, high growth rate and diversity, that require new processing modes in order to provide stronger decision-making, insight discovery and process optimization capabilities. It has four characteristics: large data scale, fast data flow, varied data types and low value density.
Information sharing is the exchange and sharing of information and information products among different levels and departments, carried out on the basis of information standardization and normalization and by means of information-system and transmission technologies. Its aim is to share information resources with others through the Internet so as to optimize resource allocation, save social cost, improve the utilization of information resources and create more wealth. Information sharing improves the utilization of information resources and avoids repetition in information acquisition, storage and management.
Computer information security is the technical and administrative protection established for a data processing system. It protects the computer's software, hardware and data from being damaged, altered or leaked, so that the system can run continuously and normally and information services are not interrupted. Information security includes operating system security, database security, network security, encryption, authentication and the like, and ensures that information is not illegally read, modified or leaked.
Because most computer users are not information security professionals, computer security is challenged whenever a user acquires information or browses webpages: data may be damaged by careless operations, and the user's personal behavior habits and data may be stolen unnoticed. A system and a method are therefore needed to protect the user's information security while the user browses webpages.
Disclosure of Invention
The invention aims to provide a computer information security sharing system and method based on big data identification technology that solve the problems described in the background. User information and computer security are protected by establishing a list of untrusted websites, monitoring the user's reading and operation information on untrusted websites, replacing the user's operation data, and processing the data related to untrusted websites.
To solve the above problems, the invention provides a computer information security sharing system based on big data identification technology, the information security sharing system comprising a data acquisition module, a behavior recognition module, a data replacement module and a data cleaning module;
the data acquisition module is used for acquiring webpage data and judging webpage security; the behavior recognition module is used for recognizing the user's behavior pattern when browsing webpage data; the data replacement module is used for replacing data in the background, and for replacing and updating data information while the user browses webpage information; the data cleaning module is used for checking and cleaning data downloaded and cached from the webpage;
By collecting webpage-related data, the data acquisition module judges the validity of the webpage data and the security of the webpage and gives the user early warning, which reduces the cases in which the user is deceived by false network information and raises the user's security awareness. The behavior recognition module identifies the user's operation habits on an untrusted webpage, determines how long the user's attention stays on the current webpage and whether the information at the position of attention is valid, identifies whether the user has acquired information from the untrusted webpage, and carries out the related operations. The data replacement module replaces, in the background and in no particular order, the operations the user performed where valid information was acquired, drives the input device to upload the operation information, and replaces or scrambles the data left on the webpage by the user's valid operations, so that the current webpage cannot form a valid analysis from the user's operation data and use it for pushing content;
Further, the data acquisition module comprises an address capturing unit, a webpage list unit and a search and check unit. The address capturing unit captures the domain name of the webpage and obtains its IP address through Domain Name System (DNS) resolution, the DNS looking up the real IP address on the network according to the domain name. The webpage list unit stores the webpage lists in a database: the domain names and IP addresses of trusted webpages are placed in a webpage whitelist, and those of untrusted webpages are placed in a webpage blacklist, which is stored by category. The search and check unit compares the domain name and IP address of the webpage against the lists in the database. If the IP address or the domain name is found in the webpage blacklist, the webpage is marked as untrusted and the related information is added to the blacklist; for example, if a match is found in the blacklist by IP address but the domain name is not yet in the blacklist, the domain name is also added. If the IP address and the domain name are found in the webpage whitelist, the current webpage is marked as trusted and normal access is allowed;
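The lookup performed by the search and check unit can be sketched roughly as follows. This is a minimal illustration only: the `WebList` class, the `check_page` function and the "unknown" result for pages on neither list are assumptions made for the sketch and do not come from the patent, which stores the lists in a database.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class WebList:
    """Hypothetical in-memory stand-in for the database-backed webpage lists."""
    white_domains: Set[str] = field(default_factory=set)
    white_ips: Set[str] = field(default_factory=set)
    black_domains: Set[str] = field(default_factory=set)
    black_ips: Set[str] = field(default_factory=set)


def check_page(domain: str, resolve: Callable[[str], str], lists: WebList) -> str:
    """Return 'untrusted', 'trusted' or 'unknown' for a webpage."""
    ip = resolve(domain)  # DNS lookup, injected so the sketch can run offline
    if domain in lists.black_domains or ip in lists.black_ips:
        # A hit on either key marks the page untrusted; the key that was
        # missing (e.g. a new domain on a known bad IP) is recorded as well.
        lists.black_domains.add(domain)
        lists.black_ips.add(ip)
        return "untrusted"
    if domain in lists.white_domains and ip in lists.white_ips:
        return "trusted"
    return "unknown"  # assumption: the description does not cover this case
```

In real use the resolver could be the standard library's `socket.gethostbyname`; passing it in as a parameter keeps the list logic testable without network access.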
Further, the behavior recognition module comprises a screen monitoring unit, an input device monitoring unit, a behavior capturing unit and a behavior recognition unit. The screen monitoring unit monitors the display content of the screen, captures the displayed information, judges whether it contains sensitive words, downloads and compares the complete sentences in which the sensitive words occur, and judges the validity of the information those sentences express. The input device monitoring unit captures data on the operations the user performs on the current webpage, including operation type, operation time and operation area. The behavior capturing unit uses an eye tracker to capture the position on the screen where the user's gaze is focused and records the dwell time of each gaze. The behavior recognition unit combines the captured user operation data and judges the correlation among a series of user operations and the purpose of those operations. It groups operations performed at the same time node along the time axis, which makes the meaning of the operation data more apparent and objective; it also classifies the data by operation type, monitors the continuity of operations and detects scroll-wheel operations of abnormal frequency. For example, a user scrolling the page repeatedly with the mouse wheel indicates that the user is reading the current webpage closely;
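One way to picture the classification by operation type and the detection of frequent scroll-wheel use is the sketch below. The `Operation` record, the 10-second window and the threshold of five scroll events are illustrative assumptions; the patent does not specify how "abnormal frequency" is measured.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Operation:
    kind: str     # operation type, e.g. "scroll", "click", "key"
    time: float   # seconds on the time axis
    region: str   # operation area on the page


def group_by_type(ops: List[Operation]) -> Dict[str, List[Operation]]:
    """Sort operations along the time axis, then classify them by type."""
    grouped: Dict[str, List[Operation]] = defaultdict(list)
    for op in sorted(ops, key=lambda o: o.time):
        grouped[op.kind].append(op)
    return dict(grouped)


def is_close_reading(ops: List[Operation], window: float = 10.0, min_scrolls: int = 5) -> bool:
    """True if any `window`-second span holds at least `min_scrolls` scroll events."""
    times = [op.time for op in group_by_type(ops).get("scroll", [])]
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window` seconds.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 >= min_scrolls:
            return True
    return False
```

The sliding window keeps the check linear in the number of scroll events, which matters little at this scale but avoids rescanning the log for every event.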
Further, the data replacement module comprises a data capturing unit, a data analysis unit, a replacement algorithm unit and a data uploading unit. The data capturing unit receives the user operation data and the webpage-related data, sets retrieval links for the user operation data according to the classification method, and sets sensitive-word retrieval for the webpage-related data so that the data can be retrieved quickly. The data analysis unit analyzes the correlation between the operation data and the webpage information on the basis of the captured data. The replacement algorithm unit replaces, in the background, the data of the user's operations on webpages containing sensitive information, so that the recorded operations are scrambled, the user's operation habits and click preferences show no regularity, and the user's personal information is hidden. The data uploading unit uploads the operation data designed by the background program, simulated so as to appear as the user's own operation data;
Further, the data cleaning module comprises a data checking unit, a data classifying unit and a data processing unit. The data checking unit checks the browsing data, download data and cache data of the untrusted webpage, checks the related data stored on the local computer, sets up an isolated storage area and places the data related to the untrusted webpage in it, so that any Trojan horse program present cannot affect other normal programs or the system. The data classifying unit classifies the webpage-related data: downloaded files are moved into the computer's normal storage space after passing a security check, while configuration information is placed in the isolated storage space and then screened, and configuration files that target user information are retrieved and marked. The data processing unit deletes dangerous files and data by formatting, carries out routine security checks on the data in the isolated storage area, and records the addresses of files and data in the normal storage space of the computer; when such files or data are copied, moved or otherwise handled, a permission request is submitted to the administrator and a security check is carried out;
A computer information security sharing method based on big data identification technology comprises the following steps:
S1, judging whether a webpage is trusted, capturing all sensitive words on the webpage, and judging the validity of the information containing the sensitive words according to the positions and correlation of the sensitive words in the information;
S2, judging whether the user pays attention to valid information containing sensitive words according to the user's gaze dwell time and the user's operation record on the page;
S3, according to the interface with valid information containing sensitive words that the user focused on and the related operations, designing unordered operation data to be uploaded synchronously, so as to obscure the operation data;
S4, processing the data related to the untrusted webpage to ensure the security of the computer system's storage space: analyzing the data related to the untrusted webpage and isolating or deleting dangerous data;
Further, in step S1, whether the webpage is trusted is judged according to its domain name and IP address, and a warning is given for links to untrusted webpages. If the user still chooses to open the untrusted webpage, all information on the webpage is captured, sensitive words are identified by comparison with the database, the words and sentences in which the sensitive words occur are marked, the sensitive words are associated with one another, and the validity of the information is judged. A sensitive-word lexicon is loaded in advance according to the content of the network security protocol, and individual sensitive words are obtained by word comparison. The semantics of the sensitive words are then judged in order to determine whether the website was produced by random scraping and splicing; such a website is judged to have no meaningful content, to contain no valid information and to pose no danger at the information level. Words in natural language are vectorized using a distributed representation, which turns the symbols into numbers; many existing models and techniques can vectorize words and are not described here. The correlation between sensitive words and their positions in sentences is judged as follows. The sensitive words in a paragraph are extracted and recorded, the extraction of one paragraph continuing until a whitespace symbol is found. The mutual information of each pair of adjacent sensitive words is calculated in sequence, and the degrees of association are then compared. The passage containing two consecutive associated words is taken and extended forwards or backwards by one sentence containing a sensitive word, and the mutual information is calculated for the resulting passage containing three sensitive words. The two calculated values are each given a weight; if the weighted sum is higher than a threshold, the paragraph as a whole expresses valid information and is judged valid. The formula for calculating the mutual information of two sensitive words is as follows:
P and Q are the vectorized representations of the two sensitive words, p(P, Q) is the joint probability density function of the two vectors, and p(P) and p(Q) are the marginal probability densities. If the two vectors are unrelated and independent of each other, the mutual information is 0; the closer the relation between the vectors, the larger the mutual information;
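The formula itself is missing from this text. Given the symbols defined above (a joint density and two marginal densities), the standard definition of mutual information would read as follows; this is a reconstruction from those definitions, not a copy of the patent's own formula:

```latex
I(P,Q) = \iint p(P,Q)\,\log\frac{p(P,Q)}{p(P)\,p(Q)}\,\mathrm{d}P\,\mathrm{d}Q
```

When P and Q are independent, p(P,Q) = p(P)p(Q), the logarithm vanishes and I(P,Q) = 0, which agrees with the behavior described above.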
the mutual information of the three sensitive words is calculated to judge whether a plurality of sentences containing three continuous sensitive words are meaningful or not, and the calculation formula is as follows:
I(P,Q,R)=I(P,Q)+I(R,P|Q)
P and Q are regarded as a whole, and the mutual information between them and R is then obtained. Because the three consecutive sensitive words are selected at random and have no particular significance, there is a higher chance of selecting a meaningless passage; the mutual information of two sensitive words and that of three sensitive words are therefore each given a weight in order to judge whether the sentences containing the sensitive words carry valid information;
M = α*I(P,Q) + β*I(P,Q,R)
where α and β are the weights assigned to the two mutual-information values; β usually takes the larger value and can be fitted to actual conditions. If M is greater than the threshold, the paragraphs containing the sensitive words are judged to convey valid information and to express a complete meaning; the threshold is usually obtained by model training and can be looked up;
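The weighted judgment can be illustrated with a small sketch. It assumes a discrete joint probability table in place of the densities over word vectors that the text describes, and the function names, weights and threshold are hypothetical values chosen only for illustration.

```python
import math
from typing import Sequence


def mutual_information(joint: Sequence[Sequence[float]]) -> float:
    """I(P,Q) in nats for a discrete joint probability table joint[i][j]."""
    row = [sum(r) for r in joint]          # marginal of the first variable
    col = [sum(c) for c in zip(*joint)]    # marginal of the second variable
    total = 0.0
    for i, r in enumerate(joint):
        for j, pij in enumerate(r):
            if pij > 0:  # zero-probability cells contribute nothing
                total += pij * math.log(pij / (row[i] * col[j]))
    return total


def paragraph_is_valid(i_two: float, i_three: float,
                       alpha: float, beta: float, threshold: float) -> bool:
    """Apply M = alpha*I(P,Q) + beta*I(P,Q,R) and compare M with the threshold."""
    m = alpha * i_two + beta * i_three
    return m > threshold
```

For an independent table such as `[[0.25, 0.25], [0.25, 0.25]]` the function returns 0, and for a fully dependent table such as `[[0.5, 0.0], [0.0, 0.5]]` it returns log 2, matching the description that tighter association gives larger mutual information.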
Further, in step S2, whether the user has paid attention to valid information containing sensitive words is judged from the area where the user's gaze dwells and the dwell time. If the user has read the information, the operations performed after that point in time are recorded, and it is determined which operations could leak information and which expose the user's operation habits. The eye tracker tracks where the user's gaze is concentrated, and the webpage information at that location is marked. If it is valid information containing sensitive words, the user's gaze time is recorded, a typical reading time is estimated from the amount of information in the passage, and the two are compared. If the gaze time is longer than the typical reading time, the user is taken to have read the valid information containing sensitive words, and the user's operations in the following period, including mouse movement, clicks and keyboard input, are recorded;
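The comparison between gaze time and typical reading time might look like the sketch below. The reading rate of 15 characters per second and the use of character count as the "amount of information" are assumptions for illustration; the patent does not state how the typical reading time is estimated.

```python
def expected_reading_seconds(text: str, chars_per_second: float = 15.0) -> float:
    """Estimate a typical reading time from the length of the passage."""
    return len(text) / chars_per_second


def user_read_passage(text: str, gaze_seconds: float,
                      chars_per_second: float = 15.0) -> bool:
    """True if the gaze dwelt on the passage longer than the typical reading time."""
    return gaze_seconds > expected_reading_seconds(text, chars_per_second)
```

A 150-character passage gives an expected reading time of 10 seconds at the assumed rate, so a 12-second gaze counts as reading and a 5-second gaze does not.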
Further, in step S3, dangerous operations on the interface showing valid information containing sensitive words are processed, and the operation data are replaced in no particular order by a designed algorithm so that they carry no meaning. The operations performed after the user reads valid information containing sensitive words are recorded, and the operation types are denoted A_j and B_j. A_j is the data class storing mouse operations: it stores the number of mouse operations, their time intervals and their order on the time axis, with the operation interval denoted T_A. B_j is the data class storing keyboard operations: it stores the time intervals, operation durations and order on the time axis of the user's keyboard operations, with the operation interval denoted T_B. Data of type A_j are put into set A and data of type B_j into set B; the data in each set are sorted by their order on the time axis, the data in set B are inserted into the data in set A, and data types are replaced to generate a new set. The insertion method is as follows:
S301, a random function is added to each B_j in data set B so that its data are distorted:
B_k = B_j + random()
B_k is inserted into set A in time-axis order; according to the operation interval T_B, operation data with short intervals are inserted between operation data with long intervals. Because the B_k data are already distorted, the time axis in B_k no longer represents real timestamps;
S302, part of the data is selected from the A data and the B data and its data type is swapped: A-class data are replaced with B-class data and B-class data with A-class data. To select the data to be replaced, a random array is generated, and the items whose positions in the sorted set match the values in the random array are replaced;
S303, the data are sorted again along the time axis to generate set C. The operation data in set C are now completely scrambled and are uploaded while the user is still on the current page, so as to disrupt the webpage's capture and analysis of the user's operation data;
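Steps S301 to S303 can be sketched as below. This is a simplified reading rather than the patent's algorithm: the `Op` record, the uniform jitter used for `random()`, and the `swap_fraction` parameter are assumptions, and the interval-based placement rule using T_B in S301 is reduced to a plain merge followed by a sort.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Op:
    kind: str     # "A" = mouse operation, "B" = keyboard operation
    time: float   # position on the time axis, in seconds


def scramble(a_ops: List[Op], b_ops: List[Op], jitter: float = 2.0,
             swap_fraction: float = 0.5,
             rng: Optional[random.Random] = None) -> List[Op]:
    """Return set C: distorted, type-swapped and re-sorted operation data."""
    if rng is None:
        rng = random.Random()
    # S301: distort each B_j (B_k = B_j + random()) and merge it into set A.
    distorted = [replace(op, time=op.time + rng.uniform(0.0, jitter)) for op in b_ops]
    merged = list(a_ops) + distorted
    # S302: choose random positions and swap the data class there (A <-> B).
    count = int(len(merged) * swap_fraction)
    for idx in rng.sample(range(len(merged)), count):
        op = merged[idx]
        merged[idx] = replace(op, kind="B" if op.kind == "A" else "A")
    # S303: sort again along the time axis to obtain set C.
    return sorted(merged, key=lambda op: op.time)
```

Passing a seeded `random.Random` makes a run reproducible for testing, while the default unseeded generator gives different scrambled data on every upload.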
Further, in step S4, the data from the user's browsing of the untrusted webpage are processed, and the data related to the untrusted webpage are security-checked and handled by category. After the scrambling operation, the locally stored scrambling data are deleted and the upload record is stored in a local configuration file, so that the website list can be updated later and the data can be recovered from the website. After the user downloads a file from the current untrusted website, the file is checked with antivirus software. An isolated storage area is set up, and files downloaded from the untrusted website and cached configuration files are kept there until the antivirus software confirms that they contain no danger, after which they are moved to the ordinary storage area. The configuration information of the untrusted website stored in the browser configuration file is uploaded to the cloud server for checking, which enriches the data and improves the accuracy of website security detection. After the user closes the untrusted website, its cached data are deleted and the downloaded files are marked so that they can be found quickly on the computer.
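The isolated storage area described in step S4 could be handled along the following lines. The directory layout and the `is_clean` callback, which stands in for the antivirus check, are hypothetical; the patent does not name a specific scanner or interface.

```python
import shutil
from pathlib import Path
from typing import Callable


def quarantine(file: Path, quarantine_dir: Path) -> Path:
    """Move a file downloaded from an untrusted site into the isolated area."""
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(file), str(quarantine_dir / file.name)))


def release_if_clean(file: Path, normal_dir: Path,
                     is_clean: Callable[[Path], bool]) -> Path:
    """Move a quarantined file to normal storage only after the scan passes."""
    if not is_clean(file):
        return file  # stays in the isolated area
    normal_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(file), str(normal_dir / file.name)))
```

Keeping the scanner behind a callback means the same flow works whichever antivirus tool actually performs the check.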
Compared with the prior art, the invention has the following beneficial effects:
The invention judges whether a webpage was randomly generated by software by comparing the correlation of the sensitive words in the webpage; through the weighted judging method it more accurately determines whether paragraphs containing several sensitive words convey a complete, specific meaning; and by capturing the user's gaze focus with an eye tracker it determines whether the user has read valid information containing sensitive words. The replacement uploading mechanism for operation data obscures the untrusted webpage's capture of user data and protects the user's personal privacy, and processing the data related to the untrusted webpage protects the security of the computer.
Drawings
The accompanying drawings provide a further understanding of the invention and constitute a part of this specification; together with the embodiments of the invention, they serve to explain the invention. In the drawings:
FIG. 1 is a schematic diagram of the modular composition of a computer information security sharing system based on big data identification technology;
FIG. 2 is a flow chart of the steps of a method for secure sharing of computer information based on big data identification technology.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1: a computer information security sharing system based on big data recognition technology, as shown in fig. 1, the information security sharing system comprises: the system comprises a data acquisition module, a behavior recognition module, a data replacement module and a data cleaning module;
the data acquisition module is used for acquiring webpage data information and judging webpage safety; the behavior recognition module is used for recognizing a behavior mode when a user browses webpage data, and the data replacement module is used for replacing data in the background and replacing and updating data information when the user browses webpage information; the data cleaning module is used for checking and cleaning the data downloaded from the webpage cache;
By collecting the data information related to the web page, the validity of the web page data and the security of the web page are judged and an early warning is given to the user, which reduces the cases in which the user is deceived by false network information and raises the user's security awareness. The behavior recognition module identifies the user's operating habits on the untrusted web page, determines how long the user's attention stays on the current web page and whether the information at the position of attention is valid, identifies whether the user has obtained information from the untrusted web page, and performs the related operations. The data replacement module replaces, out of order and in the background, the operations the user performed at the positions where valid information was obtained, drives the input devices to upload the operation information, and replaces or scrambles the information left on the web page by the user's genuine operations, so that the current web page cannot analyze the user's operation data effectively or push content based on it.
The data acquisition module comprises an address capturing unit, a web page list unit and a search and check unit. The address capturing unit captures the domain name of the web page and obtains its IP address through DNS (Domain Name System) resolution, which looks up the real network IP address for the domain name. The web page list unit looks up the captured domain name and IP address; the web page lists are stored in a database that holds the domain names and IP addresses of trusted and untrusted web pages, those of trusted web pages being placed in a web page whitelist and those of untrusted web pages being placed in a web page blacklist, which is stored by category. The search and check unit compares the domain name and IP address of the looked-up web page against the list information in the database. If the IP address or the domain name is found in the web page blacklist, the web page is marked as untrusted and the related information is added to the blacklist; in particular, if a blacklist entry is matched by IP address but the domain name is not yet in the blacklist, the domain name is added as well. If both the IP address and the domain name are found in the web page whitelist, the current web page is marked as trusted and can be accessed normally.
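As a non-limiting illustration, the lookup performed by the search and check unit might be sketched in Python as follows. The class name `WebPageLists`, its fields and the `"unknown"` result are assumptions made for this sketch only; the lists are held in memory rather than in a database, and the IP address is assumed to have been resolved already.

```python
from dataclasses import dataclass, field


@dataclass
class WebPageLists:
    """Hypothetical in-memory stand-in for the database-backed web page lists."""
    white_domains: set = field(default_factory=set)
    white_ips: set = field(default_factory=set)
    black_domains: set = field(default_factory=set)
    black_ips: set = field(default_factory=set)

    def check(self, domain: str, ip: str) -> str:
        """Classify a (domain, ip) pair as 'untrusted', 'trusted' or 'unknown'."""
        # A blacklist hit on either the domain or the IP marks the page as
        # untrusted; the other half of the pair is then added to the blacklist.
        if domain in self.black_domains or ip in self.black_ips:
            self.black_domains.add(domain)
            self.black_ips.add(ip)
            return "untrusted"
        # A trusted result requires both the domain and the IP in the whitelist.
        if domain in self.white_domains and ip in self.white_ips:
            return "trusted"
        # Assumption: pages found in neither list are reported as unknown.
        return "unknown"
```

In this sketch a page whose IP address matches a blacklist entry causes its previously unseen domain name to be blacklisted as well, mirroring the rule described above.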
The behavior recognition module comprises a screen monitoring unit, an input device monitoring unit, a behavior capturing unit and a behavior recognition unit. The screen monitoring unit monitors the display content of the screen, captures the information displayed, judges whether it contains sensitive words, downloads and compares the complete sentences in which the sensitive words appear, and judges the validity of the information those sentences express. The input device monitoring unit captures data on the operations the user performs on the current web page, including the operation type, operation time and operation area. The behavior capturing unit uses an eye tracker to capture the position on the screen where the user's gaze is focused and records the dwell time of each gaze. The behavior recognition unit combines the captured user operation data, judges how a series of user operations are related, and infers the purpose of the operations. Operations performed at the same time node are grouped along the time axis, which makes the meaning of the operation data clearer and more objective; the data is also classified by operation type, the continuity of the operations is monitored, and operations of abnormal frequency, such as scroll-wheel operations, are detected. For example, the user scrolling the page with the mouse wheel indicates that the user is reading the current web page closely.
The data replacement module comprises a data capturing unit, a data analysis unit, a replacement algorithm unit and a data uploading unit. The data capturing unit receives the user operation data and the web page related data, sets retrieval links for the user operation data according to the classification method, and sets sensitive-word retrieval for the web page related data so that the data can be retrieved quickly. The data analysis unit analyzes the correlation between the operation data and the web page information on the basis of the captured user data and web page related data. The replacement algorithm unit replaces, in the background, the data of the operations the user performs on a web page containing sensitive information, so that the user's operations appear disordered, no regularity can be found in the user's operating habits and click preferences, and the user's personal information is hidden. The data uploading unit uploads the operation data designed by the background program, disguised as the user's own operation data.
The data cleaning module comprises a data checking unit, a data classifying unit and a data processing unit. The data checking unit checks the browsing data, download data and cache data of the untrusted web page as well as the related data stored on the local computer, sets up an isolated storage area, and places the related data of the untrusted web page in it, guarding against Trojan horse programs and preventing other normal programs and the system from being affected. The data classifying unit classifies the web page related data: downloaded files are moved into the normal storage space of the computer after passing a security check, while configuration information is placed in the isolated storage space and then screened, and configuration files that target user information are retrieved and marked. The data processing unit deletes dangerous files and data by formatting, carries out routine security checks on the data in the isolated storage area, and records the addresses of the files and data placed in the normal storage space of the computer; when such files and data are copied, moved or otherwise handled, a permission request is submitted to the administrator and a security check is performed.
Example 2: a computer information security sharing method based on big data identification technology comprises the following steps. In step S1, whether the web page is trusted is judged from its domain name and IP address, and a warning is given for links to untrusted web pages. If the user still chooses to open the untrusted web page, all information on the web page is captured, sensitive words are matched and captured against the database information, the words and sentences containing them are marked, the sensitive words are associated with one another, and the validity of the information is judged. A sensitive-word lexicon is loaded in advance according to the content of the network security protocol; words are compared against it to obtain the individual sensitive words, and the semantics of the sensitive words are judged to determine whether the website was assembled by random scraping and splicing. The content of a website assembled in this way is judged to be meaningless: the website contains no valid information and poses no danger at the information level. Words in natural language are vectorized using a distributed representation, which turns the symbols into numbers; many existing models and techniques can vectorize words and are not described here. The relevance between the sensitive words and their positions in the sentences is judged as follows. The sensitive words in a paragraph are extracted and recorded during extraction, the extraction of one paragraph continuing until a blank (space) character is found. The mutual information of each pair of adjacent sensitive words is calculated in turn, and the degrees of association are then compared. Paragraphs containing two consecutive associated words are intercepted, one further sentence containing a sensitive word is intercepted before or after, and the mutual information is calculated for the resulting paragraph containing three sensitive words. The two calculation results are each given a weight, and if the weighted sum is higher than a threshold, the paragraph as a whole expresses valid information and the paragraph information is judged to be valid. The mutual information of two sensitive words is calculated as follows:
I(P,Q) = ∬ p(P,Q) * log[ p(P,Q) / ( p(P) * p(Q) ) ] dP dQ
where P and Q are the vectorized representations of the two sensitive words, p(P,Q) is the joint probability density function of the two vectors, and p(P) and p(Q) are the marginal probability densities. If the two vectors are unrelated and independent of each other, the mutual information is 0; conversely, the closer the relationship between the vectors, the larger the mutual information.
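As a non-limiting illustration of the two-word calculation, the following Python sketch computes mutual information under the standard definition. It assumes, for simplicity, that the probability densities have been approximated by a discrete joint probability table; the function name `mutual_information` and the table layout are assumptions of this sketch, not part of the described method.

```python
import math


def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint table joint[i][j].

    Assumption: the table entries are non-negative and sum to 1.
    """
    p_marginal = [sum(row) for row in joint]        # marginal of the first variable
    q_marginal = [sum(col) for col in zip(*joint)]  # marginal of the second variable
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_ij in enumerate(row):
            if p_ij > 0:  # zero-probability cells contribute nothing
                total += p_ij * math.log(p_ij / (p_marginal[i] * q_marginal[j]))
    return total
```

For an independent table such as `[[0.25, 0.25], [0.25, 0.25]]` the result is 0, and for a perfectly dependent table such as `[[0.5, 0.0], [0.0, 0.5]]` it is log 2, consistent with the statement that closer relationships give larger mutual information.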
The mutual information of three sensitive words is calculated to judge whether the sentences containing three consecutive sensitive words are meaningful, and the calculation formula is as follows:
I(P,Q,R)=I(P,Q)+I(R,P|Q)
P and Q are regarded as a whole, and the mutual information between this whole and R is then calculated. Because the three consecutive sensitive words are selected at random and without any particular criterion, there is a high chance of selecting a meaningless paragraph; the mutual information of two sensitive words and that of three sensitive words are therefore each given a weight in order to judge whether the sentences containing the sensitive words carry valid information:
M=α*I(P,Q)+β*I(P,Q,R)
where α and β are the weights given to the two mutual information values; the value of β is usually the larger and can be fitted to actual conditions. If the value of M is greater than the threshold, the paragraphs containing the sensitive words are judged to convey valid information and to express a complete meaning; the threshold is usually obtained by model training and can be looked up.
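As a non-limiting illustration, the weighted judgment can be sketched in Python as follows. The default weights (0.4 and 0.6) and the function names are hypothetical values chosen for this sketch; in practice the weights would be fitted and the threshold obtained from a trained model, and the two mutual information values are taken here as already computed inputs.

```python
def paragraph_score(i_pq, i_pqr, alpha=0.4, beta=0.6):
    """Weighted score M from the two-word and three-word mutual information.

    alpha and beta are hypothetical defaults; beta is chosen larger than alpha.
    """
    return alpha * i_pq + beta * i_pqr


def conveys_valid_information(i_pq, i_pqr, threshold, alpha=0.4, beta=0.6):
    """True when the score M exceeds the (externally supplied) threshold."""
    return paragraph_score(i_pq, i_pqr, alpha, beta) > threshold
```

A paragraph would then be treated as conveying valid information only when `conveys_valid_information` returns `True` for its two mutual information values.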
In step S2, whether the user has paid attention to and read the valid information containing sensitive words is judged from the area in which the user's gaze dwells and the dwell time. If the user has read the related information, the operations performed after the time point at which the user read it are recorded, and it is judged which operations may leak information and which expose the user's operating habits. The eye tracker tracks where the user's gaze is concentrated, and the web page information at that location is marked. If that information is valid information containing sensitive words, the user's gazing time is recorded, a typical reading time is estimated from the amount of information in the valid passage, and the two are compared. If the gazing time is longer than the typical reading time, the user is taken to have read the valid information containing sensitive words, and the user's operations in the subsequent time period are recorded, including mouse movement and click operations and keyboard input operations.
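As a non-limiting illustration, the comparison between gazing time and typical reading time might be sketched in Python as follows. The reading speed of 15 characters per second and the use of character count as the measure of information amount are assumptions made for this sketch only.

```python
def expected_reading_seconds(text, chars_per_second=15.0):
    """Estimate a typical reading time from the length of the passage.

    chars_per_second is a hypothetical reading speed, not a value from the source.
    """
    return len(text) / chars_per_second


def user_read_passage(text, gaze_seconds, chars_per_second=15.0):
    """True when the gaze dwell time exceeds the estimated reading time."""
    return gaze_seconds > expected_reading_seconds(text, chars_per_second)
```

When `user_read_passage` returns `True`, the operations that follow would be recorded for the processing described in step S3.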
In step S3, risky operations that the user performs on the interface showing valid information containing sensitive words are processed, and the operation data is replaced out of order by a designed algorithm so that it carries no meaning. The operations performed after the user reads the valid information containing sensitive words are recorded, and the operation types are set as A_j and B_j. A_j is the data class that stores mouse operations: it stores the number of mouse operations, their time intervals and their order on the time axis, and an operation interval T_A is set. B_j is the data class that stores keyboard operations: it stores the time intervals, operation durations and order on the time axis of the keyboard operations performed by the user, and the operation interval is set as T_B. Data of type A_j is put into set A and data of type B_j into set B; the data in each set is sorted by its order on the time axis, the data in set B is inserted into the data in set A, and the data types are replaced, generating a set C. The insertion method is as follows:
S301, a random function is added to each B_j in the B data set so that the data of B_j is distorted,
B_k = B_j + random()
B_k is inserted into set A in time-axis order; according to the operation interval T_B, operation data with short intervals is inserted between operation data with long intervals. Because the B_k data is already distorted, the time axis in B_k no longer represents the real time stamps;
S302, part of the data is selected from the A data and the B data and its data type is replaced, the A data class being replaced with the B data class and the B data class with the A data class; the data to be replaced is selected by generating a random array and replacing the entries whose position in the ordering of the set equals a value in the random array;
S303, the data is sorted again along the time axis to generate set C, in which the operation data is now completely scrambled; the operation data in set C is uploaded while the user still stays on the current page, so as to disrupt the web page's capture and analysis of the user's operation data.
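As a non-limiting illustration, steps S301 to S303 might be sketched in Python as follows. The record layout `Op`, the `jitter` range and the `swap_fraction` parameter are assumptions of this sketch, and the interval-based insertion rule is simplified to a merge followed by a final sort along the time axis.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Op:
    kind: str        # "A" = mouse operation, "B" = keyboard operation
    t: float         # position on the time axis, in seconds
    interval: float  # time since the previous operation of the same kind


def scramble(ops_a, ops_b, rng, jitter=3.0, swap_fraction=0.5):
    """Return a scrambled set C built from mouse records A and keyboard records B."""
    # S301: distort every B record with random noise, so that its time stamp
    # no longer reflects the real one (values are clamped at zero).
    distorted_b = [
        replace(
            op,
            t=max(0.0, op.t + rng.uniform(-jitter, jitter)),
            interval=max(0.0, op.interval + rng.uniform(-jitter, jitter)),
        )
        for op in ops_b
    ]
    merged = list(ops_a) + distorted_b

    # S302: choose random positions and swap the type label at each of them.
    swap_count = int(len(merged) * swap_fraction)
    for index in rng.sample(range(len(merged)), swap_count):
        op = merged[index]
        merged[index] = replace(op, kind="B" if op.kind == "A" else "A")

    # S303: sort again along the time axis to produce set C.
    return sorted(merged, key=lambda op: op.t)
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable for testing, whereas an unseeded generator would be the natural choice if the scrambled data is meant to be unpredictable.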
In step S4, the data generated while the user browses the untrusted web page is processed, and the related data of the untrusted web page is security-checked and classified. After the scrambling operation has been carried out, the locally stored scrambling data is deleted and the upload record is saved to a local configuration file, so that the website list can be updated later and the data can be recovered from the website. After the user downloads a file from the current untrusted website, the downloaded file is checked with antivirus software: an isolated storage area is set up, and the files downloaded from the untrusted website, together with the cached configuration files, are placed in it until the antivirus software confirms that the file data contains no danger, after which the file data is moved to the ordinary storage area. The configuration information of the untrusted website is stored in the browser configuration file and uploaded to the cloud server for checking, which enriches the available data and improves the accuracy of website security detection. After the user finishes browsing the untrusted website and closes it, the related cache data of the untrusted website is deleted and the downloaded files are marked so that they can be found quickly on the computer.
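As a non-limiting illustration, the isolate-then-release handling of downloaded files might be sketched in Python as follows. The function names and the `is_clean` callback, which stands in for whatever antivirus check is actually used, are assumptions of this sketch.

```python
import shutil
from pathlib import Path


def quarantine(file_path, quarantine_dir):
    """Move a downloaded file into the isolated storage area and return its new path."""
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    target = quarantine_dir / Path(file_path).name
    shutil.move(str(file_path), str(target))
    return target


def release_if_clean(quarantined_path, normal_dir, is_clean):
    """Move a file to ordinary storage only if the scanner callback approves it.

    is_clean is a hypothetical callable taking a path and returning True or False.
    Returns the new path, or None if the file stays in isolation.
    """
    quarantined_path = Path(quarantined_path)
    if not is_clean(quarantined_path):
        return None
    normal_dir = Path(normal_dir)
    normal_dir.mkdir(parents=True, exist_ok=True)
    target = normal_dir / quarantined_path.name
    shutil.move(str(quarantined_path), str(target))
    return target
```

Keeping the scanner behind a callback means the file stays in the isolated area until an explicit positive result is returned, which matches the order of operations described above.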
Example 3:
In step S3, let B_1 = [B,3,3,5]; B_2 = [B,3,5,7]; B_3 = [B,4,6,9].
A random function is added to the data in the B data set {B_1, B_2, B_3}:
B_k = B_j + random()
After the random function is added:
B_1 = [B,5,5,8]
B_2 = [B,6,3,9]
B_3 = [B,4,2,5]
B_k is inserted into set A in time-axis order; according to the operation interval T_B, operation data with short intervals is inserted between operation data with long intervals, so B_1 is inserted between B_2 and B_3, and in set C the B_1 data lies between B_2 and B_3;
part of the B data is selected and its data type is replaced, B_1 being replaced with A_1; the data is then sorted again along the time axis to generate set C, in which the data is now out of order.
Finally, it should be noted that the foregoing describes only a preferred embodiment of the present invention and does not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. The computer information security sharing system based on big data identification technology is characterized in that: the information security sharing system includes: the system comprises a data acquisition module, a behavior recognition module, a data replacement module and a data cleaning module;
The data acquisition module is used for acquiring webpage data information and judging webpage safety; the behavior recognition module is used for recognizing a behavior mode when a user browses webpage data, and the data replacement module is used for replacing data in the background and replacing and updating data information when the user browses webpage information; the data cleaning module is used for checking and cleaning the data downloaded from the webpage cache.
2. The secure sharing system of computer information based on big data recognition technology of claim 1, wherein: the data acquisition module comprises an address capturing unit, a web page list unit and a search and check unit; the address capturing unit captures the domain name of the web page and obtains its IP address through DNS (Domain Name System) resolution, which looks up the real network IP address for the domain name; the web page list unit looks up the captured domain name and IP address, the web page lists being stored in a database that holds the domain names and IP addresses of trusted and untrusted web pages, those of trusted web pages being placed in a web page whitelist and those of untrusted web pages being placed in a web page blacklist that is stored by category; the search and check unit compares the domain name and IP address of the looked-up web page against the list information in the database; if the IP address or the domain name is found in the web page blacklist, the web page is marked as untrusted and the related information is added to the blacklist, and if a blacklist entry is matched by IP address but the domain name is not yet in the blacklist, the domain name is added as well; if both the IP address and the domain name are found in the web page whitelist, the current web page is marked as trusted.
3. The secure sharing system of computer information based on big data recognition technology of claim 1, wherein: the behavior recognition module comprises a screen monitoring unit, an input device monitoring unit, a behavior capturing unit and a behavior recognition unit; the screen monitoring unit monitors the display content of the screen, captures the information displayed, judges whether it contains sensitive words, downloads and compares the complete sentences in which the sensitive words appear, and judges the validity of the information those sentences express; the input device monitoring unit captures data on the operations the user performs on the current web page, including the operation type, operation time and operation area; the behavior capturing unit uses an eye tracker to capture the position on the screen where the user's gaze is focused and records the dwell time of each gaze; the behavior recognition unit combines the captured user operation data, judges how a series of user operations are related, infers the purpose of the operations, groups the operations performed at the same time node along the time axis, classifies the data by operation type, and monitors the continuity of the operations.
4. The secure sharing system of computer information based on big data recognition technology of claim 1, wherein: the data replacement module comprises a data capturing unit, a data analysis unit, a replacement algorithm unit and a data uploading unit; the data capturing unit receives the user operation data and the web page related data, sets retrieval links for the user operation data according to the classification method, and sets sensitive-word retrieval for the web page related data so that the data can be retrieved quickly; the data analysis unit analyzes the correlation between the operation data and the web page information on the basis of the captured user data and web page related data; the replacement algorithm unit replaces, in the background, the data of the operations the user performs on a web page containing sensitive information; and the data uploading unit uploads the operation data designed by the background program, disguised as the user's own operation data.
5. The secure sharing system of computer information based on big data recognition technology of claim 1, wherein: the data cleaning module comprises a data checking unit, a data classifying unit and a data processing unit; the data checking unit checks the browsing data, download data and cache data of the untrusted web page as well as the related data stored on the local computer, sets up an isolated storage area, and places the related data of the untrusted web page in the isolated storage area; the data classifying unit classifies the web page related data, moves downloaded files into the normal storage space of the computer after they pass a security check, places configuration information in the isolated storage space and then screens it, and retrieves and marks configuration files that target user information; the data processing unit deletes dangerous files and data by formatting, carries out routine security checks on the data in the isolated storage area, and records the addresses of the files and data placed in the normal storage space of the computer, and when such files and data are copied, moved or otherwise handled, a permission request is submitted to the administrator and a security check is performed.
6. A computer information security sharing method based on big data identification technology, comprising the following steps:
S1, judging whether a web page is trusted, capturing all sensitive words of the web page, and judging the validity of the information containing the sensitive words according to the positions and correlation of the sensitive words in the information;
S2, judging whether the user has paid attention to valid information containing sensitive words according to the user's gaze dwell time and the user's operation record on the page;
S3, according to the interface showing valid information containing sensitive words that the user has paid attention to and the related operations, designing unordered operation data to be uploaded synchronously so as to obscure the operation data;
S4, processing the related data of the untrusted web page to guarantee the security of the storage space of the computer system, analyzing the related data of the untrusted web page, and isolating or deleting dangerous data.
7. The method for securely sharing computer information based on big data recognition technology according to claim 6, wherein: in step S1, whether the web page is trusted is judged from its domain name and IP address, and a warning is given for links to untrusted web pages; if the user still chooses to open the untrusted web page, all information on the web page is captured, sensitive words are matched and captured against the database information, the words and sentences containing them are marked, the sensitive words are associated with one another, and the validity of the information is judged; a sensitive-word lexicon is loaded in advance, words are compared against it to obtain the individual sensitive words, the semantics of the sensitive words are judged to determine whether the website was assembled by random scraping and splicing, a website that contains no valid information is judged to pose no danger at the information level, and the words are vectorized; the relevance between the sensitive words and their positions in the sentences is judged as follows: the sensitive words in a paragraph are extracted and recorded during extraction, the extraction of one paragraph continuing until a blank (space) character is found; the mutual information of each pair of adjacent sensitive words is calculated in turn, and the degrees of association are then compared; paragraphs containing two consecutive associated words are intercepted, one further sentence containing a sensitive word is intercepted before or after, and the mutual information is calculated for the resulting paragraph containing three sensitive words; the two calculation results are each given a weight, and if the weighted sum is higher than a threshold, the paragraph as a whole expresses valid information and the paragraph information is judged to be valid; the mutual information of two sensitive words is calculated as follows:
I(P,Q) = ∬ p(P,Q) * log[ p(P,Q) / ( p(P) * p(Q) ) ] dP dQ
where P and Q are the vectorized representations of the two sensitive words, p(P,Q) is the joint probability density function of the two vectors, and p(P) and p(Q) are the marginal probability densities; if the two vectors are unrelated and independent of each other, the mutual information is 0; conversely, the closer the relationship between the vectors, the larger the mutual information;
the mutual information of the three sensitive words is calculated to judge whether a plurality of sentences containing three continuous sensitive words are meaningful or not, and the calculation formula is as follows:
I(P,Q,R)=I(P,Q)+I(R,P|Q)
P and Q are regarded as a whole, and the mutual information between this whole and R is then calculated; the mutual information of two sensitive words and that of three sensitive words are each given a weight in order to judge whether the sentences containing the sensitive words carry valid information;
M=α*I(P,Q)+β*I(P,Q,R)
where α and β are the weights given to the two mutual information values; if the value of M is greater than the threshold, the paragraphs containing the sensitive words are judged to convey valid information and to express a complete meaning.
8. The method for securely sharing computer information based on big data recognition technology according to claim 6, wherein: in step S2, whether the user has paid attention to and read the valid information containing sensitive words is judged from the area in which the user's gaze dwells and the dwell time; if the user has read the related information, the operations performed after the time point at which the user read it are recorded, and it is judged which operations may leak information and which expose the user's operating habits; the eye tracker tracks where the user's gaze is concentrated, and the web page information at that location is marked; if that information is valid information containing sensitive words, the user's gazing time is recorded, a typical reading time is estimated from the amount of information in the valid passage, and the two are compared; if the gazing time is longer than the typical reading time, the user is taken to have read the valid information containing sensitive words, and the user's operations in the subsequent time period are recorded.
9. The method for securely sharing computer information based on big data recognition technology according to claim 6, wherein: in step S3, dangerous operations performed by the user on an interface showing effective information containing sensitive vocabulary are processed, and the operation data are replaced out of order by a designed algorithm; the operations performed after the user reads effective information containing sensitive vocabulary are recorded, and the operation types are denoted $A_j$ and $B_j$; $A_j$ is the data class storing mouse operations, which stores the number of mouse operations, their time intervals and their order on the time axis, with the operation interval denoted $T_A$; $B_j$ is the data class storing keyboard operations, which stores the time interval, operation duration and order on the time axis of the keyboard operations performed by the user, with the operation interval denoted $T_B$; data of type $A_j$ are put into set A and data of type $B_j$ into set B, and the data in each set are sorted by their order on the time axis; the data in set B are inserted into the data in set A and data types are replaced, generating a set C, wherein the insertion method is as follows:
S301, a random function is added to each $B_j$ in data set B so that the data of $B_j$ are distorted:
$B_k = B_j + \mathrm{random}()$
$B_k$ is inserted into set A in time-axis order, and, according to the operation interval $T_B$, operation data with short intervals are selected and inserted between operation data with long intervals;
S302, part of the data is selected from the A data and the B data and their data types are swapped, A-class data being relabeled as B-class and B-class data as A-class; the data to be replaced are selected by generating a random array and replacing those items whose position in the set's ordering equals a value in the random array;
S303, the data are sorted again along the time axis to generate set C, in which the operation data are completely scrambled, and the operation data in set C are uploaded while the user still remains on the current page.
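Steps S301 to S303 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented algorithm: the claim does not define `random()`, so uniform noise in `[0, jitter]` is assumed; the rule of placing short-interval data between long-interval data is approximated by a plain merge on the time axis; and the names `Op`, `scramble`, `jitter` and `swap_fraction` are hypothetical.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Op:
    kind: str        # "A" = mouse operation, "B" = keyboard operation
    t: float         # position on the time axis, in seconds
    interval: float  # gap to the previous operation of the same kind


def scramble(ops_a, ops_b, rng, jitter=0.5, swap_fraction=0.3):
    """Build set C from mouse records (set A) and keyboard records (set B)."""
    # S301: distort each keyboard record, B_k = B_j + random(),
    # then insert it into set A in time-axis order.
    distorted = [
        replace(b,
                t=b.t + rng.uniform(0, jitter),
                interval=b.interval + rng.uniform(0, jitter))
        for b in ops_b
    ]
    merged = sorted(list(ops_a) + distorted, key=lambda o: o.t)

    # S302: draw a random array of positions and swap the type label
    # (A <-> B) of every record whose position appears in that array.
    n_swap = int(len(merged) * swap_fraction)
    picks = set(rng.sample(range(len(merged)), n_swap))
    swapped = [
        replace(o, kind="B" if o.kind == "A" else "A") if i in picks else o
        for i, o in enumerate(merged)
    ]

    # S303: sort again along the time axis to obtain set C.
    return sorted(swapped, key=lambda o: o.t)


mouse = [Op("A", 1.0, 1.0), Op("A", 3.0, 2.0), Op("A", 6.0, 3.0)]
keys = [Op("B", 2.0, 0.2), Op("B", 2.3, 0.3)]
set_c = scramble(mouse, keys, random.Random(0))
for op in set_c:
    print(op)
```

Passing an explicit `random.Random` instance keeps the sketch reproducible for testing; a real deployment would presumably use an unseeded or cryptographically secure source so that the scrambling cannot be reversed.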
10. The method for securely sharing computer information based on big data recognition technology according to claim 6, wherein: in step S4, the data produced when the user browses an untrusted webpage are processed, and the data related to the untrusted webpage are security-checked and handled by category; after the scrambling operation is carried out, the locally stored scrambled data are deleted and the upload record is saved to a local configuration file; after the user downloads a file from the current untrusted website, the downloaded file is checked with antivirus software; an isolated storage area is set up, and files downloaded from the untrusted website, together with the cached configuration files, are placed in the isolated storage area and moved to the common storage area only after the antivirus software confirms that the file data contain no danger; the configuration information of the untrusted website is stored in the browser configuration file and uploaded to the cloud server for checking; after the user finishes browsing the untrusted website and clicks to close it, the cache data related to the untrusted website are deleted and the downloaded files are marked.
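The isolated-storage handling of claim 10 can be illustrated with a short Python sketch using only the standard library. The antivirus check is represented by a caller-supplied callable `is_clean`, which is a placeholder and not a real scanner API; the function names and directory layout are likewise assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def quarantine(download: Path, isolated_dir: Path) -> Path:
    """Move a file downloaded from an untrusted site into the isolated area."""
    isolated_dir.mkdir(parents=True, exist_ok=True)
    target = isolated_dir / download.name
    shutil.move(str(download), str(target))
    return target


def release_if_clean(quarantined: Path, common_dir: Path,
                     is_clean: Callable[[Path], bool]) -> Path:
    """Move the file to the common area only once the scanner accepts it.

    `is_clean` stands in for the antivirus check (hypothetical hook).
    Returns the file's location after the call.
    """
    if not is_clean(quarantined):
        return quarantined  # stays isolated until confirmed safe
    common_dir.mkdir(parents=True, exist_ok=True)
    target = common_dir / quarantined.name
    shutil.move(str(quarantined), str(target))
    return target


root = Path(tempfile.mkdtemp())
downloaded = root / "setup.bin"
downloaded.write_bytes(b"example payload")
held = quarantine(downloaded, root / "isolated")
final = release_if_clean(held, root / "common", lambda p: p.stat().st_size > 0)
print(final)
```

Keeping the scanner behind a callable lets the same flow be exercised with a stub in tests and wired to an actual antivirus product in deployment.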
CN202310357987.1A 2023-04-06 2023-04-06 Computer information security sharing system and method based on big data identification technology Pending CN116455623A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310357987.1A CN116455623A (en) 2023-04-06 2023-04-06 Computer information security sharing system and method based on big data identification technology

Publications (1)

Publication Number Publication Date
CN116455623A true CN116455623A (en) 2023-07-18

Family

ID=87126562

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310357987.1A Pending CN116455623A (en) 2023-04-06 2023-04-06 Computer information security sharing system and method based on big data identification technology

Country Status (1)

Country Link
CN (1) CN116455623A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117520627A (en) * 2023-10-18 2024-02-06 广州汉申信息科技有限公司 Project retrieval data processing method and device
CN117520627B (en) * 2023-10-18 2024-04-26 广州汉申信息科技有限公司 Project retrieval data processing method and device

Similar Documents

Publication Publication Date Title
Javed et al. A comprehensive survey on computer forensics: State-of-the-art, tools, techniques, challenges, and future directions
KR100723867B1 (en) Apparatus and method for blocking access to phishing web page
CN107332848B (en) Network flow abnormity real-time monitoring system based on big data
EP2248062B1 (en) Automated forensic document signatures
Ren et al. CSKG4APT: A cybersecurity knowledge graph for advanced persistent threat organization attribution
CN103559235B (en) A kind of online social networks malicious web pages detection recognition methods
US20190347429A1 (en) Method and system for managing electronic documents based on sensitivity of information
Das Guptta et al. Modeling hybrid feature-based phishing websites detection using machine learning techniques
CN104471582A (en) Defense against search engine tracking
CN109347808B (en) Safety analysis method based on user group behavior activity
Yang et al. Scalable detection of promotional website defacements in black hat SEO campaigns
Guo et al. Research and review on computer forensics
Huang et al. Monitoring social media for vulnerability-threat prediction and topic analysis
Rahman et al. A literature review on mining cyberthreat intelligence from unstructured texts
Le Page et al. Domain classifier: Compromised machines versus malicious registrations
CN116455623A (en) Computer information security sharing system and method based on big data identification technology
Lee et al. Toward Semantic Assessment of Vulnerability Severity: A Text Mining Approach.
Mhaske-Dhamdhere et al. A novel approach for phishing emails real time classification using k-means algorithm
Michalas et al. MemTri: A memory forensics triage tool using bayesian network and volatility
KR102318297B1 (en) Crime detection system through fake news decision and web monitoring and Method thereof
Canelón et al. Unstructured data for cybersecurity and internal control
Sun et al. Leveraging machine learning techniques to identify deceptive decoy documents associated with targeted email attacks
Kayarkar et al. Mining frequent sequences for emails in cyber forensics investigation
Gomes de Barros et al. Piracema: a Phishing snapshot database for building dataset features
RU2778460C1 (en) Method and apparatus for clustering phishing web resources based on an image of the visual content

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination