CN110442771B - Deep learning-based method and device for detecting site tampering - Google Patents


Info

Publication number
CN110442771B
CN110442771B (application number CN201910741015.6A; published as CN110442771A)
Authority
CN
China
Prior art keywords: information, text, sensitive, words, picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910741015.6A
Other languages
Chinese (zh)
Other versions
CN110442771A (en)
Inventor
魏向前
张融
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910741015.6A priority Critical patent/CN110442771B/en
Publication of CN110442771A publication Critical patent/CN110442771A/en
Application granted granted Critical
Publication of CN110442771B publication Critical patent/CN110442771B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/12: Applying verification of the received information
    • H04L63/123: Applying verification of the received information to received data contents, e.g. message integrity
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/02: Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Abstract

The application relates to a deep learning-based method and device for detecting site tampering, and belongs to the field of communications. The method comprises the following steps: crawling page content in a site to be detected. When the page content is a picture, the picture is input into a sensitive image detection model, which detects whether the picture includes a sensitive image; the detection result output by the model is obtained, and the site to be detected is determined to have been tampered with when the result indicates that the picture includes a sensitive image. When the page content is text information, the text information is input into a text detection model, which detects whether the text information includes sensitive information; the detection result output by the model is obtained, and the site to be detected is determined to have been tampered with when the result indicates that the text information includes sensitive information. The application can improve the accuracy of detecting whether a site has been tampered with.

Description

Deep learning-based method and device for detecting site tampering
Technical Field
The present application relates to the field of communications, and in particular, to a method and apparatus for detecting site tampering based on deep learning.
Background
Websites often contain a large amount of content, such as web pages, that users can access and browse. However, lawbreakers can currently tamper with web pages in a site, embed sensitive information related to pornography, gambling, violence, politics and the like, and spread that information through the site. For example, referring to the site home page shown in fig. 1, the home page includes links to several pages: hospital news, medical common knowledge, public hospital, online consultation, and life aid development funds. Referring to fig. 2, after the site is tampered with by a lawbreaker, the medical common knowledge link is modified to a Jin Boqi card-game link promoting gambling.
To prevent sensitive information from being embedded in a site and spread, sites can be scanned so that tampered sites are detected and the site administrator is reminded to handle them in time. The current method for detecting a site is as follows: crawl each web page in the site and compute an MD5 (Message-Digest Algorithm 5) value for each page. Compare the MD5 value of each page with the MD5 value obtained for that page historically. When the MD5 value of a page differs from its historical MD5 value, the content of the page is determined to have been tampered with, and the site administrator is reminded.
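For illustration, the MD5 comparison described above can be sketched as follows (a minimal sketch; the page store and history mapping are hypothetical names, not part of the patent):

```python
import hashlib

def page_md5(content: bytes) -> str:
    """MD5 digest of a crawled page's raw bytes (Message-Digest Algorithm 5)."""
    return hashlib.md5(content).hexdigest()

def changed_pages(current: dict, historical_md5: dict) -> list:
    """URLs whose current MD5 differs from the historically obtained MD5."""
    return [url for url, content in current.items()
            if url in historical_md5 and historical_md5[url] != page_md5(content)]
```

Note that a page with no historical MD5 value, such as a newly planted page, never enters the comparison at all, which is exactly the blind spot the background section goes on to describe.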
However, web pages in a site may be dynamic, and a change in the content of a dynamic page is not necessarily due to a lawbreaker tampering in sensitive content, so this method produces false detections. In addition, a lawbreaker can place the sensitive content in a standalone new web page on the site; because there is no historical MD5 value for that page to compare against, the method cannot detect it. The accuracy of the current approach to detecting site tampering is therefore low.
Disclosure of Invention
The embodiment of the application provides a method and a device for detecting a site, which are used to improve the accuracy of detecting whether the site has been tampered with. The technical scheme is as follows:
in one aspect, the present application provides a method for detecting station tampering based on deep learning, the method comprising:
crawling address information of page content and the page content in a site to be detected;
determining the type of the page content according to the suffix of the address information;
when the type is a picture, inputting the picture into a sensitive image detection model, wherein the sensitive image detection model is used for detecting whether the picture comprises a sensitive image, acquiring a detection result output by the sensitive image detection model, and determining that the site to be detected has been tampered with when the detection result is that the picture comprises a sensitive image;
when the type is text information, inputting the text information into a text detection model, wherein the text detection model is used for detecting whether the text information comprises sensitive information, acquiring a detection result output by the text detection model, and determining that the site to be detected has been tampered with when the detection result is that the text information comprises sensitive information.
Optionally, the inputting the page content into the sensitive information detection model, and obtaining the detection result output by the sensitive information detection model includes:
when the page content is a picture, extracting text information in the picture;
and inputting the text information into a text detection model, wherein the text detection model is used for detecting whether the text information comprises sensitive information or not, and acquiring a detection result output by the text detection model.
Optionally, the inputting the text information into a text detection model includes:
when a sensitive word exists in words included in the text information, acquiring x words adjacent to the sensitive word from the text information, wherein x is an integer greater than 1;
acquiring word vectors of each word of x+1 words, wherein the word vectors of the words are semantic representations of the words, and the x+1 words comprise the sensitive words and the x words;
and inputting the word vector of each word into the text detection model according to the order of the words in the text information.
Optionally, the x words include x/2 words located before and adjacent to the sensitive word and x/2 words located after and adjacent to the sensitive word in the text information.
Optionally, the inputting the picture into the sensitive image detection model includes:
and graying the picture to obtain a gray level image, converting the converted size of the gray level image into a preset size, and inputting the converted gray level image into the sensitive image detection model.
In another aspect, the present application provides a deep learning-based device for detecting tampering with a site, the device comprising:
the crawling module is used for crawling the address information of the page content in the site to be detected and the page content;
the determining module is used for determining the type of the page content according to the suffix of the address information;
the first acquisition module is used for inputting the picture into a sensitive image detection model when the type is a picture, the sensitive image detection model being used for detecting whether the picture comprises a sensitive image, acquiring a detection result output by the sensitive image detection model, and determining that the site to be detected has been tampered with when the detection result is that the picture comprises a sensitive image;
the second acquisition module is used for inputting the text information into a text detection model when the type is text information, the text detection model being used for detecting whether the text information comprises sensitive information, acquiring a detection result output by the text detection model, and determining that the site to be detected has been tampered with when the detection result is that the text information comprises sensitive information.
Optionally, the apparatus includes:
the extraction module is used for extracting text information in the picture when the page content is the picture; and inputting the text information into a text detection model, wherein the text detection model is used for detecting whether the text information comprises sensitive information or not, and acquiring a detection result output by the text detection model.
Optionally, the second obtaining module is configured to:
when a sensitive word exists in words included in the text information, acquiring x words adjacent to the sensitive word from the text information, wherein x is an integer greater than 1;
acquiring word vectors of each word of x+1 words, wherein the word vectors of the words are semantic representations of the words, and the x+1 words comprise the sensitive words and the x words;
and inputting the word vector of each word into the text detection model according to the order of the words in the text information.
Optionally, the x words include x/2 words located before and adjacent to the sensitive word and x/2 words located after and adjacent to the sensitive word in the text information.
Optionally, the first obtaining module is configured to:
and graying the picture to obtain a gray level image, converting the converted size of the gray level image into a preset size, and inputting the converted gray level image into the sensitive image detection model.
In another aspect, the present application provides an electronic device comprising at least one processor and at least one memory for storing at least one instruction loaded and executed by the at least one processor to implement the above-described method.
In another aspect, the present application provides a computer readable storage medium storing at least one instruction for loading and execution by a processor to implement the method described above.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
The page content in the site to be detected is crawled; the page content is input into a sensitive information detection model, which is used for detecting, based on the page content, whether the page content contains sensitive information, and the detection result output by the sensitive information detection model is acquired; when the detection result shows that the page content contains sensitive information, the site to be detected is determined to have been tampered with. Because the sensitive information detection model detects based on the page content, sensitive information can be detected based on the meaning expressed by the page content, which improves the accuracy of detecting sensitive information and, in turn, the accuracy of detecting whether a site has been tampered with.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of a prior art page;
FIG. 2 is a schematic diagram of another page of the prior art;
FIG. 3 is a flowchart of a method for detecting site tampering based on deep learning according to an embodiment of the present application;
FIG. 4 is a flow chart of crawling page content provided by an embodiment of the present application;
FIG. 5 is a flowchart of detecting a picture according to an embodiment of the present application;
FIG. 6 is a flow chart of training a first deep learning network provided by an embodiment of the present application;
FIG. 7 is a flow chart of detecting text information provided by an embodiment of the present application;
FIG. 8 is a flow chart of training a second deep learning network provided by an embodiment of the present application;
fig. 9 is a schematic structural diagram of a device for detecting site tampering based on deep learning according to an embodiment of the present application;
fig. 10 is a schematic diagram of a terminal structure according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
The content in a site may be tampered with by lawbreakers, so that the site includes sensitive information related to pornography, gambling, violence and politics.
In order to detect the sensitive information in a tampered site, a sensitive information detection model is trained in the application; this model can detect whether the page content in a site includes sensitive information. The page content in a site mainly comprises two types: pictures and text information. A picture may contain only image content, only text information, or both text information and image content.
Sensitive information includes two types. One type is text information, for example, text information advertising pornography, gambling, violence, or politics. The other type is image content, such as pornographic or gambling images. The trained sensitive information detection model therefore comprises a sensitive image detection model and a text detection model: the sensitive image detection model detects sensitive images in pictures, and the text detection model detects whether text information includes sensitive information.
The sensitive image detection model is an intelligent detection model obtained by training a first deep learning network. The text detection model is an intelligent model obtained by training a second deep learning network. After the sensitive image detection model and the text detection model are obtained through training, any of the following embodiments can be used to detect whether a site has been tampered with by a lawbreaker so that it includes sensitive information.
Referring to fig. 3, an embodiment of the present application provides a method for detecting tampering of a site based on deep learning, including:
step 101: and crawling the address information of the page content of the site to be detected and the page content, and determining the type of the page content according to the suffix determination of the address information.
The site to be detected comprises a site home page, the site home page comprises address information of each page belonging to the site to be detected, and each page can comprise page content belonging to the site to be detected.
For any one of the pages, the page content of that page may include at least one of text information, picture or address information of other pages, etc. When the page content is a picture, the page comprises address information of the picture.
It should be noted that: for address information of a page included in a first page of a site, the page corresponding to the address information of the page may be a page belonging to the site to be detected, or may be a page belonging to other sites. Optionally, address information of other pages may be included in the page, where other pages corresponding to the address information of other pages may be pages belonging to the site to be detected, or may be pages belonging to other sites.
The address information of the site home page of the site to be detected is the domain name information of the site to be detected. The address information of any page belonging to the site to be detected comprises that domain name information, the storage address of the page in the site, and a suffix. For page content belonging to the site to be detected, if the page content is a picture, the address information of the picture includes the domain name information of the site, the storage address of the picture in the site, and a suffix, where the suffix is the picture type of the picture. The picture type may be jpg, jpeg, png, bmp, gif, etc.
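As a sketch, the suffix-based type determination can look like this (the helper name is illustrative; the extension set mirrors the picture types listed above, and Python's urllib is used so that query strings do not confuse the suffix check):

```python
from urllib.parse import urlparse

# Picture types named in the text: jpg, jpeg, png, bmp, gif.
PICTURE_SUFFIXES = {"jpg", "jpeg", "png", "bmp", "gif"}

def content_type_from_address(address: str) -> str:
    """Classify a crawled piece of address information by its suffix."""
    path = urlparse(address).path
    suffix = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    return "picture" if suffix in PICTURE_SUFFIXES else "page"
```

A picture address is then fetched and stored in the picture list, while a page address is crawled further, as described in the following steps.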
In this step, referring to fig. 4, the domain name information of the site to be detected is obtained, and the page content included in the site home page is crawled according to that domain name information. Address information in the home page is crawled, and it is judged whether the domain name information included in the address information is the same as the domain name information of the site to be detected.
If it is different, crawling of other content in the site home page continues. If it is the same, it is judged whether the suffix of the address information is a picture type. If so, the picture corresponding to the address information is crawled, and the correspondence between the address information and the picture is stored in a picture list. If the suffix is not a picture type, the page content in the page corresponding to the address information is crawled.
The page content of a page may include at least one of text information, address information of a picture, address information of other pages, and the like. When text information is crawled in a page, the correspondence between the address information and the text information is stored in a text list. When address information is crawled and its suffix is a picture type, the address information is that of a picture: the picture is crawled based on the address information, and the address information and the picture are stored correspondingly in the picture list. When the suffix of the address information is not a picture type, the address information is that of a page: it is judged whether the domain name information included in the address information is the domain name information of the site to be detected, and if so, the page is crawled based on the address information.
For any piece of address information, after all the content in the page corresponding to that address information has been crawled, the crawler returns to the page where the address information is located and continues crawling that page's content, until all content in the site to be detected has been crawled.
For any page, if address information is included in the page, the address information is located after a preset address tag. Preset address tags include href, src, and the like.
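A minimal sketch of pulling address information from behind the preset href/src tags, and of the same-domain check from the crawl loop above (a regular expression stands in for a full HTML parser here; function names are illustrative):

```python
import re
from urllib.parse import urljoin, urlparse

# Address information sits after the preset address tags href and src.
ADDRESS_PATTERN = re.compile(r'(?:href|src)\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)

def extract_addresses(page_html: str, base_url: str) -> list:
    """Collect absolute addresses found after href/src tags on a page."""
    return [urljoin(base_url, a) for a in ADDRESS_PATTERN.findall(page_html)]

def same_site(address: str, site_domain: str) -> bool:
    """Only addresses whose domain matches the site under test are crawled further."""
    return urlparse(address).netloc == site_domain
```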
Step 102: when the type of the page content is a picture, extract the text information in the picture, and store the text information and the address information of the picture correspondingly in the text list.
A picture may contain only image content, only text information, or both text information and image content.
For a picture containing only image content, no text information can be extracted. For a picture containing text information, or containing both text information and image content, the text information is extracted from the picture.
In this step, a picture and its corresponding address information are read from the picture list, text information may be extracted from the picture using OCR (Optical Character Recognition) technology, and the extracted text information and the address information are stored correspondingly in the text list.
The picture may first be grayed to obtain a gray level image, and the text information may then be extracted from the gray level image using OCR technology.
Step 103: input the crawled picture into a sensitive image detection model, which is used for detecting whether the picture includes a sensitive image, and obtain the detection result output by the sensitive image detection model.
In this step, referring to fig. 5, the first picture stored in the picture list may be read and input into the sensitive image detection model, which outputs the detection result corresponding to the first picture. Then the second picture stored in the picture list is read and input into the model, and the detection result corresponding to the second picture is obtained. This process is repeated until the detection result output by the sensitive image detection model for the last picture in the picture list has been obtained.
Optionally, before the crawled picture is input into the sensitive image detection model, the picture may be grayed to obtain a gray level image of the picture. The size of the gray level image is converted to a preset size, and the converted gray level image is input into the sensitive image detection model. The model then detects whether the input gray level image includes a sensitive image.
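The preprocessing just described (graying, then scaling to the model's preset input size) can be sketched without any imaging library as follows. The BT.601 luma weights and nearest-neighbor sampling are common choices assumed here for illustration; the patent does not mandate a particular graying formula or resampling method:

```python
def to_grayscale(pixels):
    """Convert an RGB pixel grid to a gray level map (ITU-R BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in pixels]

def resize_nearest(gray, out_h, out_w):
    """Scale a gray level map to the preset size by nearest-neighbor sampling."""
    in_h, in_w = len(gray), len(gray[0])
    return [[gray[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]
```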
The sensitive image detection model is obtained by training a first deep learning network in advance. A plurality of image samples and labeling information corresponding to each image sample may be prepared in advance. For any image sample, when the sample is a sensitive image, its labeling information indicates that it is a sensitive image; a sensitive image may be an image related to pornography, gambling, violence, politics, or the like. When the sample is not a sensitive image, its labeling information indicates that it is not.
The labeling information of the image sample may be represented by a value of 1 when the image sample is a sensitive image and by a value of 0 when the image sample is not a sensitive image. Alternatively, when the image sample is a sensitive image, the labeling information of the image sample may be represented by a value of 0, and when the image sample is not a sensitive image, the labeling information of the image sample may be represented by a value of 1.
In this step, the first deep learning network is trained using the plurality of image samples and the annotation information for each image sample. Referring to fig. 6, the training process is as follows:
1031: the plurality of image samples and the labeling information corresponding to each image sample are input to the first deep learning network while the first deep learning network is trained.
1032: and the first deep learning network detects whether each image sample is a sensitive image, and a detection result of each image sample is obtained.
1033: the first deep learning network compares the detection result of each image sample with the labeling information to obtain difference information, and adjusts network parameters according to the difference information.
The difference information obtained by comparing the detection result of each image sample with its labeling information may be a vector whose elements are the comparison results corresponding to the image samples. A comparison result can be represented by the value 0 or 1: the value 0 may indicate that the detection result of an image sample is the same as its labeling information, and the value 1 that it is different. Alternatively, the value 1 may indicate a match and the value 0 a mismatch.
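The comparison in step 1033 can be sketched as follows, using the 0-for-match, 1-for-mismatch convention above. The mean mismatch rate shown as the cost is an illustrative choice, since the text does not fix a particular cost function:

```python
def difference_vector(detections, labels):
    """Elementwise comparison: 0 when a detection matches its labeling information, 1 otherwise."""
    return [0 if d == l else 1 for d, l in zip(detections, labels)]

def cost(diff):
    """A simple cost over the difference vector: the mismatch rate."""
    return sum(diff) / len(diff)
```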
1034: the first deep learning network inputs the difference information into a preset cost function, calculates a cost value, and returns to 1032 when the cost value is not the minimum cost value of the preset cost function.
When the cost value is not the minimum cost value of the preset cost function, the first deep learning network then repeats the above process to detect whether each image sample is a sensitive image.
When the cost value is the minimum cost value of the preset cost function, training of the first deep learning network stops, and the first deep learning network is taken as the sensitive image detection model.
Optionally, the first deep learning network may be a ResNet; before training, the number of convolutional layers of the ResNet may be set to 32, the number of filters to 16, and the size of each filter to 3. The ResNet so configured is then trained to obtain the sensitive image detection model, which has higher detection precision.
Step 104: when the detection result is that the picture includes a sensitive image, obtain the address information of the crawled picture, determine that the site to be detected has been tampered with, take the picture as evidence content, and store the address information of the picture and the evidence content correspondingly in the correspondence between address information and evidence content.
Each picture in the picture list is processed by steps 103 and 104 above, whereby every picture that includes a sensitive image can be found. Because the sensitive image detection model detects based on the image content in a picture, whether the picture includes a sensitive image can be detected accurately, improving the accuracy of detecting whether the site includes sensitive images.
Step 105: when the type of the page content is text information, input the crawled text information into a text detection model, which is used for detecting whether the text information includes sensitive information, and obtain the detection result output by the text detection model.
Referring to fig. 7, in this step, the first text information stored in the text list may be read and input into the text detection model, which outputs the detection result corresponding to the first text information. Then the second text information stored in the text list is read and input into the text detection model, and the detection result corresponding to the second text information is obtained. This process is repeated until the detection result output by the text detection model for the last text information in the text list has been obtained.
Before the crawled text information is input into the text detection model, the text information is segmented into words, and the word vector of each word is obtained from the correspondence between words and word vectors. The word vector of a word is a semantic representation of the word. The word vectors are then input to the text detection model in the order of the words in the text information, and the text detection model detects whether the text information includes sensitive information based on the word vectors.
The correspondence between words and word vectors can be downloaded from a network.
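As a rough illustration of the segmentation and lookup described above, the sketch below segments a text, maps each word to its vector through a downloaded word-vector table, and preserves word order. `segment` and `embeddings` are toy stand-ins, not the patent's implementation; a real system would use a Chinese word segmenter (such as jieba) and a full pretrained embedding file.

```python
def segment(text):
    # Toy whitespace segmenter; a real system would use a proper
    # Chinese word segmenter such as jieba.
    return text.split()

# Hypothetical word -> word-vector table, standing in for a downloaded
# correspondence between words and word vectors.
embeddings = {
    "crack": [0.1, 0.3],
    "down": [0.2, 0.1],
    "on": [0.0, 0.2],
    "drugs": [0.9, 0.8],
}

def text_to_vectors(text):
    """Return word vectors in the order the words appear in the text."""
    return [embeddings[w] for w in segment(text) if w in embeddings]
```

The ordered list returned here is what gets fed to the text detection model, so the model sees the words in their original sequence.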
Sensitive information typically includes sensitive words, which are words related to at least one of pornography, gambling, violence, politics, and the like. The sensitive information is usually one or several sentences of text information that include a sensitive word.
Inputting the word vector of every word in the text information into the text detection model increases the detection workload of the text detection model.
To reduce this workload, when the text information is segmented into words, whether a sensitive word from a sensitive dictionary exists among those words is judged. If a sensitive word exists, x words adjacent to the sensitive word are acquired, where x is an integer greater than 1. The word vector of the sensitive word and the word vector of each of the x words are obtained from the correspondence between words and word vectors, and these word vectors are input into the text detection model in their order in the text information. The text detection model then detects whether the text information includes sensitive information based on the word vector of the sensitive word and the word vectors of the x adjacent words. In this way, the number of words the text detection model must examine is reduced, and detection efficiency is improved.
If a sensitive word exists among the words included in the text information, the x/2 words located before and adjacent to the sensitive word and the x/2 words located after and adjacent to it can be acquired.
If the number of words preceding the sensitive word in the text information is less than x/2, all words preceding the sensitive word are acquired. Likewise, if the number of words following the sensitive word is less than x/2, all words following it are acquired.
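The window extraction described above might be sketched as follows; `context_window` is a hypothetical helper that keeps up to x/2 words on each side of the sensitive word and fewer when the sensitive word sits near either end of the text:

```python
def context_window(words, sensitive, x=4):
    """Return the sensitive word plus up to x adjacent words (x/2 before,
    x/2 after); fewer are kept near the boundaries of the text."""
    i = words.index(sensitive)
    half = x // 2
    before = words[max(0, i - half):i]   # clamped when fewer than x/2 precede
    after = words[i + 1:i + 1 + half]    # slicing clamps past the end
    return before + [sensitive] + after

words = ["we", "must", "crack", "down", "on", "drugs", "together"]
window = context_window(words, "drugs", x=4)
```

Here only one word follows "drugs", so the window ends up shorter than x+1, matching the boundary behaviour described above.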
When the text information includes a sensitive word, the text information may or may not actually be sensitive. For example, "drugs" is a sensitive word, but the text "crack down severely on drugs" includes the sensitive word without being sensitive. By receiving the word vectors of the words adjacent to the sensitive word, the text detection model can judge whether the input text is sensitive based on the context of the sensitive word, which improves detection accuracy.
The text detection model is obtained by training a second deep learning network in advance. A plurality of text samples, each with corresponding labeling information, may be prepared beforehand. For any text sample, when the sample is sensitive information that includes a sensitive word, its labeling information indicates that the sample is sensitive information; such sensitive information may be text related to pornography, gambling, violence, politics, and the like. When the sample includes a sensitive word but is not sensitive, its labeling information indicates that the sample is not sensitive information.
The labeling information of the text sample may be represented by a value of 1 when the text sample is sensitive information, and by a value of 0 when the text sample is not sensitive information. Alternatively, when the text sample is sensitive information, the labeling information of the text sample may be represented by a value of 0, and when the text sample is not sensitive information, the labeling information of the text sample may be represented by a value of 1.
In this step, the second deep learning network is trained using the plurality of text samples and the annotation information for each text sample. Referring to fig. 8, the training process is as follows:
1051: when training the second deep learning network, the plurality of text samples and the labeling information corresponding to each text sample are input into the second deep learning network.
1052: the second deep learning network detects whether each text sample is sensitive information, obtaining a detection result for each text sample.
1053: the second deep learning network compares the detection result of each text sample with its labeling information to obtain difference information, and adjusts its own network parameters according to the difference information.
The difference information obtained by comparing the detection results with the labeling information may be a vector whose elements are the comparison results of the individual text samples. Each comparison result takes the value 0 or 1: for example, 0 may indicate that the detection result of a text sample matches its labeling information and 1 that it differs, or vice versa.
1054: the second deep learning network inputs the difference information into a preset cost function, and the cost value is calculated. Execution returns to 1052 when the cost value is not the minimum cost value of the preset cost function.
And when the cost value is not the minimum cost value of the preset cost function, the second deep learning network repeats the process to detect whether each text sample is sensitive information. And stopping the operation of training the second deep learning network when the cost value is the minimum cost value of the preset cost function, and taking the second deep learning network as a text detection model.
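Steps 1051–1054 can be sketched as a generic training loop. This is a simplified illustration using an error-rate cost function over the 0/1 difference vector, not the patent's actual network; real training would update LSTM parameters by backpropagation rather than via the hypothetical `update` callback shown here.

```python
def train(model_predict, update, samples, labels, max_epochs=100, tol=0.0):
    """Loop over 1052-1054 until the cost reaches its minimum (here: tol)."""
    epoch, cost = 0, None
    for epoch in range(max_epochs):
        # 1052: detect whether each sample is sensitive information
        preds = [model_predict(s) for s in samples]
        # 1053: difference information - 0 if result matches label, 1 if not
        diff = [0 if p == y else 1 for p, y in zip(preds, labels)]
        # 1054: cost value from the preset cost function (error rate here)
        cost = sum(diff) / len(diff)
        if cost <= tol:
            break                 # minimum reached: stop training
        update(diff)              # adjust network parameters from differences
    return epoch, cost
```

With a model that already classifies every sample correctly, the loop stops immediately with a cost of zero.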
Alternatively, the second deep learning network may be an LSTM. Before training the LSTM, the number of words it accepts as input may be set to x+1, and the dimension of the input word vectors may be set, for example, to 100, 200, or 300. That is, the word vector of each word has the set dimension.
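A minimal sketch of fixing the input shape before training, assuming a word-vector dimension of 100 and x = 4; `pad_sequence` is an illustrative helper, not part of the patent, showing how every sample can be brought to exactly x+1 word vectors of the preset dimension:

```python
DIM = 100          # preset word-vector dimension (could also be 200, 300, ...)
X = 4              # adjacent words kept around the sensitive word
SEQ_LEN = X + 1    # number of word vectors the LSTM accepts per sample

def pad_sequence(vectors):
    """Pad with zero vectors (or truncate) to exactly SEQ_LEN x DIM."""
    vectors = vectors[:SEQ_LEN]
    pad = [[0.0] * DIM for _ in range(SEQ_LEN - len(vectors))]
    return vectors + pad

# A sample near a text boundary may yield fewer than SEQ_LEN vectors:
batch = pad_sequence([[1.0] * DIM] * 3)
```

Padding like this lets samples whose sensitive word sits near a text boundary still fill the fixed-size input the LSTM expects.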
Step 106: when the detection result indicates that the text information includes sensitive information, it is determined that the site to be detected has been tampered with. The address information of the page where the text information is located is acquired, the text information is taken as evidence content, and the address information of the page and the evidence content are stored correspondingly in the correspondence between address information and evidence content.
In this step, the sensitive word and the x adjacent words that were input into the text detection model may be taken as the evidence content.
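The correspondence between address information and evidence content might be kept in a simple mapping, sketched below with hypothetical names; the patent does not prescribe a storage structure:

```python
# Illustrative evidence store for step 106: each page address maps to the
# list of evidence contents (e.g. sensitive word plus adjacent words)
# found on that page.
evidence = {}

def record_evidence(address, content):
    """Store address information and evidence content correspondingly."""
    evidence.setdefault(address, []).append(content)
```

The same structure can hold picture evidence from step 104, since both cases store content keyed by the page's address information.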
Each piece of text information in the text list is detected through steps 105 and 106 described above. Because the text detection model performs detection based on the semantics of the text information, whether the text information includes sensitive information can be detected accurately, which improves the accuracy of detecting whether the site includes sensitive information.
In this embodiment, when a picture including a sensitive image or text information including sensitive information is detected, it is determined that the site to be detected has been tampered with. The correspondence between address information and evidence content can be sent to the terminal of the site administrator, so that the administrator can check its contents and handle the site to be detected.
In the embodiment of the application, the page content in the site to be detected and the address information of the page content are crawled. If the suffix of the address information is a picture type, the page content is a picture; otherwise, it is text information. A picture is input into the sensitive image detection model, which detects whether the picture includes a sensitive image based on the image content of the picture, and the detection result output by the model is obtained; when the detection result is that a sensitive image is included, it is determined that the site to be detected has been tampered with. Because the sensitive image detection model performs detection based on image content, the accuracy of detecting sensitive images is improved, and thus the accuracy of detecting whether the site has been tampered with is improved. Text information is input into the text detection model, which detects whether the text information includes sensitive information based on its semantics, and the detection result output by the model is obtained; when the detection result is that sensitive information is included, it is determined that the site to be detected has been tampered with. Because the text detection model performs detection based on the content of the text information, the accuracy of detecting sensitive information is improved, and thus the accuracy of detecting whether the site has been tampered with is improved.
The following are examples of the apparatus of the present application that may be used to perform the method embodiments of the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the method of the present application.
Referring to fig. 9, the present application provides a deep learning-based device 200 for detecting site tampering, the device 200 comprising:
a crawling module 201, configured to crawl address information of page content and the page content in a site to be detected;
a determining module 202, configured to determine a type of the page content according to the suffix of the address information;
the first obtaining module 203, configured to input the picture into a sensitive image detection model when the type is a picture, where the sensitive image detection model is configured to detect whether the picture includes a sensitive image, obtain the detection result output by the sensitive image detection model, and determine that the site to be detected has been tampered with when the detection result is that the picture includes a sensitive image;
and the second obtaining module 204, configured to input the text information into a text detection model when the type is text information, where the text detection model is configured to detect whether the text information includes sensitive information, obtain the detection result output by the text detection model, and determine that the site to be detected has been tampered with when the detection result is that the text information includes sensitive information.
Optionally, the apparatus 200 further includes:
the extraction module is used for extracting text information in the picture when the type is the picture; and inputting the text information into a text detection model, wherein the text detection model is used for detecting whether the text information comprises sensitive information or not, and acquiring a detection result output by the text detection model.
Optionally, the second obtaining module 204 is configured to:
when a sensitive word exists in words included in the text information, acquiring x words adjacent to the sensitive word from the text information, wherein x is an integer greater than 1;
acquiring word vectors of each word of x+1 words, wherein the word vectors of the words are semantic representations of the words, and the x+1 words comprise the sensitive words and the x words;
and inputting the word vector of each word into a text detection model according to the sequence of each word in the text information.
Optionally, the x words include x/2 words located before and adjacent to the sensitive word and x/2 words located after and adjacent to the sensitive word in the text information.
Optionally, the first obtaining module 203 is configured to:
graying the picture to obtain a grayscale image, converting the size of the grayscale image into a preset size, and inputting the converted grayscale image into the sensitive image detection model.
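A pure-Python sketch of this preprocessing, assuming ITU-R BT.601 luma weights for the graying step and nearest-neighbour scaling for the preset size; a real pipeline would typically use OpenCV or Pillow instead of the hypothetical helpers below:

```python
def to_gray(rgb):
    """ITU-R BT.601 luma: 0.299*R + 0.587*G + 0.114*B per pixel."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]

def resize_nearest(img, h, w):
    """Nearest-neighbour resize of a 2-D grayscale image to h x w."""
    H, W = len(img), len(img[0])
    return [[img[i * H // h][j * W // w] for j in range(w)] for i in range(h)]

gray = to_gray([[(255, 255, 255), (0, 0, 0)]])   # a 1x2 RGB image
small = resize_nearest(gray, 2, 2)                # scaled to the preset size
```

After both steps, every picture reaches the model as a single-channel image of one fixed size, regardless of its original dimensions.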
In the embodiment of the application, the page content in the site to be detected is crawled; the page content is input into a sensitive information detection model, which detects whether the page content includes sensitive information based on that content, and the detection result output by the sensitive information detection model is obtained; when the detection result shows that the page content includes sensitive information, it is determined that the site to be detected has been tampered with. Because the sensitive information detection model performs detection based on page content, sensitive information can be detected based on the meaning expressed by the page content, which improves the accuracy of detecting sensitive information and, in turn, the accuracy of detecting whether the site has been tampered with.
The specific manner in which the various modules of the apparatus in the above embodiments perform operations has been described in detail in the method embodiments and will not be repeated here.
Fig. 10 shows a block diagram of a terminal 300 according to an exemplary embodiment of the present application. The terminal 300, which executes the method for detecting a site, may be a smartphone, a tablet computer, a notebook computer, or a desktop computer. The terminal 300 may also be referred to by other names such as user device, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 300 includes: a processor 301 and a memory 302.
The processor 301 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 301 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 301 may also include a main processor and a coprocessor. The main processor, also called a CPU (Central Processing Unit), is a processor for processing data in an awake state; the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 301 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 301 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 302 may include one or more computer-readable storage media, which may be non-transitory. Memory 302 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 302 is used to store at least one instruction for execution by processor 301 to implement the method of detecting a site provided by a method embodiment of the present application.
In some embodiments, the terminal 300 may further optionally include: a peripheral interface 303, and at least one peripheral. The processor 301, memory 302, and peripheral interface 303 may be connected by a bus or signal line. The individual peripheral devices may be connected to the peripheral device interface 303 by buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 304, touch screen 305, camera 306, audio circuitry 307, positioning component 308, and power supply 309.
The peripheral interface 303 may be used to connect at least one Input/Output (I/O) related peripheral to the processor 301 and the memory 302. In some embodiments, processor 301, memory 302, and peripheral interface 303 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 301, the memory 302, and the peripheral interface 303 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The radio frequency circuit 304 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 304 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 304 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 304 may communicate with other terminals via at least one wireless communication protocol, including but not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of various generations (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 304 may also include NFC (Near Field Communication) related circuitry, which is not limited in the application.
The display screen 305 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display screen 305 is a touch screen, it also has the ability to collect touch signals at or above its surface. A touch signal may be input to the processor 301 as a control signal for processing. At this point, the display screen 305 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 305, provided on the front panel of the terminal 300; in other embodiments, there may be at least two display screens 305, respectively disposed on different surfaces of the terminal 300 or in a folded design; in still other embodiments, the display screen 305 may be a flexible display disposed on a curved or folded surface of the terminal 300. The display screen 305 may even be arranged in an irregular, non-rectangular pattern, i.e., a shaped screen. The display screen 305 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 306 is used to capture images or video. Optionally, the camera assembly 306 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and virtual reality (VR) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 306 may also include a flash, which may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation under different color temperatures.
The audio circuit 307 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and environments, converting the sound waves into electric signals, and inputting the electric signals to the processor 301 for processing, or inputting the electric signals to the radio frequency circuit 304 for voice communication. For the purpose of stereo acquisition or noise reduction, a plurality of microphones may be respectively disposed at different portions of the terminal 300. The microphone may also be an array microphone or an omni-directional pickup microphone. The speaker is used to convert electrical signals from the processor 301 or the radio frequency circuit 304 into sound waves. The speaker may be a conventional thin film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to humans, but also the electric signal can be converted into a sound wave inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 307 may also include a headphone jack.
The positioning component 308 is used to locate the current geographic location of the terminal 300 to enable navigation or LBS (Location Based Service). The positioning component 308 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 309 is used to power the various components in the terminal 300. The power supply 309 may be an alternating current supply, a direct current supply, a disposable battery, or a rechargeable battery. When the power supply 309 includes a rechargeable battery, the battery may be a wired rechargeable battery, charged through a wired line, or a wireless rechargeable battery, charged through a wireless coil. The rechargeable battery may also support fast-charge technology.
In some embodiments, the terminal 300 further includes one or more sensors 310. The one or more sensors 310 include, but are not limited to: acceleration sensor 311, gyroscope sensor 312, pressure sensor 313, fingerprint sensor 314, optical sensor 315, and proximity sensor 316.
The acceleration sensor 311 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 300. For example, the acceleration sensor 311 may be used to detect components of gravitational acceleration on three coordinate axes. The processor 301 may control the touch display screen 305 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 311. The acceleration sensor 311 may also be used for the acquisition of motion data of a game or a user.
The gyro sensor 312 may detect the body direction and the rotation angle of the terminal 300, and the gyro sensor 312 may collect the 3D motion of the user to the terminal 300 in cooperation with the acceleration sensor 311. The processor 301 may implement the following functions according to the data collected by the gyro sensor 312: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
The pressure sensor 313 may be disposed at a side frame of the terminal 300 and/or at a lower layer of the touch screen 305. When the pressure sensor 313 is disposed at a side frame of the terminal 300, a grip signal of the terminal 300 by a user may be detected, and the processor 301 performs left-right hand recognition or shortcut operation according to the grip signal collected by the pressure sensor 313. When the pressure sensor 313 is disposed at the lower layer of the touch screen 305, the processor 301 performs control over the operability control on the UI interface according to the pressure operation of the user on the touch screen 305. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 314 is used to collect a user's fingerprint, and the processor 301 identifies the user's identity based on the fingerprint collected by the fingerprint sensor 314, or the fingerprint sensor 314 itself identifies the user's identity from the collected fingerprint. When the identity is recognized as trusted, the processor 301 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 314 may be provided on the front, back, or side of the terminal 300. When a physical key or manufacturer logo is provided on the terminal 300, the fingerprint sensor 314 may be integrated with the physical key or manufacturer logo.
The optical sensor 315 is used to collect the ambient light intensity. In one embodiment, processor 301 may control the display brightness of touch screen 305 based on the intensity of ambient light collected by optical sensor 315. Specifically, when the intensity of the ambient light is high, the display brightness of the touch display screen 305 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 305 is turned down. In another embodiment, the processor 301 may also dynamically adjust the shooting parameters of the camera assembly 306 according to the ambient light intensity collected by the optical sensor 315.
A proximity sensor 316, also referred to as a distance sensor, is typically provided on the front panel of the terminal 300. The proximity sensor 316 is used to collect the distance between the user and the front of the terminal 300. In one embodiment, when the proximity sensor 316 detects a gradual decrease in the distance between the user and the front face of the terminal 300, the processor 301 controls the touch screen 305 to switch from the on-screen state to the off-screen state; when the proximity sensor 316 detects that the distance between the user and the front surface of the terminal 300 gradually increases, the processor 301 controls the touch display screen 305 to switch from the off-screen state to the on-screen state.
Those skilled in the art will appreciate that the structure shown in fig. 10 is not limiting and that more or fewer components than shown may be included or certain components may be combined or a different arrangement of components may be employed.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (8)

1. A method for detecting site tampering based on deep learning, the method comprising:
crawling address information of page content and the page content in a site to be detected;
determining the type of the page content according to the suffix of the address information;
when the type is a picture type, extracting first text information in a picture, correspondingly storing the first text information and the address information in a text list, and correspondingly storing the picture and the address information in a picture list; when the type is a text information type, correspondingly storing the second text information and the address information in the crawled page content in the text list;
Sequentially inputting text information into a text detection model in a sensitive information detection model based on the text list, and sequentially inputting pictures into a sensitive image detection model in the sensitive information detection model based on the picture list; the text detection model is used for detecting whether the text information comprises sensitive information or not, the sensitive image detection model is used for detecting whether the picture comprises a sensitive image or not based on the image content of the picture, and the text information comprises the first text information and/or the second text information;
obtaining a detection result output by the sensitive information detection model, wherein when the detection result is that the text information contains sensitive information, the station to be detected is determined to be tampered, and the text information and the corresponding address information are correspondingly stored in the corresponding relation between the address information and the evidence content; when the detection result is that the picture comprises a sensitive image, determining that the station to be detected is tampered, and correspondingly storing the picture and the corresponding address information in the corresponding relation between the address information and the evidence content;
sending the corresponding relation between the address information and the evidence content to a terminal, wherein the terminal is a terminal corresponding to a site manager,
Wherein, before sequentially inputting the text information into the text detection model based on the text list, the method further comprises:
word segmentation is carried out on the text information to obtain words included in the text information, and when sensitive words exist in the words included in the text information, x words adjacent to the sensitive words are obtained from the text information, wherein x is an integer greater than 1;
acquiring word vectors of each word of x+1 words, wherein the word vectors of the words are semantic representations of the words, and the x+1 words comprise the sensitive words and the x words;
sequentially inputting the text information into the text detection model based on the text list, including:
and inputting the word vector of each word into the text detection model according to the sequence of each word in the text information.
2. The method of claim 1, wherein the x words include x/2 words in the text information that precede and are adjacent to the sensitive word and x/2 words that follow and are adjacent to the sensitive word.
3. The method of claim 1, wherein said inputting the picture into a sensitive image detection model comprises:
And graying the picture to obtain a gray level image, converting the converted size of the gray level image into a preset size, and inputting the converted gray level image into the sensitive image detection model.
4. An apparatus for detecting site tampering based on deep learning, the apparatus comprising:
a crawling module, configured to crawl page content in a site to be detected and address information of the page content;
a determining module, configured to determine the type of the page content according to the suffix of the address information;
a storage module, configured to: when the type is a picture type, extract first text information from a picture, store the first text information and the address information as a pair in a text list, and store the picture and the address information as a pair in a picture list; and when the type is a text information type, store second text information in the crawled page content and the address information as a pair in the text list;
an acquisition module, configured to sequentially input text information into a text detection model in a sensitive information detection model based on the text list, and sequentially input pictures into a sensitive image detection model in the sensitive information detection model based on the picture list, wherein the text detection model is used to detect whether the text information comprises sensitive information, the sensitive image detection model is used to detect, based on the image content of a picture, whether the picture comprises a sensitive image, and the text information comprises the first text information and/or the second text information; and obtain a detection result output by the sensitive information detection model, wherein when the detection result indicates that the text information contains sensitive information, the site to be detected is determined to have been tampered with, and the text information and the corresponding address information are stored as a pair in a correspondence between address information and evidence content; and when the detection result indicates that the picture comprises a sensitive image, the site to be detected is determined to have been tampered with, and the picture and the corresponding address information are stored as a pair in the correspondence between address information and evidence content; and
a sending module, configured to send the correspondence between the address information and the evidence content to a terminal, wherein the terminal is a terminal corresponding to a site administrator,
wherein the acquisition module is further configured to:
perform word segmentation on the text information to obtain the words included in the text information, and when a sensitive word exists among the words included in the text information, obtain x words adjacent to the sensitive word from the text information, wherein x is an integer greater than 1;
acquire a word vector for each of x+1 words, wherein the word vector of a word is a semantic representation of the word, and the x+1 words comprise the sensitive word and the x words; and
input the word vector of each word into the text detection model according to the order of the words in the text information.
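Outside the claim language, the determining module's suffix-based typing can be sketched as a simple lookup on the URL path. The suffix sets below are illustrative assumptions, not an exhaustive list from the patent.

```python
from urllib.parse import urlparse

# Assumed suffix sets for classifying crawled address information.
PICTURE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}
TEXT_SUFFIXES = {".html", ".htm", ".txt", ".php", ".jsp"}

def content_type(address):
    """Return 'picture', 'text', or 'other' based on the suffix of the
    URL path (case-insensitive)."""
    path = urlparse(address).path.lower()
    dot = path.rfind(".")
    suffix = path[dot:] if dot != -1 else ""
    if suffix in PICTURE_SUFFIXES:
        return "picture"
    if suffix in TEXT_SUFFIXES:
        return "text"
    return "other"

kind_a = content_type("http://example.com/banner.PNG")   # "picture"
kind_b = content_type("http://example.com/index.html")   # "text"
kind_c = content_type("http://example.com/api/data")     # "other"
```

Routing on the address suffix lets the storage module send pictures to the picture list (for OCR text extraction and image detection) and pages to the text list without fetching content twice.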
5. The apparatus of claim 4, wherein the x words comprise x/2 words in the text information that precede and are adjacent to the sensitive word and x/2 words that follow and are adjacent to the sensitive word.
6. The apparatus of claim 4, wherein the acquisition module is further configured to:
gray the picture to obtain a grayscale image, convert the size of the grayscale image to a preset size, and input the converted grayscale image into the sensitive image detection model.
7. An electronic device comprising at least one processor and at least one memory for storing at least one instruction to be loaded and executed by the at least one processor to implement the method of any one of claims 1 to 2.
8. A computer readable storage medium storing at least one instruction for loading and execution by a processor to implement the method of any one of claims 1 to 2.
CN201910741015.6A 2019-08-12 2019-08-12 Deep learning-based method and device for detecting station tampering Active CN110442771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910741015.6A CN110442771B (en) 2019-08-12 2019-08-12 Deep learning-based method and device for detecting station tampering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910741015.6A CN110442771B (en) 2019-08-12 2019-08-12 Deep learning-based method and device for detecting station tampering

Publications (2)

Publication Number Publication Date
CN110442771A (en) 2019-11-12
CN110442771B (en) 2023-09-29

Family

ID=68434708

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910741015.6A Active CN110442771B (en) 2019-08-12 2019-08-12 Deep learning-based method and device for detecting station tampering

Country Status (1)

Country Link
CN (1) CN110442771B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111198983A (en) * 2019-12-30 2020-05-26 中国银行股份有限公司 Sensitive information detection method, device and storage medium
CN111209613B (en) * 2020-01-10 2023-05-12 杭州涂鸦信息技术有限公司 Rapid design method and system for intelligent product
CN113221032A (en) * 2021-04-08 2021-08-06 北京智奇数美科技有限公司 Link risk detection method, device and storage medium
CN113177409A (en) * 2021-05-06 2021-07-27 上海慧洲信息技术有限公司 Intelligent sensitive word recognition system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106411578A (en) * 2016-09-12 2017-02-15 国网山东省电力公司电力科学研究院 Website monitoring system and method applicable to power industry
CN106407260A (en) * 2016-08-24 2017-02-15 乐视控股(北京)有限公司 Processing method and apparatus for obtaining file type
CN107547555A (en) * 2017-09-11 2018-01-05 北京匠数科技有限公司 A kind of web portal security monitoring method and device


Also Published As

Publication number Publication date
CN110442771A (en) 2019-11-12

Similar Documents

Publication Publication Date Title
CN110442771B (en) Deep learning-based method and device for detecting station tampering
CN110490179B (en) License plate recognition method and device and storage medium
CN110839128B (en) Photographing behavior detection method and device and storage medium
CN111880712B (en) Page display method and device, electronic equipment and storage medium
CN110059686B (en) Character recognition method, device, equipment and readable storage medium
CN111159604A (en) Picture resource loading method and device
CN111178343A (en) Multimedia resource detection method, device, equipment and medium based on artificial intelligence
CN112052354A (en) Video recommendation method, video display method and device and computer equipment
CN110647881A (en) Method, device, equipment and storage medium for determining card type corresponding to image
CN110020690B (en) Cheating behavior detection method, device and storage medium
CN111586279B (en) Method, device and equipment for determining shooting state and storage medium
CN110909264A (en) Information processing method, device, equipment and storage medium
CN107944024B (en) Method and device for determining audio file
CN114925667A (en) Content classification method, device, equipment and computer readable storage medium
CN113836426A (en) Information pushing method and device and electronic equipment
CN111191254B (en) Access verification method, device, computer equipment and storage medium
CN108733831B (en) Method and device for processing word stock
CN110928867B (en) Data fusion method and device
CN113076452A (en) Application classification method, device, equipment and computer readable storage medium
CN113051485A (en) Group searching method, device, terminal and storage medium
CN111597823A (en) Method, device and equipment for extracting central word and storage medium
CN112135256A (en) Method, device and equipment for determining movement track and readable storage medium
CN112699906A (en) Method, device and storage medium for acquiring training data
CN112308104A (en) Abnormity identification method and device and computer storage medium
CN111007969B (en) Method, device, electronic equipment and medium for searching application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant