CN110309423A - Sensitive information identification method, apparatus, and electronic device
- Publication number
- CN110309423A (application CN201910574799.8A)
- Authority
- CN
- China
- Prior art keywords
- information
- search result
- sensitive
- query information
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9538—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Data Mining & Analysis (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the invention provide a sensitive information identification method, an apparatus, and an electronic device. The method includes: obtaining query information that a user enters in a search box; looking up the query information in target sensitive information mined in advance, where the target sensitive information includes historical query information that satisfies a preset condition, the preset condition being that a first quantity of first search results, which contain sensitive content, is greater than a second quantity of second search results, which contain no sensitive content; and, when the query information is found in the target sensitive information, determining that the query information is sensitive information. Because the query information is checked against target sensitive information mined in advance, sensitive information need not be identified with regular expressions, which reduces the labor cost of identifying sensitive information.
Description
Technical field
The present invention relates to the technical field of information processing, and in particular to a sensitive information identification method, an apparatus, and an electronic device.
Background
To maintain a healthy network environment, it is often necessary to identify whether the query information a user enters in a search box is sensitive information. If it is, search suggestions, search results, and similar content containing the sensitive information can be blocked. Such sensitive information typically refers to pornographic information.
At present, regular expressions are commonly used to identify whether user-entered query information is sensitive information. For example, the regular expression "men and women does" identifies the query information "men and women does" as sensitive information, but that regular expression cannot identify a differently written variant of the same query information.
That is, this identification approach requires technical staff to write a large number of regular expressions so that the various kinds of sensitive information users enter can be recognized. Writing so many regular expressions takes considerable time and effort, so the labor cost of identifying sensitive information is high.
Summary of the invention
Embodiments of the invention aim to provide a sensitive information identification method, an apparatus, and an electronic device that identify sensitive information without relying on regular expressions, thereby reducing the labor cost of identifying sensitive information. The specific technical solution is as follows.
In a first aspect, an embodiment of the invention provides a sensitive information identification method, including:
obtaining query information that a user enters in a search box;
looking up the query information in target sensitive information mined in advance, where the target sensitive information includes historical query information that satisfies a preset condition, the preset condition being that a first quantity of first search results, which contain sensitive content, is greater than a second quantity of second search results, which contain no sensitive content;
when the query information is found in the target sensitive information, determining that the query information is sensitive information.
Optionally, before the query information is looked up in the target sensitive information mined in advance, the method further includes:
obtaining historical query information entered in the search box during a preset historical time period;
determining the search results that correspond to the historical query information and were clicked during the historical time period;
determining, among the clicked search results, the first quantity of first search results that contain sensitive content and the second quantity of second search results that contain no sensitive content;
when the first quantity is greater than the second quantity, determining that the historical query information is target sensitive information.
Optionally, the step of determining, among the clicked search results, the first quantity of first search results that contain sensitive content and the second quantity of second search results that contain no sensitive content includes:
identifying whether target content in a clicked search result contains sensitive content, where the target content includes a title and/or a cover image;
if the target content contains sensitive content, determining that the clicked search result is a first search result;
if the target content contains no sensitive content, determining that the clicked search result is a second search result;
counting the first quantity of first search results and the second quantity of second search results.
Optionally, the step of identifying whether the target content in a clicked search result contains sensitive content includes:
determining the number of times the clicked search result was clicked during the historical time period;
when that number is greater than or equal to a preset click count, identifying whether the target content in the clicked search result contains sensitive content.
Optionally, the step of obtaining the historical query information entered in the search box during the preset historical time period includes:
obtaining user logs for the preset historical time period;
extracting from the user logs the historical query information entered in the search box during the historical time period.
Optionally, in embodiments of the invention, the sensitive information includes pornographic information and/or violent information.
In a second aspect, an embodiment of the invention further provides a sensitive information identification apparatus, including:
a first obtaining module, configured to obtain query information that a user enters in a search box;
a lookup module, configured to look up the query information in target sensitive information mined in advance, where the target sensitive information includes historical query information that satisfies a preset condition, the preset condition being that a first quantity of first search results, which contain sensitive content, is greater than a second quantity of second search results, which contain no sensitive content;
a first determining module, configured to determine that the query information is sensitive information when the query information is found in the target sensitive information.
Optionally, in embodiments of the invention, the apparatus further includes:
a second obtaining module, configured to obtain, before the query information is looked up in the target sensitive information mined in advance, historical query information entered in the search box during a preset historical time period;
a second determining module, configured to determine the search results that correspond to the historical query information and were clicked during the historical time period;
a third determining module, configured to determine, among the clicked search results, the first quantity of first search results that contain sensitive content and the second quantity of second search results that contain no sensitive content;
a fourth determining module, configured to determine that the historical query information is target sensitive information when the first quantity is greater than the second quantity.
Optionally, in embodiments of the invention, the third determining module includes:
an identification unit, configured to identify whether target content in a clicked search result contains sensitive content, where the target content includes a title and/or a cover image;
a first determining unit, configured to determine that the clicked search result is a first search result when the target content contains sensitive content;
a second determining unit, configured to determine that the clicked search result is a second search result when the target content contains no sensitive content;
a counting unit, configured to count the first quantity of first search results and the second quantity of second search results.
Optionally, in embodiments of the invention, the identification unit is specifically configured to:
determine the number of times the clicked search result was clicked during the historical time period;
when that number is greater than or equal to a preset click count, identify whether the target content in the clicked search result contains sensitive content.
Optionally, in embodiments of the invention, the first obtaining module is specifically configured to:
obtain user logs for the preset historical time period;
extract from the user logs the historical query information entered in the search box during the historical time period.
Optionally, in embodiments of the invention, the sensitive information may include pornographic information and/or violent information.
In a third aspect, an embodiment of the invention further provides an electronic device including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another over the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any item of the first aspect when executing the program stored in the memory.
In a fourth aspect, an embodiment of the invention further provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the method steps of any item of the first aspect are implemented.
In a fifth aspect, an embodiment of the invention further provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method steps of any item of the first aspect.
In embodiments of the invention, the query information that a user enters in a search box is obtained and then looked up in target sensitive information mined in advance. The target sensitive information includes historical query information that satisfies a preset condition: the first quantity of first search results, which contain sensitive content, is greater than the second quantity of second search results, which contain no sensitive content. When the query information is found in the target sensitive information, it is determined to be sensitive information. Because the query information is checked against target sensitive information mined in advance, sensitive information need not be identified with regular expressions, which reduces the labor cost of identifying sensitive information.
Brief description of the drawings
To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below.
Fig. 1 is a flowchart of a sensitive information identification method provided by an embodiment of the invention;
Fig. 2 is a flowchart of a method for mining target sensitive information provided by an embodiment of the invention;
Fig. 3 is a structural diagram of a sensitive information identification apparatus provided by an embodiment of the invention;
Fig. 4 is a structural diagram of an electronic device provided by an embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described below with reference to the drawings of those embodiments.
To solve the problems in the prior art, embodiments of the invention provide a sensitive information identification method, an apparatus, and an electronic device.
The sensitive information identification method provided by the embodiments of the invention is described first.
The method can be applied to an electronic device, including but not limited to a computer, a mobile phone, a smart wearable device, and a server.
Fig. 1 is a flowchart of a sensitive information identification method provided by an embodiment of the invention. Referring to Fig. 1, the method may include the following steps:
S101: obtain the query information that a user enters in a search box.
The search box may be one provided in a browser or one provided on a video website, although it is not limited to these.
For example, the query information may be "popular TV play" or "men and women does", although it is not limited to these.
S102: look up the query information in target sensitive information mined in advance, where the target sensitive information includes historical query information that satisfies a preset condition, the preset condition being that the first quantity of first search results, which contain sensitive content, is greater than the second quantity of second search results, which contain no sensitive content.
That is, the target sensitive information is mined first, and the query information is then looked up in it.
The search results of each piece of historical query information may include first search results and second search results. Every first search result contains sensitive content, and no second search result does.
In one implementation, during mining, if the number of first search results for a piece of historical query information (the first quantity) is greater than the number of its second search results (the second quantity), then the search results that contain sensitive content outnumber those that do not, and the historical query information can be determined to be target sensitive information. In this way, target sensitive information is determined during mining from the number of first search results and the number of second search results among the search results obtained for the query. The sensitive content may include pornographic content and/or violent content.
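The preset condition in this implementation can be sketched as a simple count comparison. The following is a minimal illustration, not the patent's implementation: the `SearchResult` record and its `has_sensitive_content` flag are hypothetical, and the flag is assumed to have been set by some upstream check.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical record; the flag is assumed to come from an upstream check."""
    title: str
    has_sensitive_content: bool


def meets_preset_condition(results: Iterable[SearchResult]) -> bool:
    """Return True when first search results outnumber second search results."""
    results = list(results)
    # First quantity: results that contain sensitive content.
    first_quantity = sum(1 for r in results if r.has_sensitive_content)
    # Second quantity: results that contain no sensitive content.
    second_quantity = len(results) - first_quantity
    return first_quantity > second_quantity
```

Note that the comparison is strict, so a tie between the two quantities does not mark the historical query information as target sensitive information.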
In another implementation, during mining, if the number of clicked first search results for a piece of historical query information (the first quantity) is greater than the number of clicked second search results (the second quantity), then the clicked search results that contain sensitive content outnumber those that do not, and the historical query information can likewise be determined to be target sensitive information. In this way, target sensitive information is determined during mining from the number of first search results and the number of second search results among the search results that users clicked.
The search results of a piece of historical query information may include both first search results and second search results, and what the user wants to find may be the second search results, which contain no sensitive content. Determining target sensitive information from the clicked search results therefore makes it possible to predict, from what was clicked, whether the content the user is really interested in is sensitive or non-sensitive. Combining users' click behavior in this way yields more accurate target sensitive information.
Fig. 2 is a flowchart of a method for mining target sensitive information provided by an embodiment of the invention. The way target sensitive information is mined is described below with reference to Fig. 2. Referring to Fig. 2, mining target sensitive information may include the following steps:
S201: obtain the historical query information entered in the search box during a preset historical time period.
The electronic device can obtain the user logs for the preset historical time period and then extract from them the historical query information entered in the search box during that period. For example, the historical query information may be "men and women does", although it is not limited to this.
The preset historical time period can be set by those skilled in the art as circumstances require. It may be the day before the current time point or the month before the current time point, although it is not limited to these.
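Step S201 can be sketched as a time-window filter over log records. This is only an illustration under stated assumptions: the patent does not specify a log format, so the record layout (a mapping with `"query"` and `"timestamp"` keys) and the function name `historical_queries` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Mapping, Set


def historical_queries(
    log_records: Iterable[Mapping],
    now: datetime,
    window: timedelta = timedelta(days=1),
) -> Set[str]:
    """Collect the distinct queries logged in the window [now - window, now)."""
    start = now - window
    # Assumed record layout: {"query": str, "timestamp": datetime}.
    return {
        rec["query"]
        for rec in log_records
        if start <= rec["timestamp"] < now
    }
```

Passing `timedelta(days=30)` as `window` would approximate the "previous month" setting mentioned above.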
S202: determine the search results that correspond to the historical query information and were clicked during the historical time period.
Determining which search results users clicked reveals the purpose for which the historical query information was entered, that is, the search results users are really interested in. Whether the historical query information is target sensitive information can then be determined from those results.
S203: determine, among the clicked search results, the first quantity of first search results that contain sensitive content and the second quantity of second search results that contain no sensitive content.
Once the clicked search results corresponding to the historical query information have been determined, the first search results, which contain sensitive content, and the second search results, which do not, can be identified among them. The first quantity of first search results and the second quantity of second search results can then be counted.
Whether each clicked search result contains sensitive content can be determined by identifying whether the target content in the clicked search result contains sensitive content, where the target content includes a title and/or a cover image.
Because the title of a search result usually contains a fair number of words, whether the search result contains sensitive content can be determined by identifying whether its title does. Specifically, a semantic recognition model built in advance can identify the meaning of the title, and whether the title contains sensitive content can be determined from that meaning, although the approach is not limited to this. If the title contains sensitive content, the search result contains sensitive content.
The title of a search result may include one or more of the title portion and the brief-introduction portion of the search result that the browser displays after the historical query information is searched; either is reasonable.
In addition, the cover image of a search result usually reflects the main content of that search result, so whether the search result contains sensitive content can also be determined by identifying whether the cover image is a sensitive picture. Specifically, a picture recognition model can identify the content of the cover image (for example, nudity) or the category to which that content belongs (for example, a pornographic category). Whether the cover image contains sensitive content can then be determined from the picture recognition result, although the approach is not limited to this.
The semantic recognition model may be any model in the related art that can recognize the semantics of text, and the picture recognition model may be any model in the related art that can recognize the content of a picture or the category of that content. Neither is specifically limited here.
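The title and cover-image check described above can be sketched with two pluggable classifiers. This is a hedged illustration: the patent does not name any particular model, so `title_model` and `cover_model` are hypothetical callables standing in for the semantic recognition model and the picture recognition model, and treating a result as sensitive when either check fires is an assumption about how "title and/or cover image" is combined.

```python
from typing import Callable, Optional


def contains_sensitive_content(
    title: Optional[str],
    cover: Optional[bytes],
    title_model: Callable[[str], bool],
    cover_model: Callable[[bytes], bool],
) -> bool:
    """Classify one clicked search result as a first (True) or second (False) result."""
    # Check the title with the (assumed) semantic recognition model.
    if title is not None and title_model(title):
        return True
    # Check the cover image with the (assumed) picture recognition model.
    if cover is not None and cover_model(cover):
        return True
    return False
```

Keeping the models as parameters means either check can be swapped or omitted without changing the counting logic that follows.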
To exclude clicks caused by accidental operation, the number of times a clicked search result was clicked during the historical time period can also be determined. If that number is greater than or equal to a preset click count, the result was clicked often enough to indicate purposeful clicks, and the operation of identifying whether the target content in the clicked search result contains sensitive content can be performed.
Conversely, if the number is less than the preset click count, the clicks were probably accidental, and the operation of identifying whether the target content in the clicked search result contains sensitive content can be skipped. Filtering out accidentally clicked search results in this way reduces the number of search results that need to be checked for sensitive content and improves the accuracy of the target sensitive information that is determined.
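The click-count filter can be sketched with a counter over click events. This is a minimal illustration; the function name `purposefully_clicked`, the use of string result identifiers, and the `min_clicks` parameter (standing in for the preset click count) are assumptions.

```python
from collections import Counter
from typing import Iterable, List


def purposefully_clicked(clicked_result_ids: Iterable[str], min_clicks: int) -> List[str]:
    """Keep results clicked at least min_clicks times; drop likely accidental clicks."""
    # One entry per click event, so the counter gives clicks per result.
    counts = Counter(clicked_result_ids)
    return [result_id for result_id, n in counts.items() if n >= min_clicks]
```

Only the results returned here would then be passed to the sensitive-content check.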
S204: when the first quantity is greater than the second quantity, determine that the historical query information is target sensitive information.
For a piece of historical query information, if the first quantity of first search results among its clicked search results is greater than the second quantity of second search results, then the clicked search results that contain sensitive content outnumber those that do not, which indicates that users intend to find sensitive content with that historical query information. The historical query information can therefore be determined to be target sensitive information.
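Steps S202 to S204 can be put together in a short sketch. It is an illustration under stated assumptions rather than the patent's implementation: the click log is assumed to be already restricted to the historical time period and to consist of `(query, result_id)` pairs, and `is_sensitive` is a hypothetical callable that stands in for the title and cover-image check.

```python
from collections import Counter, defaultdict
from typing import Callable, Iterable, Set, Tuple


def mine_target_sensitive_queries(
    click_log: Iterable[Tuple[str, str]],
    is_sensitive: Callable[[str], bool],
    min_clicks: int = 1,
) -> Set[str]:
    """Return the historical queries whose clicked results are mostly sensitive."""
    # S202: group click events by query and count clicks per result.
    clicks_per_query = defaultdict(Counter)
    for query, result_id in click_log:
        clicks_per_query[query][result_id] += 1

    targets = set()
    for query, counts in clicks_per_query.items():
        # Drop results with fewer clicks than the preset click count.
        kept = [rid for rid, n in counts.items() if n >= min_clicks]
        # S203: count first (sensitive) and second (non-sensitive) results.
        first_quantity = sum(1 for rid in kept if is_sensitive(rid))
        second_quantity = len(kept) - first_quantity
        # S204: strict comparison decides whether the query is a target.
        if first_quantity > second_quantity:
            targets.add(query)
    return targets
```

The returned set plays the role of the target sensitive information that step S102 looks the query information up in.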
S103: when the query information is found in the target sensitive information, determine that the query information is sensitive information.
Finding the query information in the target sensitive information indicates that the user intends to find sensitive content with it, so the query information can be determined to be sensitive information. The target sensitive information mined in advance thus makes it simple and fast to identify whether query information is sensitive information.
After the query information is determined to be sensitive information, search suggestions, search results, and similar content containing the sensitive information can also be blocked. The sensitive information includes pornographic information and/or violent information.
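The lookup in S103 and the subsequent blocking can be sketched as a set-membership test in front of the search backend. This is a hedged illustration: `suggest` and `search` are hypothetical callables standing in for whatever services produce suggestions and results, exact-match lookup is an assumption, and returning empty lists is only one possible way to block content.

```python
from typing import Callable, Dict, List, Set


def respond(
    query: str,
    target_sensitive_queries: Set[str],
    suggest: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
) -> Dict[str, object]:
    """Answer a query, blocking suggestions and results when it is sensitive."""
    # S103: the query is sensitive if it appears in the mined set.
    if query in target_sensitive_queries:
        return {"sensitive": True, "suggestions": [], "results": []}
    return {
        "sensitive": False,
        "suggestions": suggest(query),
        "results": search(query),
    }
```

Because the lookup is a hash-set membership test, its cost does not grow with the number of mined queries the way a list of regular expressions would.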
In embodiments of the invention, the query information that a user enters in a search box is obtained and then looked up in target sensitive information mined in advance. The target sensitive information includes historical query information that satisfies a preset condition: the first quantity of first search results, which contain sensitive content, is greater than the second quantity of second search results, which contain no sensitive content. When the query information is found in the target sensitive information, it is determined to be sensitive information. Because the query information is checked against target sensitive information mined in advance, sensitive information need not be identified with regular expressions, which reduces the labor cost of identifying sensitive information.
In summary, with the sensitive information identification scheme provided by the embodiments of the invention, the target sensitive information mined in advance makes it simple and fast to identify whether query information is sensitive information, which speeds up identification and reduces its labor cost.
Corresponding to the method embodiments above, an embodiment of the invention further provides a sensitive information identification apparatus. Referring to Fig. 3, the apparatus may include:
a first obtaining module 301, configured to obtain query information that a user enters in a search box;
a lookup module 302, configured to look up the query information in target sensitive information mined in advance, where the target sensitive information includes historical query information that satisfies a preset condition, the preset condition being that a first quantity of first search results, which contain sensitive content, is greater than a second quantity of second search results, which contain no sensitive content;
a first determining module 303, configured to determine that the query information is sensitive information when the query information is found in the target sensitive information.
With the apparatus provided by the embodiments of the invention, the query information that a user enters in a search box is obtained and then looked up in target sensitive information mined in advance. The target sensitive information includes historical query information that satisfies a preset condition: the first quantity of first search results, which contain sensitive content, is greater than the second quantity of second search results, which contain no sensitive content. When the query information is found in the target sensitive information, it is determined to be sensitive information. Because the query information is checked against target sensitive information mined in advance, sensitive information need not be identified with regular expressions, which reduces the labor cost of identifying sensitive information.
Optionally, in embodiments of the invention, the apparatus may further include:
a second obtaining module, configured to obtain, before the query information is looked up in the target sensitive information mined in advance, historical query information entered in the search box during a preset historical time period;
a second determining module, configured to determine the search results that correspond to the historical query information and were clicked during the historical time period;
a third determining module, configured to determine, among the clicked search results, the first quantity of first search results that contain sensitive content and the second quantity of second search results that contain no sensitive content;
a fourth determining module, configured to determine that the historical query information is target sensitive information when the first quantity is greater than the second quantity.
Optionally, in embodiments of the invention, the third determining module may include:
an identification unit, configured to identify whether target content in a clicked search result contains sensitive content, where the target content includes a title and/or a cover image;
a first determining unit, configured to determine that the clicked search result is a first search result when the target content contains sensitive content;
a second determining unit, configured to determine that the clicked search result is a second search result when the target content contains no sensitive content;
a counting unit, configured to count the first quantity of first search results and the second quantity of second search results.
Optionally, in embodiments of the present invention, recognition unit is specifically used for:
Determine the number that the search result being clicked is clicked in historical time section;
When number is more than or equal to default number of clicks, identify whether the object content in the search result being clicked wraps
Contain sensitive content.
Optionally, in an embodiment of the present invention, the first obtaining module 301 is specifically configured to:
Obtain the user log for the preset historical time period;
Obtain, from the user log, the historical query information input to the search box within the historical time period.
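Extracting historical queries from a user log might look like the following sketch. The log schema (`LogEntry` with `timestamp`, `event`, and `query` fields) and the half-open time window are assumptions made for illustration; the text does not describe the log format.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class LogEntry(NamedTuple):
    """Hypothetical user-log record."""
    timestamp: datetime
    event: str  # assumed values: "search" or "click"
    query: str


def historical_queries(log: Iterable[LogEntry], start: datetime, end: datetime) -> List[str]:
    """Return distinct queries searched in [start, end), in first-seen order."""
    in_window = (
        entry.query
        for entry in log
        if entry.event == "search" and start <= entry.timestamp < end
    )
    # dict.fromkeys removes duplicates while preserving insertion order.
    return list(dict.fromkeys(in_window))


toy_log = [
    LogEntry(datetime(2019, 6, 1, 9, 0), "search", "query a"),
    LogEntry(datetime(2019, 6, 1, 9, 1), "click", "query a"),
    LogEntry(datetime(2019, 6, 2, 10, 0), "search", "query b"),
    LogEntry(datetime(2019, 6, 3, 8, 0), "search", "query a"),
    LogEntry(datetime(2019, 7, 1, 0, 0), "search", "query c"),
]
queries = historical_queries(toy_log, datetime(2019, 6, 1), datetime(2019, 7, 1))
print(queries)
```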
Optionally, in an embodiment of the present invention, the sensitive information may include pornographic information and/or violent information.
Corresponding to the above method embodiments, an embodiment of the present invention further provides an electronic device. As shown in Figure 4, it includes a processor 401, a communication interface 402, a memory 403, and a communication bus 404, where the processor 401, the communication interface 402, and the memory 403 communicate with one another through the communication bus 404;
The memory 403 is configured to store a computer program;
The processor 401 is configured to implement the method steps of any of the sensitive information recognition methods described above when executing the program stored in the memory 403.
In an embodiment of the present invention, the electronic device can obtain the query information that a user inputs into the search box. It can then look up the query information in the target sensitive information mined in advance, where the target sensitive information includes historical query information that meets a preset condition. The preset condition is that the first quantity of first search results containing sensitive content is greater than the second quantity of second search results not containing sensitive content. When the query information is found in the target sensitive information, the query information can be determined to be sensitive information. In this way, whether query information is sensitive can be identified from the target sensitive information mined in advance, which avoids identifying sensitive information with regular expressions and reduces the labor cost of identifying sensitive information.
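At query time, the identification step reduces to a membership test against the pre-mined set. The sketch below assumes the target sensitive information is held as an in-memory set of strings and that surrounding whitespace is trimmed before the lookup; both choices are assumptions of this example, not details given in the text.

```python
from typing import AbstractSet


def is_sensitive_query(query_info: str, target_sensitive_info: AbstractSet[str]) -> bool:
    """Return True when the query is found in the pre-mined target sensitive information."""
    # Trimming whitespace is an assumption of this sketch, not part of the source.
    return query_info.strip() in target_sensitive_info


# Hypothetical pre-mined set; in practice it would come from the offline mining step.
target = frozenset({"query a", "query x"})
flags = [is_sensitive_query(q, target) for q in ("query a", " query x ", "query b")]
print(flags)
```

Because the set is built offline from click behavior, no hand-written regular expressions need to be maintained for the online check.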
Corresponding to the above method embodiments, an embodiment of the present invention further provides a computer-readable storage medium. A computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method steps of any of the sensitive information recognition methods described above are implemented.
After the computer program stored in the computer-readable storage medium provided by an embodiment of the present invention is executed by the processor of an electronic device, the electronic device can obtain the query information that a user inputs into the search box. It can then look up the query information in the target sensitive information mined in advance, where the target sensitive information includes historical query information that meets a preset condition. The preset condition is that the first quantity of first search results containing sensitive content is greater than the second quantity of second search results not containing sensitive content. When the query information is found in the target sensitive information, the query information can be determined to be sensitive information. In this way, whether query information is sensitive can be identified from the target sensitive information mined in advance, which avoids identifying sensitive information with regular expressions and reduces the labor cost of identifying sensitive information.
Corresponding to the above method embodiments, another embodiment provided by the present invention further provides a computer program product containing instructions which, when run on a computer, causes the computer to execute the method steps of the sensitive information recognition method of any of the above embodiments.
After the computer program provided by an embodiment of the present invention is executed by the processor of an electronic device, the electronic device can obtain the query information that a user inputs into the search box. It can then look up the query information in the target sensitive information mined in advance, where the target sensitive information includes historical query information that meets a preset condition. The preset condition is that the first quantity of first search results containing sensitive content is greater than the second quantity of second search results not containing sensitive content. When the query information is found in the target sensitive information, the query information can be determined to be sensitive information. In this way, whether query information is sensitive can be identified from the target sensitive information mined in advance, which avoids identifying sensitive information with regular expressions and reduces the labor cost of identifying sensitive information.
The communication bus mentioned for the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the above electronic device and other devices.
The memory may include random access memory (RAM), and may also include non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The above embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are generated wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for the same or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the others. In particular, the device, electronic device, computer-readable storage medium, and computer program product embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
The above are merely preferred embodiments of the present invention and are not intended to limit its protection scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.
Claims (10)
1. A sensitive information recognition method, characterized by comprising:
Obtaining query information that a user inputs into a search box;
Looking up the query information in target sensitive information mined in advance, wherein the target sensitive information includes historical query information that meets a preset condition, and the preset condition is that a first quantity of first search results containing sensitive content is greater than a second quantity of second search results not containing sensitive content;
When the query information is found in the target sensitive information, determining that the query information is sensitive information.
2. The method according to claim 1, characterized in that, before the step of looking up the query information in the target sensitive information mined in advance, the method further comprises:
Obtaining historical query information input to the search box within a preset historical time period;
Determining the clicked search results corresponding to the historical query information within the historical time period;
Determining, among the clicked search results, a first quantity of first search results containing sensitive content and a second quantity of second search results not containing sensitive content;
When the first quantity is greater than the second quantity, determining that the historical query information is the target sensitive information.
3. The method according to claim 2, characterized in that the step of determining, among the clicked search results, the first quantity of first search results containing sensitive content and the second quantity of second search results not containing sensitive content comprises:
Identifying whether the target content in the clicked search result includes sensitive content, wherein the target content includes a title and/or a cover image;
If the target content includes sensitive content, determining that the clicked search result is a first search result;
If the target content does not include sensitive content, determining that the clicked search result is a second search result;
Counting the first quantity of the first search results and the second quantity of the second search results.
4. The method according to claim 3, characterized in that the step of identifying whether the target content in the clicked search result includes sensitive content comprises:
Determining the number of times the clicked search result was clicked within the historical time period;
When the number is greater than or equal to a preset number of clicks, identifying whether the target content in the clicked search result includes sensitive content.
5. The method according to any one of claims 2-4, characterized in that the step of obtaining the historical query information input to the search box within the preset historical time period comprises:
Obtaining the user log for the preset historical time period;
Obtaining, from the user log, the historical query information input to the search box within the historical time period.
6. The method according to claim 1, characterized in that the sensitive information includes pornographic information and/or violent information.
7. A sensitive information recognition device, characterized by comprising:
A first obtaining module, configured to obtain query information that a user inputs into a search box;
A lookup module, configured to look up the query information in target sensitive information mined in advance, wherein the target sensitive information includes historical query information that meets a preset condition, and the preset condition is that a first quantity of first search results containing sensitive content is greater than a second quantity of second search results not containing sensitive content;
A first determining module, configured to determine that the query information is sensitive information when the query information is found in the target sensitive information.
8. The device according to claim 7, characterized by further comprising:
A second obtaining module, configured to obtain, before the query information is looked up in the target sensitive information mined in advance, historical query information input to the search box within a preset historical time period;
A second determining module, configured to determine the clicked search results corresponding to the historical query information within the historical time period;
A third determining module, configured to determine, among the clicked search results, a first quantity of first search results containing sensitive content and a second quantity of second search results not containing sensitive content;
A fourth determining module, configured to determine that the historical query information is the target sensitive information when the first quantity is greater than the second quantity.
9. The device according to claim 8, characterized in that the third determining module comprises:
A recognition unit, configured to identify whether the target content in the clicked search result includes sensitive content, wherein the target content includes a title and/or a cover image;
A first determination unit, configured to determine that the clicked search result is a first search result when the target content includes sensitive content;
A second determination unit, configured to determine that the clicked search result is a second search result when the target content does not include sensitive content;
A statistics unit, configured to count the first quantity of the first search results and the second quantity of the second search results.
10. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
The memory is configured to store a computer program;
The processor is configured to implement the method steps of any one of claims 1 to 6 when executing the program stored in the memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910574799.8A CN110309423A (en) | 2019-06-28 | 2019-06-28 | A kind of sensitive information recognition methods, device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910574799.8A CN110309423A (en) | 2019-06-28 | 2019-06-28 | A kind of sensitive information recognition methods, device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110309423A true CN110309423A (en) | 2019-10-08 |
Family
ID=68078597
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910574799.8A Pending CN110309423A (en) | 2019-06-28 | 2019-06-28 | A kind of sensitive information recognition methods, device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110309423A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111666317A (en) * | 2020-07-06 | 2020-09-15 | 腾讯科技(深圳)有限公司 | Cheating information mining method and cheating information identification method and device |
CN112818249A (en) * | 2021-03-04 | 2021-05-18 | 中南大学 | Multi-dimensional image construction method and system for crowd with specific tendency |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103020123A (en) * | 2012-11-16 | 2013-04-03 | 中国科学技术大学 | Method for searching bad video website |
CN107862076A (en) * | 2017-11-29 | 2018-03-30 | 四川九鼎智远知识产权运营有限公司 | A kind of sensitive vocabulary monitor supervision platform |
CN108388582A (en) * | 2012-02-22 | 2018-08-10 | 谷歌有限责任公司 | The mthods, systems and devices of related entities for identification |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111666317A (en) * | 2020-07-06 | 2020-09-15 | 腾讯科技(深圳)有限公司 | Cheating information mining method and cheating information identification method and device |
CN112818249A (en) * | 2021-03-04 | 2021-05-18 | 中南大学 | Multi-dimensional image construction method and system for crowd with specific tendency |
CN112818249B (en) * | 2021-03-04 | 2022-06-21 | 中南大学 | Multi-dimensional image construction method and system for crowd with specific tendency |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10353947B2 (en) | Relevancy evaluation for image search results | |
US7917528B1 (en) | Contextual display of query refinements | |
CN110209827B (en) | Search method, search device, computer-readable storage medium, and computer device | |
US8515809B2 (en) | Dynamic modification of advertisements displayed in response to a search engine query | |
CN107784092A (en) | A kind of method, server and computer-readable medium for recommending hot word | |
US10713291B2 (en) | Electronic document generation using data from disparate sources | |
US20120221411A1 (en) | Apparatus and methods for determining user intent and providing targeted content according to intent | |
US9892096B2 (en) | Contextual hyperlink insertion | |
EP2646933A1 (en) | Enabling predictive web browsing | |
WO2018205845A1 (en) | Data processing method, server, and computer storage medium | |
EP2862105A1 (en) | Ranking search results based on click through rates | |
CN108390788A (en) | User identification method, device and electronic equipment | |
WO2017045532A1 (en) | Application program classification display method and apparatus | |
CN109753601A (en) | Recommendation information clicking rate determines method, apparatus and electronic equipment | |
CN109190014B (en) | Regular expression generation method and device and electronic equipment | |
CN110309423A (en) | A kind of sensitive information recognition methods, device and electronic equipment | |
CN104699837B (en) | Method, device and server for selecting illustrated pictures of web pages | |
CN103955480B (en) | A kind of method and apparatus for determining the target object information corresponding to user | |
CN107885875B (en) | Synonymy transformation method and device for search words and server | |
CN108427883A (en) | Webpage digs the detection method and device of mine script | |
CN109067794A (en) | A kind of detection method and device of network behavior | |
CN116015842A (en) | Network attack detection method based on user access behaviors | |
CN112836126A (en) | Recommendation method and device based on knowledge graph, electronic equipment and storage medium | |
TWI457775B (en) | Method for sorting and managing websites and electronic device of executing the same | |
CN109240591A (en) | Interface display method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||