CN109308295A - A real-time privacy exposure monitoring method for data publishing - Google Patents

A real-time privacy exposure monitoring method for data publishing

Info

Publication number
CN109308295A
Authority
CN
China
Prior art keywords
user
keyword
data
input
sensitivity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811118685.4A
Other languages
Chinese (zh)
Inventor
柯昌博
陈成
张力中
吴佳挺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201811118685.4A
Publication of CN109308295A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245: Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a real-time privacy exposure monitoring method for data publishing. The content entered by a user is recorded, compared character by character, and assessed for sensitivity. In the first stage, the user's input is obtained through the keyboard interaction interface and matched to extract the required keywords. In the second stage, the database is searched by keyword to obtain the corresponding sensitivity level, which is fed back to the terminal. In the third stage, the user terminal receives the result returned by the server and performs the corresponding operation, thereby protecting the user's private information. By monitoring the user's input behavior and drawing on the historical information in the server-side database, the invention obtains the corresponding security level and acts on it, so that user information is effectively protected at the data publishing stage.

Description

A real-time privacy exposure monitoring method for data publishing
Technical field
The invention belongs to the field of information security and privacy protection technology, and in particular relates to a method for real-time monitoring of privacy exposure in data publishing.
Background
With the progress of science and technology, society is entering the big data era of exploding data, in which everything is argued with data and data has become the foundation on which Internet enterprises stand. What is big data? In the general sense, big data refers to data sets that cannot be perceived, acquired, managed, processed, and served within a tolerable time using traditional IT technology and hardware and software tools. The characteristics of big data can be summarized as the 4 Vs: Volume (huge size), Variety (many modalities), Velocity (rapid generation), and Value (huge value but very low density).
Big data has an important influence in many areas. In finance, compared with the traditional financial industry, big-data finance shows a series of advantages: greater transparency, higher participation, better collaboration, and lower intermediary costs.
In marketing, data mining is used to analyze market trends and consumer behavior. By analyzing consumer information in aggregate, the patterns and needs of particular consumer groups are obtained, and market positioning and marketing plans are set accordingly. This can reduce a company's costs and bring it higher profits.
In the big data era, data is everything: from seemingly messy and unordered data we can obtain a great deal.
But big data also carries serious hidden dangers. Ample evidence shows that publishing unprocessed big data without authorization has a huge impact on user privacy: users' personal information and data become easy to obtain, endangering their property, their information security, and even their personal safety.
The threat people face is not limited to the leakage of personal privacy; it also lies in big-data-based prediction of people's state and behavior. A typical example is a retailer that, by analyzing historical records, learned that a customer's daughter was pregnant before her parents did, and mailed her related advertising. Social network analysis research also shows that a user's attributes can be discovered from group characteristics; for example, analyzing a user's Twitter messages can reveal the user's political leanings, consumption habits, favorite teams, and so on.
Therefore, the purpose of the present invention is to protect the user's private information at the source: privacy information is monitored at the information publishing stage to prevent irreversible effects, so that data collectors cannot extract valuable information from it.
Summary of the invention
The purpose of the present invention is to provide a method for real-time monitoring of privacy exposure in data publishing, in order to achieve privacy protection. The invention analyzes the data source and supervises it by combining data matching with a machine learning algorithm, so that the data source avoids, as far as possible, containing private information or content that endangers the user's information security.
The invention discloses a real-time privacy exposure monitoring method for data publishing, comprising:
A step of obtaining the user's input content;
A step of matching the user's input content to obtain the required keywords: the input content is split into individual words according to Chinese grammatical features, and the corresponding keywords are obtained by combining fuzzy matching with the KMP algorithm;
A step of searching the database by keyword and obtaining the corresponding sensitivity level: first determine whether the keyword is one stored in the database. If it is not, a no-warning message is returned. If it is, the corresponding sensitivity value, which is stored in the database, is retrieved for that keyword; if the sensitivity value is greater than a preset threshold, the input is judged dangerous and a warning is returned to the terminal, and if it is not greater than the threshold, a no-warning message is returned;
A step in which the user terminal performs the corresponding operation according to the sensitivity value fed back.
The fuzzy matching algorithm is as follows: the edit distance between two Chinese characters is compared with a preset threshold, and if the edit distance is less than the threshold the two characters are considered to match. The edit distance is the sum of a pronunciation edit distance and a glyph edit distance. The pinyin of a character is divided into initial, final, and tone, each of which is assigned a weight; the weighted initial, final, and tone differences are summed to give the pronunciation edit distance. The glyph is described by four-corner code data: each of the four corners of the code is assigned a weight, the four-corner codes of the two characters are compared, and the weighted differences are summed to give the glyph edit distance;
In the KMP algorithm, the statement that tests whether two characters are equal is replaced by the fuzzy matching function distance_compare(char c_a, char c_b), which combines fuzzy matching with the KMP algorithm.
The weights assigned to the initial, final, and tone differ from one another, while the four corners of the four-corner code are assigned the same weight.
By adjusting the threshold of the fuzzy matching algorithm, errors in the user's input are corrected.
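As a minimal sketch of the edit distance described above, the following Python code sums a weighted pronunciation distance and a weighted glyph distance and compares the total with a threshold. The lookup table, the pinyin and four-corner entries, the weights, and the threshold are all illustrative assumptions and are not taken from the patent or from a verified dictionary; only the function name distance_compare comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HanziInfo:
    initial: str      # pinyin initial
    final: str        # pinyin final
    tone: int         # tone number, 1 to 4
    four_corner: str  # four-corner code, one digit per corner


# Hypothetical lookup table; the entries are illustrative placeholders.
HANZI = {
    "模": HanziInfo("m", "o", 2, "4493"),
    "摸": HanziInfo("m", "o", 1, "5403"),
    "货": HanziInfo("h", "uo", 4, "2480"),
}

# Assumed weights: initial, final, and tone differ; the four corners share one weight.
W_INITIAL, W_FINAL, W_TONE = 1.0, 0.75, 0.5
W_CORNER = 0.25
THRESHOLD = 1.5  # assumed matching threshold


def sound_distance(a: HanziInfo, b: HanziInfo) -> float:
    """Weighted sum of differences in initial, final, and tone."""
    return (W_INITIAL * (a.initial != b.initial)
            + W_FINAL * (a.final != b.final)
            + W_TONE * (a.tone != b.tone))


def shape_distance(a: HanziInfo, b: HanziInfo) -> float:
    """Weighted count of differing corners in the four-corner codes."""
    return sum(W_CORNER for ca, cb in zip(a.four_corner, b.four_corner) if ca != cb)


def edit_distance(c_a: str, c_b: str) -> float:
    """Pronunciation distance plus glyph distance; infinite for unknown characters."""
    if c_a == c_b:
        return 0.0
    a, b = HANZI.get(c_a), HANZI.get(c_b)
    if a is None or b is None:
        return float("inf")
    return sound_distance(a, b) + shape_distance(a, b)


def distance_compare(c_a: str, c_b: str) -> bool:
    """Fuzzy equality: True when the edit distance is below the threshold."""
    return edit_distance(c_a, c_b) < THRESHOLD
```

Lowering THRESHOLD makes the match stricter and reduces false alarms; raising it tolerates more input errors.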
The invention also includes a step of initializing the database and a step of updating the database;
The step of initializing the database is: the initial keywords and their corresponding sensitivity values are obtained from the user's settings and stored in the database;
The step of updating the database uses the sensitivity calculation formula:
y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1]
where y is the predicted sensitivity value; ws[0], ws[1], and ws[2] are the three values of the regression coefficient matrix, a regression coefficient being the weight that each input value carries in the prediction, as learned from the training data set; k is the offset; x[0] is the first input parameter, the keyword's current sensitivity value in the database for the previous statistics period; and x[1] is the second input parameter, the number of times the keyword was used in the previous statistics period. Given a keyword's use count in the previous statistics period and its current sensitivity value in the database, the sensitivity calculation program yields the corrected sensitivity value. The database is updated and corrected as follows: first, the value of the sensitivity field is read from the database. The method used is a Java connection to a MySQL database through the library mysql-connector-java-5.1.43.jar, and the sensitivity value is obtained with a SELECT statement. This value is then fed into the sensitivity calculation formula together with the keyword's use count in the previous statistics period, with the sensitivity value as x[0] and the use count as x[1], to obtain the corrected sensitivity value. Finally, the Java program writes the value back to the database with an UPDATE statement.
The correction method relies on the best-fit line obtained initially. The best-fit line is the straight line given by the formula y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1] above, in which the three parameters ws[0], ws[1], and ws[2] are the regression coefficients produced by the training program described below, and k is the offset, with an initial value of 1. If the error is found to be too large during training, it can be reduced by modifying the offset.
Beneficial effects: compared with the prior art, the present invention has the following advantages. First, it avoids leakage of the user's private data at the data source, that is, at the data publishing stage, effectively preventing private data from being maliciously obtained. Second, it adapts to users' individual needs, since different sensitivity levels can be set for each user. Third, it detects the user's keyword usage in real time and updates sensitivity according to usage habits and frequency, meeting the user's needs as far as possible.
Brief description of the drawings
Fig. 1 is a use case of the invention.
Fig. 2 is a schematic diagram of performing data matching.
Fig. 3 is a schematic diagram of the sensitivity calculation program.
Fig. 4 is a system architecture diagram of the invention.
Fig. 5 is the login interface of the invention.
Fig. 6 is the usage interface of the invention.
Fig. 7 is a schematic diagram of the security settings of the invention.
Detailed description
The present invention is further explained below with reference to the drawings and embodiments.
The real-time privacy exposure monitoring method for data publishing of the invention comprises the following steps:
S1: obtain the user's input content;
S2: match the user's input content to obtain the required keywords: the input content is split into individual words according to Chinese grammatical features, and the corresponding keywords are obtained through fuzzy matching and the KMP algorithm.
The fuzzy matching algorithm computes a pronunciation edit distance and a glyph edit distance from the pronunciation and glyph of the two Chinese characters. The pinyin of a character can be obtained directly from a pinyin database; once obtained, it is divided into initial, final, and tone, each is assigned a different weight, and the weighted differences are summed to give the pronunciation edit distance. The glyph is obtained from a four-corner code database; the four corners of the code are assigned equal weights, the codes are compared, and the weighted differences are summed to give the glyph edit distance. Finally, the pronunciation and glyph edit distances are combined with certain weights into the final edit distance between the two characters. If this edit distance is less than the set threshold, the two characters are considered a successful fuzzy match.
The role of the KMP algorithm is to speed up character matching. Replacing the statement in the KMP algorithm that tests whether two characters are equal with the fuzzy matching function combines the two algorithms.
The invention can correct wrong-character input caused by personal habits and recover the intended content. Error correction here means treating a character that fuzzy-matches successfully as the character in the sensitive-word lexicon. Because the algorithm may mistake a non-sensitive character for a sensitive one, "false alarms" can occur, but strictly limiting the fuzzy matching threshold brings the false alarm probability down to an acceptable level. Moreover, given the particular nature of data security, false alarms are acceptable whereas missed alarms are not.
For example, suppose the sensitive-word lexicon contains the word "模糊" ("fuzzy"). If the user mistypes it as "摸糊", then because "摸" and "模" are very close in both pronunciation and glyph, their edit distance is less than the set threshold, and "摸" is corrected to "模".
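The combination of KMP with a fuzzy comparison can be sketched as follows: a standard KMP search in which every character equality test goes through a pluggable predicate. The function names, the CONFUSABLE set, and the choice of searching for the sensitive word inside the user's input are assumptions made for illustration. One caveat worth stating: the KMP failure table assumes the comparison behaves like true equality, so with a fuzzy predicate that is not transitive, some matches may be skipped.

```python
def build_failure(pattern, same):
    """KMP failure table, computed with the supplied comparison predicate."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and not same(pattern[i], pattern[k]):
            k = fail[k - 1]
        if same(pattern[i], pattern[k]):
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern, same=lambda a, b: a == b):
    """Return the index of the first match of pattern in text, or -1."""
    if not pattern:
        return 0
    fail = build_failure(pattern, same)
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and not same(ch, pattern[k]):
            k = fail[k - 1]
        if same(ch, pattern[k]):
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


# Hypothetical stand-in for the fuzzy matching function: a fixed set of
# confusable character pairs instead of a real edit-distance computation.
CONFUSABLE = {frozenset(("摸", "模"))}


def fuzzy_same(a, b):
    return a == b or frozenset((a, b)) in CONFUSABLE
```

With exact comparison the mistyped word is not found, while passing fuzzy_same as the predicate finds it at the position of the wrong character.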
S3: search the database by keyword and obtain the corresponding sensitivity value. First, determine whether the keyword is one stored in the database; if so, go to the next step, and if not, return a no-warning message. Second, for data confirmed to be a keyword, retrieve the corresponding sensitivity value, which is stored in the database; if the value is greater than the threshold, the input is judged dangerous and a warning is returned to the terminal, and if it is not greater than the threshold, a no-warning message is returned;
S4: the user terminal performs the corresponding operation according to the sensitivity value fed back.
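The lookup and threshold decision in step S3 can be illustrated with a small sketch. An in-memory dictionary stands in for the database, and the keywords, sensitivity values, threshold, and return format are all hypothetical.

```python
# Hypothetical keyword table standing in for the sensitivity database.
SENSITIVITY_DB = {"cargo": 0.8, "weather": 0.1}
THRESHOLD = 0.5  # assumed warning threshold


def check_keyword(keyword, db=SENSITIVITY_DB, threshold=THRESHOLD):
    """Return the server's response for one keyword.

    A keyword that is not stored yields a no-warning response; a stored
    keyword warns only when its sensitivity is strictly above the threshold.
    """
    value = db.get(keyword)
    if value is None:
        return {"warn": False, "sensitivity": None}
    return {"warn": value > threshold, "sensitivity": value}
```

The terminal in step S4 would then show a prompt whenever the "warn" field is true.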
The invention also includes a step of initializing the database and a step of updating the database;
The step of initializing the database is: the initial keywords and their corresponding sensitivity values are obtained from the user's settings and stored in the database;
The step of updating the database: the information in the database does not stay fixed but changes with the user's usage habits and input content. Given a keyword's use count in the previous statistics period and its current value in the database, the sensitivity calculation program yields the corrected value. The correction method, shown in Fig. 3, relies on the best-fit line obtained initially, that is, the straight line given by the formula y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1], in which ws[0], ws[1], and ws[2] are the regression coefficients produced by the training program and k is the offset, with an initial value of 1; if the error is found to be too large during training, it can be reduced by modifying the offset. Feeding the two values above into the program as x[0] and x[1] yields the sensitivity value: x[0] is the first input parameter, the keyword's current sensitivity value in the database for the previous statistics period, and x[1] is the second input parameter, the number of times the keyword was used in the previous statistics period. The database is updated and corrected as follows: first, the value of the sensitivity field is read from the database. The method used is a Java connection to a MySQL database through the library mysql-connector-java-5.1.43.jar, and the sensitivity value is obtained with a SELECT statement. This value is then fed into the sensitivity calculation formula together with the keyword's use count in the previous statistics period, with the sensitivity value as x[0] and the use count as x[1], to obtain the corrected sensitivity value. Finally, the Java program writes the value back to the database with an UPDATE statement.
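The read, recompute, and write-back cycle described above can be sketched in a few lines. The patent uses Java with MySQL; this sketch swaps in Python's built-in sqlite3 module purely so that it is self-contained. The table name, column names, and coefficient values are hypothetical.

```python
import sqlite3


def predict_sensitivity(ws, x0, x1, k=1.0):
    """y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1] from the text."""
    return ws[0] * k + ws[1] * x0 + ws[2] * x1


def update_keyword(conn, keyword, use_count, ws, k=1.0):
    """Read the stored sensitivity, recompute it, and write it back.

    Returns the new value, or None when the keyword is not stored.
    """
    row = conn.execute(
        "SELECT sensitivity FROM keywords WHERE word = ?", (keyword,)
    ).fetchone()
    if row is None:
        return None
    new_value = predict_sensitivity(ws, row[0], use_count, k)
    conn.execute(
        "UPDATE keywords SET sensitivity = ? WHERE word = ?", (new_value, keyword)
    )
    conn.commit()
    return new_value


# Hypothetical schema and data, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (word TEXT PRIMARY KEY, sensitivity REAL)")
conn.execute("INSERT INTO keywords VALUES (?, ?)", ("cargo", 0.5))

WS = (0.0, 1.0, 0.125)  # assumed regression coefficients, not from the patent
```

Calling update_keyword(conn, "cargo", 2, WS) at the end of a statistics period would replace the stored value with the prediction for a use count of 2.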
System environment: Ubuntu 16.04
A simplified code walkthrough of the invention follows:
Module one, mainly the Android-side software. Based on cell phone keyboard input, the ASCII value of the typed key is obtained and used to decide whether the key is a digit. If the key is a digit, check whether the string buffer holds a string. If it does, the string is converted to an array of candidate Chinese characters by table lookup, the selected character is output on the phone, and the buffer string and the typed value are cleared; if the typed digit is greater than the array length, the routine exits. If the key is a digit and the buffer is empty, it is treated as digit input. If the key is not a digit, the typed value is output.
Code := Click()    // cell phone keyboard input; get the ASCII value of the typed key
if Code >= 48 and Code < 58    // the typed key is a digit
    if String.length() > 0    // the string buffer holds a string, so look up the candidate array
        if Code - 49 > list() - 1    // the typed digit exceeds the candidate array length: exit
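The key-handling logic of module one might look like the following sketch. The candidate table and the handle_key function are hypothetical, and one point is an explicit assumption because the source does not spell it out: non-digit keys are treated here as pinyin letters that accumulate in the buffer.

```python
# Hypothetical candidate table: pinyin string -> candidate characters.
CANDIDATES = {"huo": ["货", "活", "火"], "wu": ["物", "无"]}


def handle_key(code, buffer, output):
    """Process one key code and return the updated pinyin buffer.

    `output` is a list that receives the characters shown on the phone.
    """
    if 48 <= code < 58:                      # ASCII '0'..'9'
        if buffer:                           # pending pinyin: the digit selects a candidate
            candidates = CANDIDATES.get(buffer, [])
            index = code - 49                # key '1' selects candidates[0]
            if index < 0 or index > len(candidates) - 1:
                return buffer                # out of range: leave the state unchanged
            output.append(candidates[index])
            return ""                        # clear the buffer after a selection
        output.append(chr(code))             # empty buffer: treat as a literal digit
        return buffer
    # Assumption: non-digit keys accumulate as pinyin letters in the buffer.
    return buffer + chr(code)
```

Feeding the keys "h", "u", "o", "1" in sequence would select the first candidate for "huo" and leave the buffer empty.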
Module two: the background runs the matching algorithm
The algorithm is described as follows:
a) Obtain the user's input as the pattern string and run the KMP preprocessing on it, using the fuzzy matching algorithm to decide whether two characters are the same.
b) Take a sensitive word from the sensitive-word lexicon as the main string.
c) Run the KMP algorithm on the main string and the pattern string. If the match succeeds, the input contains a sensitive word: raise an alarm and end the algorithm. If the match fails, go to d).
d) Determine whether the current word is the last one in the sensitive-word lexicon. If so, the input contains no sensitive word and the algorithm ends; if not, go to b).
Module three: sensitivity judgment and data modification;
The function below first reads x and y. x, which can also be written as x[0], x[1], holds the sensitivity value from the previous statistics period together with the use count from the previous statistics period; the two are combined because they are read in together and placed in the X matrix. y is the preset correct sensitivity value. They are stored in the matrices xArr and yArr, where xArr is an array of n rows and 2 columns: n is the number of rows in the data set, and the 2 columns are the sensitivity value and the use count. (For ws to hold three coefficients as in the formula, each row presumably also carries the offset k as a leading column.) The function computes X^T X and checks whether its determinant is zero, since a zero determinant would cause an error when computing the inverse. Finally it returns ws, the regression coefficients in the sensitivity calculation formula y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1].
from numpy import asmatrix as mat, linalg

def standRegress(xArr, yArr):
    xMat = mat(xArr)
    yMat = mat(yArr).T
    xTx = xMat.T * xMat
    # linalg.det returns the determinant directly; a zero value means xTx cannot be inverted
    if linalg.det(xTx) == 0.0:
        print("This matrix is singular, cannot do inverse")
        return None
    # xTx.I is the inverse matrix; this line computes the regression coefficients
    ws = xTx.I * (xMat.T * yMat)
    return ws
The sensitivity update steps are:
First comes the training process. A data set is input in which each record has three parts: the keyword's sensitivity, its use count, and the sensitivity obtained in advance (the target value). The calculation program then fits the best-fit line, that is, the straight line given by the formula y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1], in which ws[0], ws[1], and ws[2] are the regression coefficients produced by the training program and k is the offset, with an initial value of 1; if the error is found to be too large during training, it can be reduced by modifying the offset.
Test data is then input to assess the error. If the error is too large, the data set and parameter sizes are readjusted until the error falls within an acceptable range.
The sensitivity is then updated using the calculation program above, with the formula
y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1]: when a statistics period ends, the original sensitivity is input as x[0] and the use count as x[1], with ws and k as known quantities (ws is the regression coefficient vector and k is the offset), and the program returns the new sensitivity value.
The value is written back to the database, updating the sensitivity.
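The train, test, and update cycle above can be sketched without external libraries. This version solves the same normal equations, (X^T X) ws = X^T y, by Gauss-Jordan elimination instead of an explicit matrix inverse. The function names and the training data are hypothetical; the offset k is carried as the leading column of each row, which is an assumption about how the three coefficients arise.

```python
def fit_normal_equation(rows, targets):
    """Solve (X^T X) ws = X^T y by Gauss-Jordan elimination with pivoting."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * t for r, t in zip(rows, targets))]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; cannot solve")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(ws, x0, x1, k=1.0):
    """y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1]."""
    return ws[0] * k + ws[1] * x0 + ws[2] * x1


def mean_abs_error(ws, samples, k=1.0):
    """Average absolute prediction error over (x0, x1, y) samples."""
    return sum(abs(predict(ws, x0, x1, k) - y) for x0, x1, y in samples) / len(samples)


# Hypothetical training data: (old sensitivity, use count, target sensitivity).
TRAIN = [(0.2, 1, 0.25), (0.5, 4, 0.70), (0.8, 2, 0.90), (0.4, 0, 0.40)]
K = 1.0
ws = fit_normal_equation([[K, x0, x1] for x0, x1, _ in TRAIN],
                         [y for _, _, y in TRAIN])
```

If mean_abs_error on held-out test samples is too large, the data set or the offset would be adjusted and the fit repeated, as the steps above describe.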
The specific implementation of the invention is further described below with reference to the use case diagrams.
Fig. 5 shows the login interface of the software; the user logs in to the input method by entering a user ID and password. The interface shows the user's name and the preset sensitive words.
Fig. 7 shows the input security settings of the software; entering a wrong password triggers a prompt.
In Fig. 6, the lower part is the input method's usage interface. Taking input in a notepad as an example, when "how much is the price of the cargo" is entered, the sensitive word "cargo" is detected and a reminder pops up.
In the method for real-time monitoring of privacy exposure in data publishing proposed by the invention, the content entered by the user is recorded, compared character by character, and assessed for sensitivity. In the first stage, the user's input is obtained through the keyboard interaction interface and matched to extract the required keywords. In the second stage, the database is searched by keyword to obtain the corresponding sensitivity level, which is fed back to the terminal. In the third stage, the user terminal receives the result returned by the server and performs the corresponding operation, thereby protecting the user's private information.
According to Fig. 1, the participants are the user, the terminal device, the matching program, the Internet, and the server. The user entering information on the terminal starts the whole procedure, which runs as follows:
Step 1: the user inputs information on the terminal device, which collects the text until the user taps the send button.
Step 2: the matching program receives the data from the terminal device, obtains the corresponding set of words using the matching algorithm, and sends it over the Internet to the server.
Step 3: the data reaches the server over the Internet. The server receives the set of words and runs the discrimination algorithm, which checks whether each word is in the keyword lexicon and whether its sensitivity exceeds the threshold. If both conditions hold, a warning message is returned; otherwise a no-warning message is returned.
Step 4: the server's response is returned over the Internet to the terminal device, which checks it. If the response is a warning, a prompt is shown to the user; if not, the flow returns to Step 1.

Claims (5)

1. A real-time privacy exposure monitoring method for data publishing, characterized by comprising:
A step of obtaining the user's input content;
A step of matching the user's input content to obtain the required keywords: the input content is split into individual words according to Chinese grammatical features, and the corresponding keywords are obtained by combining fuzzy matching with the KMP algorithm;
A step of searching the database by keyword and obtaining the corresponding sensitivity value: first determine whether the keyword is one stored in the database. If it is not, a no-warning message is returned. If it is, the corresponding sensitivity value, which is stored in the database, is retrieved for that keyword; if the sensitivity value is greater than a preset threshold, the input is judged dangerous and a warning is returned to the terminal, and if it is not greater than the threshold, a no-warning message is returned;
A step in which the user terminal performs the corresponding operation according to the sensitivity value fed back.
2. The real-time privacy exposure monitoring method for data publishing according to claim 1, characterized in that:
The fuzzy matching algorithm is: the edit distance between two Chinese characters is compared with a preset threshold, and if the edit distance is less than the threshold the two characters are considered to match. The edit distance is the sum of a pronunciation edit distance and a glyph edit distance. The pinyin of a character is divided into initial, final, and tone, each of which is assigned a weight; the weighted initial, final, and tone differences are summed to give the pronunciation edit distance. The glyph is described by four-corner code data: each of the four corners of the code is assigned a weight, the four-corner codes of the two characters are compared, and the weighted differences are summed to give the glyph edit distance;
In the KMP algorithm, the statement that tests whether two characters are equal is replaced by the fuzzy matching function distance_compare(char c_a, char c_b), which combines fuzzy matching with the KMP algorithm.
3. The real-time privacy exposure monitoring method for data publishing according to claim 2, characterized in that: the weights assigned to the initial, final, and tone differ from one another, and the four corners of the four-corner code are assigned the same weight.
4. The real-time privacy exposure monitoring method for data publishing according to claim 2, characterized in that: errors in the user's input are corrected by adjusting the threshold of the fuzzy matching algorithm.
5. The real-time privacy exposure monitoring method for data publishing according to claim 1, characterized by further comprising a step of initializing the database and a step of updating the database;
The step of initializing the database is: the initial keywords and their corresponding sensitivity values are obtained from the user's settings and stored in the database;
The step of updating the database uses the sensitivity calculation formula:
y = ws[0]*k + ws[1]*x[0] + ws[2]*x[1]
where y is the predicted sensitivity value; ws[0], ws[1], and ws[2] are the three values of the regression coefficient matrix, a regression coefficient being the weight that each input value carries in the prediction, as learned from the training data set; k is the offset; x[0] is the first input parameter, the keyword's current sensitivity value in the database for the previous statistics period; and x[1] is the second input parameter, the number of times the keyword was used in the previous statistics period. Given a keyword's use count in the previous statistics period and its current sensitivity value in the database, the sensitivity calculation program yields the corrected sensitivity value.
CN201811118685.4A 2018-09-26 2018-09-26 A real-time privacy exposure monitoring method for data publishing Pending CN109308295A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811118685.4A CN109308295A (en) 2018-09-26 2018-09-26 A real-time privacy exposure monitoring method for data publishing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811118685.4A CN109308295A (en) 2018-09-26 2018-09-26 A real-time privacy exposure monitoring method for data publishing

Publications (1)

Publication Number Publication Date
CN109308295A true CN109308295A (en) 2019-02-05

Family

ID=65224055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811118685.4A Pending CN109308295A (en) 2018-09-26 2018-09-26 A kind of privacy exposure method of real-time of data-oriented publication

Country Status (1)

Country Link
CN (1) CN109308295A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102184188A (en) * 2011-04-15 2011-09-14 百度在线网络技术(北京)有限公司 Method and equipment for determining sensitivity of target text
CN106250364A (en) * 2016-07-20 2016-12-21 科大讯飞股份有限公司 A kind of text modification method and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119428A (en) * 2019-04-19 2019-08-13 腾讯科技(深圳)有限公司 A kind of block chain information management method, device, equipment and storage medium
CN111597310A (en) * 2020-05-26 2020-08-28 成都卫士通信息产业股份有限公司 Sensitive content detection method, device, equipment and medium
CN111597310B (en) * 2020-05-26 2023-10-20 成都卫士通信息产业股份有限公司 Sensitive content detection method, device, equipment and medium

Similar Documents

Publication Publication Date Title
US11475143B2 (en) Sensitive data classification
Xie et al. Sql injection detection for web applications based on elastic-pooling cnn
CN114610515B (en) Multi-feature log anomaly detection method and system based on log full semantics
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN110968699B (en) Logic map construction and early warning method and device based on fact recommendation
Le et al. Text classification: Naïve bayes classifier with sentiment Lexicon
CN105740228A (en) Internet public opinion analysis method
CN111444320A (en) Text retrieval method and device, computer equipment and storage medium
CN106611375A (en) Text analysis-based credit risk assessment method and apparatus
CN107368542B (en) Method for evaluating security-related grade of security-related data
US10410139B2 (en) Named entity recognition and entity linking joint training
US20220230050A1 (en) Fact validation method and system, computer device and storage medium
Molino et al. Cota: Improving the speed and accuracy of customer support through ranking and deep networks
CN107579821B (en) Method for generating password dictionary and computer-readable storage medium
CN113742733B (en) Method and device for extracting trigger words of reading and understanding vulnerability event and identifying vulnerability type
CN112016313B (en) Spoken language element recognition method and device and warning analysis system
CN110197389A (en) A kind of user identification method and device
Rattá et al. Viability assessment of a cross-tokamak AUG-JET disruption predictor
CN115473726A (en) Method and device for identifying domain name
CN109308295A (en) A kind of privacy exposure method of real-time of data-oriented publication
Ou et al. Scs-gan: Learning functionality-agnostic stylometric representations for source code authorship verification
US20230075290A1 (en) Method for linking a cve with at least one synthetic cpe
CN114022233A (en) Novel commodity recommendation method
Vink et al. Mapping crime descriptions to law articles using deep learning
Kang et al. A transfer learning algorithm for automatic requirement model generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190205