CN112417887A - Sensitive word and sentence recognition model processing method and related device - Google Patents
- Publication number
- CN112417887A (application CN202011314105.6A)
- Authority
- CN
- China
- Prior art keywords
- initial
- sensitive word
- sentence
- data source
- recognition model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Concepts
ID | Name | Type | Query match | Sections | Count |
---|---|---|---|---|---|
238000003672 | processing method | Methods | 0.000 | title, claims, abstract, description | 20 |
238000012549 | training | Methods | 0.000 | claims, abstract, description | 95 |
238000012545 | processing | Methods | 0.000 | claims, abstract, description | 21 |
238000000034 | method | Methods | 0.000 | claims, abstract, description | 19 |
238000012360 | testing method | Methods | 0.000 | claims, description | 56 |
238000002372 | labelling | Methods | 0.000 | claims, description | 20 |
230000015654 | memory | Effects | 0.000 | claims, description | 18 |
238000005516 | engineering process | Methods | 0.000 | abstract, description | 4 |
238000013473 | artificial intelligence | Methods | 0.000 | abstract, description | 2 |
230000014509 | gene expression | Effects | 0.000 | description | 10 |
238000010586 | diagram | Methods | 0.000 | description | 6 |
238000004364 | calculation method | Methods | 0.000 | description | 5 |
238000004891 | communication | Methods | 0.000 | description | 4 |
238000001914 | filtration | Methods | 0.000 | description | 4 |
230000008569 | process | Effects | 0.000 | description | 4 |
238000009825 | accumulation | Methods | 0.000 | description | 3 |
238000012937 | correction | Methods | 0.000 | description | 3 |
239000000284 | extract | Substances | 0.000 | description | 3 |
230000003287 | optical effect | Effects | 0.000 | description | 3 |
238000013528 | artificial neural network | Methods | 0.000 | description | 2 |
230000005540 | biological transmission | Effects | 0.000 | description | 2 |
238000004422 | calculation algorithm | Methods | 0.000 | description | 2 |
230000006835 | compression | Effects | 0.000 | description | 2 |
238000007906 | compression | Methods | 0.000 | description | 2 |
238000004590 | computer program | Methods | 0.000 | description | 2 |
238000011161 | development | Methods | 0.000 | description | 2 |
238000012423 | maintenance | Methods | 0.000 | description | 2 |
230000009286 | beneficial effect | Effects | 0.000 | description | 1 |
230000008859 | change | Effects | 0.000 | description | 1 |
238000013500 | data storage | Methods | 0.000 | description | 1 |
230000000694 | effects | Effects | 0.000 | description | 1 |
239000000835 | fiber | Substances | 0.000 | description | 1 |
230000010365 | information processing | Effects | 0.000 | description | 1 |
230000003993 | interaction | Effects | 0.000 | description | 1 |
230000002452 | interceptive effect | Effects | 0.000 | description | 1 |
230000000670 | limiting effect | Effects | 0.000 | description | 1 |
238000007726 | management method | Methods | 0.000 | description | 1 |
230000007246 | mechanism | Effects | 0.000 | description | 1 |
230000002175 | menstrual effect | Effects | 0.000 | description | 1 |
238000012986 | modification | Methods | 0.000 | description | 1 |
230000004048 | modification | Effects | 0.000 | description | 1 |
238000012544 | monitoring process | Methods | 0.000 | description | 1 |
238000003062 | neural network model | Methods | 0.000 | description | 1 |
230000008520 | organization | Effects | 0.000 | description | 1 |
239000000047 | product | Substances | 0.000 | description | 1 |
230000002829 | reductive effect | Effects | 0.000 | description | 1 |
238000012827 | research and development | Methods | 0.000 | description | 1 |
230000000717 | retained effect | Effects | 0.000 | description | 1 |
230000006403 | short-term memory | Effects | 0.000 | description | 1 |
230000003068 | static effect | Effects | 0.000 | description | 1 |
239000013589 | supplement | Substances | 0.000 | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Machine Translation (AREA)
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011314105.6A CN112417887B (zh) | 2020-11-20 | 2020-11-20 | Sensitive word and sentence recognition model processing method and related device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011314105.6A CN112417887B (zh) | 2020-11-20 | 2020-11-20 | Sensitive word and sentence recognition model processing method and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112417887A (zh) | 2021-02-26 |
CN112417887B (zh) | 2023-12-05 |
Family
ID=74777813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011314105.6A (Active), CN112417887B (zh) | Sensitive word and sentence recognition model processing method and related device | 2020-11-20 | 2020-11-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112417887B (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
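CN113011171A (zh) * | 2021-03-05 | 2021-06-22 | 北京市博汇科技股份有限公司 | BERT-based violating text recognition algorithm and device |
CN113642326A (zh) * | 2021-08-16 | 2021-11-12 | 广东鸿数科技有限公司 | Sensitive data recognition model training method, sensitive data recognition method and system |
CN114239591A (zh) * | 2021-12-01 | 2022-03-25 | 马上消费金融股份有限公司 | Sensitive word recognition method and device |
CN117216280A (zh) * | 2023-11-09 | 2023-12-12 | 闪捷信息科技有限公司 | Incremental learning method, recognition method and device for a sensitive data recognition model |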
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090204605A1 (en) * | 2008-02-07 | 2009-08-13 | Nec Laboratories America, Inc. | Semantic Search Via Role Labeling |
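CN108509424A (zh) * | 2018-04-09 | 2018-09-07 | 平安科技(深圳)有限公司 | Regulation information processing method, apparatus, computer device and storage medium |
CN110209818A (zh) * | 2019-06-04 | 2019-09-06 | 南京邮电大学 | Analysis method for semantically sensitive words and sentences |
CN110222170A (zh) * | 2019-04-25 | 2019-09-10 | 平安科技(深圳)有限公司 | Method, apparatus, storage medium and computer device for identifying sensitive data |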
US20200265301A1 (en) * | 2019-02-15 | 2020-08-20 | Microsoft Technology Licensing, Llc | Incremental training of machine learning tools |
- 2020-11-20: CN application CN202011314105.6A, patent CN112417887B (zh), status Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090204605A1 (en) * | 2008-02-07 | 2009-08-13 | Nec Laboratories America, Inc. | Semantic Search Via Role Labeling |
CN108509424A (zh) * | 2018-04-09 | 2018-09-07 | 平安科技(深圳)有限公司 | Regulation information processing method, apparatus, computer device and storage medium |
US20200265301A1 (en) * | 2019-02-15 | 2020-08-20 | Microsoft Technology Licensing, Llc | Incremental training of machine learning tools |
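CN110222170A (zh) * | 2019-04-25 | 2019-09-10 | 平安科技(深圳)有限公司 | Method, apparatus, storage medium and computer device for identifying sensitive data |
CN110209818A (zh) * | 2019-06-04 | 2019-09-06 | 南京邮电大学 | Analysis method for semantically sensitive words and sentences |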
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
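CN113011171A (zh) * | 2021-03-05 | 2021-06-22 | 北京市博汇科技股份有限公司 | BERT-based violating text recognition algorithm and device |
CN113642326A (zh) * | 2021-08-16 | 2021-11-12 | 广东鸿数科技有限公司 | Sensitive data recognition model training method, sensitive data recognition method and system |
CN114239591A (zh) * | 2021-12-01 | 2022-03-25 | 马上消费金融股份有限公司 | Sensitive word recognition method and device |
CN114239591B (zh) * | 2021-12-01 | 2023-08-18 | 马上消费金融股份有限公司 | Sensitive word recognition method and device |
CN117216280A (zh) * | 2023-11-09 | 2023-12-12 | 闪捷信息科技有限公司 | Incremental learning method, recognition method and device for a sensitive data recognition model |
CN117216280B (zh) * | 2023-11-09 | 2024-02-09 | 闪捷信息科技有限公司 | Incremental learning method, recognition method and device for a sensitive data recognition model |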
Also Published As
Publication number | Publication date |
---|---|
CN112417887B (zh) | 2023-12-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
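CN112084337B (zh) | | Training method for a text classification model, text classification method and device |
CN112101041B (zh) | | Entity relation extraction method, apparatus, device and medium based on semantic similarity |
CN112417887B (zh) | | Sensitive word and sentence recognition model processing method and related device |
WO2022174491A1 (zh) | | Artificial-intelligence-based medical record quality control method, apparatus, computer device and storage medium |
CN112507125A (zh) | | Triple information extraction method, apparatus, device and computer-readable storage medium |
CN110929125B (zh) | | Search recall method, apparatus, device and storage medium thereof |
CN112686022A (zh) | | Method, apparatus, computer device and storage medium for detecting violating corpus |
CN112632278A (zh) | | Labeling method, apparatus, device and storage medium based on multi-label classification |
CN111783471B (zh) | | Semantic recognition method, apparatus, device and storage medium for natural language |
CN112215008A (zh) | | Entity recognition method, apparatus, computer device and medium based on semantic understanding |
CN112231569A (zh) | | News recommendation method, apparatus, computer device and storage medium |
CN112287069A (zh) | | Information retrieval method, apparatus and computer device based on speech semantics |
CN113986864A (zh) | | Log data processing method, apparatus, electronic device and storage medium |
CN114357117A (zh) | | Transaction information query method, apparatus, computer device and storage medium |
CN113505601A (zh) | | Positive and negative sample pair construction method, apparatus, computer device and storage medium |
WO2022073341A1 (zh) | | Disease entity matching method, apparatus and computer device based on speech semantics |
CN112686053A (zh) | | Data augmentation method, apparatus, computer device and storage medium |
CN115438149A (zh) | | End-to-end model training method, apparatus, computer device and storage medium |
CN115730597A (zh) | | Multi-level semantic intent recognition method and related device |
CN116796730A (zh) | | Artificial-intelligence-based text error correction method, apparatus, device and storage medium |
CN112528040B (zh) | | Knowledge-graph-based method for detecting inducing and abetting corpus and related device |
CN114090792A (zh) | | Contrastive-learning-based document relation extraction method and related device |
Jiang et al. | | Tapchain: A rule chain recognition model based on multiple features |
CN115730237B (zh) | | Spam email detection method, apparatus, computer device and storage medium |
CN114742058B (zh) | | Named entity extraction method, apparatus, computer device and storage medium |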
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20231101; Address after: 1701, No. 688 Dalian Road, Yangpu District, Shanghai, 200082 (nominal floor is 20 floors); Applicant after: XIAOVO TECHNOLOGY CO.,LTD.; Address before: 4/F, Building 1, No. 1-9, Lane 99, Shenmei Road, Pudong New Area, Shanghai; Applicant before: Shanghai Pinyuan Information Technology Co.,Ltd. |
| | TA01 | Transfer of patent application right | Effective date of registration: 20231101; Address after: 4/F, Building 1, No. 1-9, Lane 99, Shenmei Road, Pudong New Area, Shanghai; Applicant after: Shanghai Pinyuan Information Technology Co.,Ltd.; Address before: Room 202, Block B, Aerospace Micromotor Building, No. 7 Langshan 2nd Road, Xili Street, Nanshan District, Shenzhen, Guangdong Province, 518000; Applicant before: Shenzhen LIAN intellectual property service center |
| | TA01 | Transfer of patent application right | Effective date of registration: 20231101; Address after: Room 202, Block B, Aerospace Micromotor Building, No. 7 Langshan 2nd Road, Xili Street, Nanshan District, Shenzhen, Guangdong Province, 518000; Applicant after: Shenzhen LIAN intellectual property service center; Address before: 518000 Room 201, Building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.); Applicant before: PING AN PUHUI ENTERPRISE MANAGEMENT Co.,Ltd. |
| | GR01 | Patent grant | |