KR20030024309A - Slang Remover Program On Web board - Google Patents
Slang Remover Program On Web board
- Publication number
- KR20030024309A
- Authority
- KR
- South Korea
- Prior art keywords
- slang
- text
- bulletin
- token
- list
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Tourism & Hospitality (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Economics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Primary Health Care (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Computer And Data Communications (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Description
The present invention relates to a program for removing vulgar and profane words, which addresses the difficulty of managing such language on the Internet.
Earlier Internet bulletin boards handled vulgar and profane words by blocking slang at the moment the user entered it. Because this left users' desire for expression unmet, it drove them to invent new profanities.
Moreover, as the Internet has grown, programs meant to defend against some users' strong profanity have led users to coin ever bolder swear words. Attempts to block profanity on bulletin boards also sometimes block words used in ordinary language, which restricts users' freedom of expression.
Rather than blocking input as existing methods do, the present invention preserves the poster's freedom of expression while, acting as a management module of the bulletin board, showing cleaned-up language to other users. It therefore achieves a twofold effect.
Several functions were also added to help the webmaster manage the board efficiently.
What distinguishes it from existing programs is that nothing is blocked at input time: after a user posts profanity, when another user clicks on that post, the profane words are replaced or deleted and only proper expressions are shown.
The present invention is a program built to preserve freedom of expression on the Internet while allowing the intended meaning to be conveyed properly to other users.
In addition, replacement, deletion, and upload functions were added so that the web administrator can manage the bulletin board efficiently, along with monitoring and operating-time configuration functions for smooth administration.
* Program name: Web Cleaner
FIG. 1 is a diagram of the overall program configuration.
FIG. 2 shows the slang processing module of the program.
Webcleaner was developed following the MVC (Model-View-Controller) design pattern, and it accesses the database through JDBC (Java Database Connectivity), which is independent of the database type. Because Webcleaner is written in Java and uses JDBC, it runs on any platform that provides a Java runtime environment and a JSP/servlet engine, and it can connect to any RDBMS (relational database management system) for which a JDBC driver exists.
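As an illustration of the database-independent access described above, the following is a minimal Java sketch, not the program's actual code. The table name `board_post`, its columns, and the class and method names are hypothetical; the only database-specific input is the JDBC URL, which selects whichever driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.LinkedHashMap;
import java.util.Map;

class BoardDaoSketch {
    // Hypothetical schema: board_post(id, body, created_at).
    static final String SELECT_NEW =
        "SELECT id, body FROM board_post WHERE created_at > ? ORDER BY created_at";
    static final String UPDATE_BODY =
        "UPDATE board_post SET body = ? WHERE id = ?";

    /** Reads posts entered after 'since'; works with any RDBMS that has a JDBC driver. */
    static Map<Long, String> readNewPosts(String jdbcUrl, String user, String password,
                                          Timestamp since) throws SQLException {
        Map<Long, String> posts = new LinkedHashMap<>();
        try (Connection con = DriverManager.getConnection(jdbcUrl, user, password);
             PreparedStatement ps = con.prepareStatement(SELECT_NEW)) {
            ps.setTimestamp(1, since);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    posts.put(rs.getLong("id"), rs.getString("body"));
                }
            }
        }
        return posts;
    }

    /** Overwrites the body of one post; returns the number of rows changed. */
    static int updateBody(String jdbcUrl, String user, String password,
                          long id, String newBody) throws SQLException {
        try (Connection con = DriverManager.getConnection(jdbcUrl, user, password);
             PreparedStatement ps = con.prepareStatement(UPDATE_BODY)) {
            ps.setString(1, newBody);
            ps.setLong(2, id);
            return ps.executeUpdate();
        }
    }
}
```

Switching to a different RDBMS in this sketch means changing only the JDBC URL and adding that vendor's driver; the SQL and the Java code stay the same as long as the schema matches.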
FIG. 1 is the system configuration diagram. In the Web server and Web application server portion of the figure, a servlet receives the HTTP request and calls the business logic in the Java runtime environment, acting as the Controller. The business logic passes the result of communicating with the back-end system to JavaBeans, which serve as the Model. A JSP then acts as the View, rendering the contents of the JavaBeans and sending the response. The back-end system contains the bulletin board DB, the slang list file, a file that stores results, and a bulletin board information file that holds the list of bulletin boards whose slang is to be processed. When a request arrives to start slang processing for a particular bulletin board, the business logic starts the cleaner thread for that board. The cleaner threads run as multiple threads in the Java virtual machine and share the slang list held in main memory, which is efficient and allows the boards to be processed concurrently.
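The per-board cleaner threads sharing one in-memory slang list can be sketched as follows. This is only an illustration under assumptions: the class and method names are hypothetical, posts are passed in as plain strings instead of being read from the board DB, and each thread merely counts matches rather than rewriting posts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class CleanerThreadsSketch {
    /**
     * Starts one cleaner thread per bulletin board. Every thread reads the same
     * slang set, so the list is loaded into memory only once. The set must not
     * be modified while the threads are running.
     */
    static Map<String, Integer> runCleaners(Set<String> sharedSlang,
                                            Map<String, List<String>> postsByBoard) {
        Map<String, Integer> hitsPerBoard = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        for (Map.Entry<String, List<String>> board : postsByBoard.entrySet()) {
            Thread cleaner = new Thread(() -> {
                int hits = 0;
                for (String post : board.getValue()) {
                    for (String word : post.split("\\s+")) {
                        if (sharedSlang.contains(word)) {
                            hits++;
                        }
                    }
                }
                hitsPerBoard.put(board.getKey(), hits);
            }, "cleaner-" + board.getKey());
            threads.add(cleaner);
            cleaner.start();
        }
        for (Thread cleaner : threads) {
            try {
                cleaner.join(); // wait until every board has been processed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for cleaners", e);
            }
        }
        return hitsPerBoard;
    }
}
```

Passing an unmodifiable set (for example one wrapped with `Collections.unmodifiableSet`) makes the read-only sharing explicit, so no locking is needed on the slang list itself.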
FIG. 2 shows the process of obtaining body texts from the bulletin board RDBMS, extracting slang, and producing the replaced text. The steps, in order, are as follows.
(1) A select query is issued to the bulletin board RDBMS through the JDBC interface to read the list of posts entered after the input date of the most recently processed post.
(2) The body text of each post is passed in turn to the tokenizer. The tokenizer scans the body text once, splits it into words (tokens) according to a defined set of special characters, and records each token's index in the body text.
(3) The slang extraction module asks the tokenizer for the tokens one at a time and uses the first phoneme of each token's first syllable to determine the key into the slang list held in main memory.
(4) Within the sublist selected by that key, binary search is used to check whether there is an entry whose pattern matches the token.
(5) After the body text has been searched to the end in this way, if any token matched a pattern in the slang list, the tokenizer is asked for the text with the slang portions replaced, which yields the replacement text. The tokenizer can build the replacement text because it recorded each token's index in the body text during tokenization.
(6) An SQL update statement is executed against the bulletin board RDBMS through the JDBC interface to replace the body text of each post in which slang was found with the replacement text.
(7) Information about the replaced posts is saved to the result file, organized by the date of replacement.
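Steps (2) through (5) can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the delimiter set, the class and method names, and the replacement string are hypothetical; matching is simplified to exact string equality where the description speaks of pattern matching; and the key for Hangul tokens is taken from the standard Unicode layout of precomposed syllables, in which the initial consonant index is (code point - 0xAC00) / 588.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class SlangPipelineSketch {
    /** A token together with its position in the body text (step 2). */
    static final class Token {
        final String text; final int start; final int end;
        Token(String text, int start, int end) { this.text = text; this.start = start; this.end = end; }
    }

    // Assumed delimiter set; the description only says "a defined set of special characters".
    static final String DELIMS = " \t\r\n.,!?;:()[]\"'";

    /** Single left-to-right scan that splits on DELIMS and records each token's indices. */
    static List<Token> tokenize(String body) {
        List<Token> tokens = new ArrayList<>();
        int start = -1;
        for (int i = 0; i <= body.length(); i++) {
            boolean delim = i == body.length() || DELIMS.indexOf(body.charAt(i)) >= 0;
            if (!delim && start < 0) start = i;
            if (delim && start >= 0) {
                tokens.add(new Token(body.substring(start, i), start, i));
                start = -1;
            }
        }
        return tokens;
    }

    /** Step 3: key from the first phoneme. Hangul gives 0..18; other text gives 1000 + char. */
    static int keyOf(String word) {
        char c = word.charAt(0); // words are assumed non-empty
        if (c >= 0xAC00 && c <= 0xD7A3) return (c - 0xAC00) / 588; // 588 = 21 vowels * 28 finals
        return 1000 + Character.toLowerCase(c);
    }

    /** Groups the slang list by key and keeps each group sorted for binary search. */
    static Map<Integer, String[]> buildIndex(Collection<String> slangWords) {
        Map<Integer, TreeSet<String>> groups = new HashMap<>();
        for (String w : slangWords) {
            groups.computeIfAbsent(keyOf(w), k -> new TreeSet<>()).add(w);
        }
        Map<Integer, String[]> index = new HashMap<>();
        for (Map.Entry<Integer, TreeSet<String>> e : groups.entrySet()) {
            index.put(e.getKey(), e.getValue().toArray(new String[0]));
        }
        return index;
    }

    /** Step 4: binary search inside the sublist chosen by the token's key. */
    static boolean isSlang(Map<Integer, String[]> index, String token) {
        String[] sublist = index.get(keyOf(token));
        return sublist != null && Arrays.binarySearch(sublist, token) >= 0;
    }

    /** Step 5: rebuild the text from the recorded indices, swapping in the replacement. */
    static String clean(Map<Integer, String[]> index, String body, String replacement) {
        StringBuilder out = new StringBuilder();
        int copied = 0;
        for (Token t : tokenize(body)) {
            if (isSlang(index, t.text)) {
                out.append(body, copied, t.start).append(replacement);
                copied = t.end;
            }
        }
        return out.append(body, copied, body.length()).toString();
    }
}
```

Because the token indices are kept, the delimiters and all non-matching text are copied through unchanged; only the matched spans are rewritten. Grouping by first phoneme keeps each sorted sublist short, so the binary search in step (4) touches only a small part of the slang list.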
The overall configuration of the present invention can be divided into a vulgar-word analysis module, an installation module, a system configuration module, and a management module.
As described above, by taking into account both the web administrator's convenience and the psychology of users, and by removing rather than blocking, the vulgar and profane word remover can be a practical program for upholding netiquette on the Internet.
The results of the present invention are summarized as follows.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020010057389A KR20030024309A (en) | 2001-09-17 | 2001-09-17 | Slang Remover Program On Web board |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020010057389A KR20030024309A (en) | 2001-09-17 | 2001-09-17 | Slang Remover Program On Web board |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20030024309A (en) | 2003-03-26 |
Family
ID=27724397
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020010057389A KR20030024309A (en) | 2001-09-17 | 2001-09-17 | Slang Remover Program On Web board |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20030024309A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2005036412A1 (en) * | 2003-10-16 | 2005-04-21 | Nhn Corporation | A method of managing bulletin on internet and a system thereof |
KR100817848B1 (en) * | 2006-06-26 | 2008-03-31 | (주)트리니티소프트 | Method for network-based data inspection and apparatus thereof |
WO2010090382A1 (en) * | 2009-02-03 | 2010-08-12 | Jang Sung-Hee | Online protection system and protection method |
CN104252463A (en) * | 2013-06-26 | 2014-12-31 | 中国银联股份有限公司 | Db2 database management method based on web system |
- 2001-09-17: KR application KR1020010057389A (KR20030024309A), not active: Application Discontinuation
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2005036412A1 (en) * | 2003-10-16 | 2005-04-21 | Nhn Corporation | A method of managing bulletin on internet and a system thereof |
KR100817848B1 (en) * | 2006-06-26 | 2008-03-31 | (주)트리니티소프트 | Method for network-based data inspection and apparatus thereof |
WO2010090382A1 (en) * | 2009-02-03 | 2010-08-12 | Jang Sung-Hee | Online protection system and protection method |
CN104252463A (en) * | 2013-06-26 | 2014-12-31 | 中国银联股份有限公司 | Db2 database management method based on web system |
CN104252463B (en) * | 2013-06-26 | 2018-09-04 | 中国银联股份有限公司 | A kind of db2 data base management methods based on web system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5890103A (en) | Method and apparatus for improved tokenization of natural language text | |
US6782505B1 (en) | Method and system for generating structured data from semi-structured data sources | |
US6975983B1 (en) | Natural language input method and apparatus | |
US20020099536A1 (en) | System and methods for improved linguistic pattern matching | |
JP2014041615A (en) | Method and system with high performance data meta tag using coprocessor and with data index | |
WO1997004405A9 (en) | Method and apparatus for automated search and retrieval processing | |
US20040088651A1 (en) | Method and system for multiple level parsing | |
US7398210B2 (en) | System and method for performing analysis on word variants | |
Cahill et al. | Wide-coverage deep statistical parsing using automatic dependency structure annotation | |
CN112597307A (en) | Extraction method, device and equipment of figure action related data and storage medium | |
CN111984774A (en) | Search method, device, equipment and storage medium | |
KR20060043583A (en) | Compression of logs of language data | |
KR20030024309A (en) | Slang Remover Program On Web board | |
CN113032371A (en) | Database grammar analysis method and device and computer equipment | |
CN109800430B (en) | Semantic understanding method and system | |
US20040024741A1 (en) | Database processing method | |
CN115098365A (en) | SQL code debugging method and device, electronic equipment and readable storage medium | |
CN112069198B (en) | SQL analysis optimization method and device | |
JP2006004283A (en) | Method and system for extracting/narrowing keyword from text information source | |
JP2000194559A5 (en) | ||
JP5412137B2 (en) | Machine learning apparatus and method | |
KR100347055B1 (en) | Korean morpheme analyzing method | |
JP2830097B2 (en) | Sentence search method | |
CN117454378A (en) | Static detection method and device for storage type XSS loopholes in modern Web application | |
JPH0773200A (en) | Key word extracting method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E601 | Decision to refuse application | ||
E601 | Decision to refuse application |