KR101242142B1

KR101242142B1 - Method and system for pre-verification of data

Info

Publication number: KR101242142B1
Application number: KR1020110104586A
Authority: KR
Inventors: 오태영; 한학희; 조용민; 김준수; 허성희; 백승희; 인소영; 김정섭; 송영훈; 정은숙; 정수미
Original assignee: (주) 케이씨넷
Priority date: 2011-10-13
Filing date: 2011-10-13
Publication date: 2013-03-11

Abstract

PURPOSE: A data pre-verification method and system are provided that approve the input of item name data according to the result of verifying the validity of the item name data included in a manifest, so that the manifest can be written accurately. CONSTITUTION: An input unit (10) receives import/export cargo item name data. A data processing unit (20) divides the item name data into words. An attribute extraction unit (30) extracts attribute information corresponding to each word from a reference database (40). A data verification unit determines whether the item name data is identifiable based on the combination and meaning of the words. An approval determination unit decides, according to that determination, whether to approve or restrict input of the item name data. [Reference numerals] (10) Input unit; (20) Data processing unit; (30) Attribute extraction unit; (40) Reference database; (50) Reference database updating unit; (60) Data verification unit

Description

Method and system for pre-verification of data

The present invention relates to a method and system that pre-verify data at input time by checking whether the meaning of the data can be identified, thereby restricting the input of ambiguous data.

In the conventional import/export cargo handling procedure, the master bill of lading (Master B/L) entered and submitted by the carrier (a shipping line or airline) and the house bill of lading (House B/L) entered and submitted by the forwarder (freight intermediary) are automatically consolidated by the manifest consolidation system (MFCS) and transmitted to customs.

Here, the Master B/L and the House B/L are the declaration units of the manifest, and the manifest is the comprehensive list of cargo loaded on every vessel or aircraft entering or leaving the country.

Customs then classifies each item under a single item number using the Harmonized System (HS), the international commodity classification system designated by the World Customs Organization (WCO), based on the contents of the manifest transmitted from the manifest consolidation system, applies the tariff rate, and proceeds with clearance.

Consequently, when the contents of the manifest are unclear, customs must re-check the manifest against the actual import/export goods before classifying them and applying the tariff rate, which wastes considerable time and manpower in customs examination.

As one conventional way of easing this examination burden, when the shipper (the owner of the import/export cargo), the carrier, or the forwarder discovers an entry error in the manifest before it has been transmitted to customs, the manifest is corrected in the manifest consolidation system.

As another measure, when an entry error is found after the manifest has been transmitted to customs through the manifest consolidation system, the manifest is corrected by submitting a manifest correction application and a statement of reasons to customs, obtaining approval, and paying a prescribed fine.

These measures can prevent problems that arise in customs examination from manifest entry errors and can minimize unnecessary waste of manpower and time.

However, when a shipper, carrier, or forwarder deliberately enters an inaccurate item name on the manifest to lower the applicable duty rate or to conceal high-risk cargo, such voluntary correction cannot be expected, so the burden of checking and examining manifests has not decreased significantly.

Meanwhile, with regard to techniques for managing manifest management numbers, Korean Registered Patent Publication No. 955273 proposes an air cargo manifest consolidation system that detects whether a manifest contains errors.

FIG. 1 shows the conventional process of transmitting an import/export cargo list.

According to Korean Registered Patent Publication No. 955273, entry errors are detected while the manifest transmitted to customs through the manifest consolidation system is being examined, and the shipper, carrier, or forwarder that prepared the erroneous manifest is notified of the error and corrects it.

However, under Korean Registered Patent Publication No. 955273, detecting the entry errors still consumes manpower and time during customs examination, and the shipper, carrier, or forwarder notified of the error must bear the cost of correcting the manifest.

Therefore, a technique is needed that verifies the item name data of the import/export cargo included in the Master B/L and House B/L, either at the stage where the carrier or forwarder prepares them or at the manifest generation stage where the manifest consolidation system (MFCS) consolidates them, and that selectively transmits the manifest to customs according to the verification result.

The problem addressed by the embodiments of the present invention is to provide a way to verify the validity of the item name data included in a manifest before it is transmitted to customs and to decide, according to the result, whether to approve the item name data.

To solve the above problem, the present invention proposes, as one embodiment, a data pre-verification system comprising: an input unit that receives import/export cargo item name data; a data processing unit that divides the item name data into words; a reference database that stores part-of-speech information obtained by analyzing the morphemes and words of item name data extracted from import/export declarations and manifest declarations filed over many years, together with meaning information and related information for each extracted part of speech; an attribute extraction unit that extracts attribute information corresponding to each word from the reference database; a data verification unit that determines, based on the extracted attribute information, whether the item name data is identifiable according to the combination and meaning of the words by part of speech; and an approval determination unit that decides, according to that determination, whether to approve or restrict input of the item name data.

The attribute information may comprise the part-of-speech information and meaning information of the word, and related information such as synonyms, standard terms, abbreviations, specifications, and HS (Harmonized System) codes corresponding to the word.

The attribute extraction unit may further include a reference database update unit that, when the attribute information of a word cannot be extracted from the reference database, extracts the attribute information from an external database including a web-based open-source dictionary and updates the reference database.

The data verification unit may determine that the input item name data is identifiable when the set of words consists of a single noun whose meaning can be identified.

The data verification unit may determine that the input item name data is identifiable when the set of words includes at least two nouns whose meaning can be identified.

The data verification unit may determine that the input item name data is identifiable when the set of words includes at least one noun and at least one adjective whose meaning can be identified.

The data verification unit may determine that the input item name data is not identifiable when the set of words consists of nouns and adjectives whose meaning cannot be identified.

To solve the above problem, the present invention also proposes, as one embodiment, a data pre-verification method comprising: receiving, by an input unit, import/export cargo item name data; dividing, by a data processing unit, the input item name data into words; extracting, by an attribute extraction unit, attribute information corresponding to each word from a reference database; and determining, by a data identification unit, whether the import/export cargo item name data, made up of the set of words having the attribute information, is identifiable.

The reference database may store part-of-speech information obtained by analyzing the morphemes and words of item name data extracted from import/export declarations and manifest declarations filed over many years, together with meaning information and related information for each extracted part of speech.

The step of determining whether the item name data is identifiable may determine that the input item name data is identifiable and approve the input when the set of words consists of a single noun whose meaning can be identified.

The step of determining whether the item name data is identifiable may determine that the input item name data is identifiable and approve the input when the set of words includes at least two nouns whose meaning can be identified.

The step of determining whether the item name data is identifiable may determine that the input item name data is identifiable and approve the input when the set of words includes at least one noun and at least one adjective whose meaning can be identified.

The step of determining whether the item name data is identifiable may determine that the input item name data is not identifiable and restrict the input when the set of words consists of nouns and adjectives whose meaning cannot be identified.

According to the embodiments of the present invention, input of item name data is approved according to the result of validating the item name data included in the manifest before it is transmitted to customs, so the manifest can be prepared accurately and clearance delays caused by manifest entry errors can be prevented. In addition, the accuracy of tariff rate determination and of high-risk cargo selection can be improved.

FIG. 1 shows the conventional process of transmitting an import/export cargo list.
FIG. 2 shows the configuration of an item name data pre-verification system according to an embodiment of the present invention.
FIG. 3 shows the configuration of a reference database according to an embodiment of the present invention.
FIG. 4 shows the configuration of a reference database update unit according to an embodiment of the present invention.
FIG. 5 shows the configuration of a data identification unit according to an embodiment of the present invention.
FIG. 6 is a flowchart of the item name data pre-verification process according to an embodiment of the present invention.
FIGS. 7 and 8 show various embodiments of the item name data pre-verification screen.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the drawings, parts unrelated to the description are omitted for clarity, and the same reference numerals denote the same parts throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "... unit" and "module" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.

Throughout the specification, "item name data" means the name of cargo imported or exported by vessel or aircraft, and refers collectively to the cargo names entered on the manifests and import/export declarations that shippers, carriers, and forwarders prepare and submit to customs.

FIG. 2 shows the configuration of an item name data pre-verification system according to an embodiment of the present invention.

The item name data pre-verification system of FIG. 2 comprises an input unit (10), a data processing unit (20) that breaks the input item name data into minimum units, an attribute extraction unit (30) that extracts the attributes of the item name data so processed, a reference database (40) that provides attribute information for item name data, a reference database update unit (50) that updates the reference database (40), and a data identification unit (60) that determines whether the input data is approved.

The input unit (10) is a means for receiving item name data in the form of words and/or sentences from shippers, carriers, and forwarders. Item name data is entered in English, the language used worldwide for cargo import and export.

The data processing unit (20) divides the item name data input through the input unit (10) into words, the minimum unit of a sentence.

Of the resulting words, those whose meaning cannot be identified, such as words consisting only of digits or words containing special characters, are excluded, and the remaining words are passed to the attribute extraction unit (30).

As an exception, a word containing '-' (hyphen), which expresses the joining or separation of words, or '.' (period), which marks the end of a sentence, is classified as a word whose meaning can be identified and is passed to the attribute extraction unit (30).
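The splitting and filtering performed by the data processing unit (20) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_item_name` is hypothetical, splitting on whitespace is an assumption, and because the specification does not enumerate the special characters, the sketch treats anything other than letters, digits, '-' and '.' as special.

```python
import re

# Assumption: any character other than letters, digits, '-' and '.' counts as
# a "special character". The specification does not list them.
_ALLOWED_TOKEN = re.compile(r"[A-Za-z0-9.\-]+")


def split_item_name(item_name: str) -> list[str]:
    """Split item name data into words and drop words that cannot carry meaning."""
    kept = []
    for token in item_name.split():
        if token.isdigit():
            continue  # word made up of digits only
        if not _ALLOWED_TOKEN.fullmatch(token):
            continue  # word contains a special character other than '-' or '.'
        kept.append(token)
    return kept


print(split_item_name("Riding MotorBike 250cc"))
print(split_item_name("T-shirt 100 @home cotton."))
```

With these rules, '250cc' is kept because it is not digits only, while '100' and '@home' are dropped.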

The attribute extraction unit (30) extracts from the reference database (40) the attributes of each word passed from the data processing unit (20), namely its part of speech, meaning, and related information.

The part of speech is noun, adjective, adverb, verb, preposition, and so on; the meaning is the word's intrinsic meaning; and the related information includes synonyms, standard terms, abbreviations, specifications, and HS codes corresponding to the word.

The reference database (40) stores the parts of speech extracted by analyzing the morphemes and words of item name data taken from import/export declarations and manifest declarations filed over many years, together with the identifiable meanings and related information for each extracted part of speech.

It also contains item names that are in common use in trade and can be classified under a single item number in the Harmonized System (HS), along with a thesaurus of trade-related item names, and it can be updated by users of the data pre-verification system.

FIG. 3 shows the configuration of a reference database according to an embodiment of the present invention.

The reference database (40) of FIG. 3 comprises a meaning database (41) containing meaning information, by part of speech, for words related to item name data in common use in trade; a related-term database (42) containing related-term information such as synonyms, abbreviations, dialect forms, and standard terms for item name data; an HS code database (43) containing the HS code corresponding to each item name; and an other database (44) containing item name data related to technical terms, brand information, and specification and material information for item name data.
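One way to picture the four sub-databases of FIG. 3 is as a set of keyed lookups combined into a single attribute record. The sketch below is an assumption-laden illustration: the class name `ReferenceDatabase`, its field names, the dictionary layout, and the case-insensitive key are all hypothetical, and the only sample entry is taken from the 'animal' example in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceDatabase:
    """Hypothetical in-memory stand-in for the reference database (40)."""
    meaning_db: dict = field(default_factory=dict)       # (41) word -> {part of speech: [meanings]}
    related_term_db: dict = field(default_factory=dict)  # (42) word -> {relation: term}
    hs_code_db: dict = field(default_factory=dict)       # (43) word -> HS code string
    other_db: dict = field(default_factory=dict)         # (44) word -> brand/specification/material info

    def attributes(self, word: str):
        """Return the combined attribute record for a word, or None if unknown."""
        key = word.lower()  # assumption: lookups are case-insensitive
        if key not in self.meaning_db:
            return None
        return {
            "parts_of_speech": list(self.meaning_db[key]),
            "meanings": self.meaning_db[key],
            "related": self.related_term_db.get(key, {}),
            "hs_code": self.hs_code_db.get(key),
            "other": self.other_db.get(key, {}),
        }


db = ReferenceDatabase()
db.meaning_db["animal"] = {"noun": ["animal"]}
db.related_term_db["animal"] = {"note": "broad term"}
print(db.attributes("Animal"))
```

Returning None for an unknown word gives the caller a clear signal to hand the word to the reference database update unit (50).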

For example, when the item name data 'live animal' is input through the input unit (10), the data processing unit (20) passes on the words live and animal, and the attribute extraction unit (30) extracts the parts of speech of each word as 'live: adjective, verb' and 'animal: noun'.

Then, from the extracted parts of speech, the words having a noun or adjective part of speech, which can be used to identify the meaning of item name data (here live and animal), are selected, and the meanings corresponding to them are extracted as 'live: living, active' and 'animal: animal'.

In addition, the related information 'animal: broad term' is also extracted for animal.

The parts of speech, meanings, and related information extracted in this way are passed to the data identification unit (60).

When a word has multiple pieces of meaning information, as live does above, the meaning information with the highest importance is extracted based on the importance of the meaning information stored in the meaning database (41).

The importance is set for each word passed from the data processing unit (20) to reflect its actual frequency of use, range of use, and accuracy in trade. When multiple pieces of part-of-speech information or related information are extracted, the one with the highest importance is likewise selected.
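The importance-based selection can be sketched as picking the highest-scoring entry among a word's stored senses. The `Sense` record, the numeric `importance` field, and the scores below are illustrative assumptions; the specification states only that importance reflects frequency of use, range of use, and accuracy in trade, not how it is computed or stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    part_of_speech: str
    meaning: str
    importance: float  # hypothetical score; higher means more important in trade usage


def pick_most_important(senses):
    """Return the sense with the highest importance, or None if there are none."""
    if not senses:
        return None
    return max(senses, key=lambda s: s.importance)


# Illustrative scores only; the specification gives no numeric values.
live_senses = [
    Sense("adjective", "living", 0.9),
    Sense("adjective", "active", 0.4),
]
print(pick_most_important(live_senses).meaning)
```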

Returning to FIG. 2, if the attributes of a word passed from the data processing unit (20) cannot be extracted from the reference database (40), the word is passed to the reference database update unit (50).

FIG. 4 shows the configuration of a reference database update unit according to an embodiment of the present invention.

As shown in FIG. 4, the reference database update unit (50) comprises an external database (51), a data search unit (52) that searches the external database (51) for the word, and a data definition unit (53) that defines the attributes of the word based on the search result.

The external database (51) refers to web-based open-source dictionaries and APIs (application programming interfaces) provided free of charge by portal sites such as Daum, Naver, and Google and by public institutions.

The data search unit (52) searches the external database (51) for the attributes of words whose attributes the attribute extraction unit (30) could not extract.

The data definition unit (53) classifies the attributes found by the data search unit (52) into part of speech, meaning, and related information and stores them in the reference database (40).

Returning to FIG. 2, the attribute extraction result of the attribute extraction unit (30) is passed to the data identification unit (60).

FIG. 5 shows the configuration of a data identification unit according to an embodiment of the present invention.

The data identification unit (60) of FIG. 5 comprises a data verification unit (61) that verifies the item name data against predetermined criteria, an approval determination unit (62) that decides whether the item name data is approved, and a result database (63) that stores the approval result.

Using the attribute extraction result passed from the attribute extraction unit (30), the data verification unit (61) determines whether the input item name data is identifiable, based on the part-of-speech combination and meaning of the words.

The criteria are shown in Table 1 below.

| Category | Verification result | Error message |
|---|---|---|
| Two or more nouns extracted | Normal | - |
| At least one noun and at least one adjective extracted | Normal | - |
| Only one noun extracted | Check | Check whether this is a broad term. |
| Typo (meaning identifiable) extracted | Error | A typo exists. |
| Dialect form (meaning identifiable) extracted | Error | Please enter the standard term. |
| Only a brand extracted | Error | This is a brand name. |
| Word of 31 bytes or more (meaning not identifiable) extracted | Error | A space is required. |
| Only noise words (meaning not identifiable) extracted | Error | The item name is unclear in meaning. |
| Only one adjective extracted | Error | The item name is unclear in meaning. |
| Typo (meaning not identifiable) extracted | Error | The item name is unclear in meaning. |

As Table 1 shows, whether the item name data is identifiable is determined from the part-of-speech combination of the words.

Assuming that two or more words are passed from the data processing unit (20), the words are judged to be an identifiable combination, that is, identifiable item name data, if they include at least two nouns whose meaning can be identified or at least one adjective and one noun.

For any other part-of-speech combination, when a word is judged to be one whose meaning cannot be identified, or when a correctable error is found in the entered item name data, an error message such as those in Table 1 is output to the user.

The approval determination unit (62) decides whether the input item name data is approved according to the verification result of the data verification unit (61) and stores the approval result in the result database (63).

Item name data whose verification result is 'normal' is approved for input. For item name data whose result is 'check' or 'error', the system waits for the user to re-enter the item name data with reference to the error message output by the data verification unit (61).
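The criteria of Table 1 and the approval decision can be sketched as a small rule function. This is an illustration under stated assumptions: the `WordAttr` record and its `kind` labels are hypothetical, the order in which the rules are tested is not given in the specification and is chosen here for simplicity, and the byte length is measured on the UTF-8 encoding.

```python
from dataclasses import dataclass

UNCLEAR = "The item name is unclear in meaning."


@dataclass
class WordAttr:
    word: str
    pos: str                   # "noun", "adjective", ...
    identifiable: bool = True  # whether the meaning can be identified
    kind: str = "plain"        # hypothetical label: "plain", "typo", "dialect", "brand", "noise"


def verify(words):
    """Return (result, message) following Table 1; rule precedence is an assumption."""
    for w in words:
        if not w.identifiable and len(w.word.encode("utf-8")) >= 31:
            return "error", "A space is required."
        if w.kind == "typo":
            return "error", ("A typo exists." if w.identifiable else UNCLEAR)
        if w.kind == "dialect":
            return "error", "Please enter the standard term."
    if words and all(w.kind == "brand" for w in words):
        return "error", "This is a brand name."
    usable = [w for w in words if w.identifiable and w.kind == "plain"]
    nouns = sum(w.pos == "noun" for w in usable)
    adjectives = sum(w.pos == "adjective" for w in usable)
    if nouns >= 2 or (nouns >= 1 and adjectives >= 1):
        return "normal", None
    if nouns == 1:
        return "check", "Check whether this is a broad term."
    return "error", UNCLEAR  # noise only, a lone adjective, or nothing usable


def approve(words):
    """Approve input only when the verification result is 'normal'."""
    result, _ = verify(words)
    return result == "normal"


print(verify([WordAttr("live", "adjective"), WordAttr("animal", "noun")]))
print(verify([WordAttr("animal", "noun")]))
```

A 'check' or 'error' result leaves the input unapproved, which corresponds to waiting for the user to re-enter the item name.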

FIG. 6 is a flowchart of the item name data pre-verification process according to an embodiment of the present invention.

First, when item name data consisting of English words and/or sentences is input through the input unit (10) (S11), the data processing unit (20) divides the item name data into words (S12).

For example, when the item name data 'Riding MotorBike 250cc' is input, it is divided into Riding, MotorBike, and 250cc.

The attribute extraction unit (30) then extracts the attributes of the divided words from the reference database (40) (S13). The attribute extraction results for Riding, MotorBike, and 250cc are shown in Table 2 below.

| Word | Part of speech | Meaning | Related information |
|---|---|---|---|
| Riding | Noun, adjective | Horseback riding; riding (the act); for riding; riding (district) | - |
| MotorBike | Noun | Motorcycle | Synonym: motorcycle |
| 250cc | Noun | - | Unit (cc): engine displacement |

The attribute extraction results of Table 2 are passed to the data verification unit (61). For the word Riding, which has several extracted meanings, the meaning with the highest importance, 'for riding', is extracted and passed to the data verification unit (61).

If the attributes of a word cannot be extracted from the reference database (40), the external database (51) is searched (S14), and the attribute information found there is passed to the data verification unit (61) and also stored in the reference database (40) (S15).
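Steps S13 to S15 amount to a lookup with a fallback that also writes the result back. The sketch below assumes a plain dictionary for the reference database (40) and a caller-supplied function for the external search; `extract_attributes` and `fake_external` are hypothetical names, and no real dictionary API is called.

```python
from typing import Callable, Optional


def extract_attributes(
    word: str,
    reference_db: dict,
    search_external: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Look a word up in the reference database, falling back to an external search."""
    key = word.lower()  # assumption: lookups are case-insensitive
    attrs = reference_db.get(key)      # S13: try the reference database first
    if attrs is not None:
        return attrs
    attrs = search_external(key)       # S14: search the external database
    if attrs is not None:
        reference_db[key] = attrs      # S15: store the result for later lookups
    return attrs


# Stand-in for the external database (51); a real system would query a web dictionary.
calls = []


def fake_external(word: str) -> Optional[dict]:
    calls.append(word)
    return {"pos": "noun"} if word == "motorbike" else None


ref = {}
print(extract_attributes("MotorBike", ref, fake_external))
print(extract_attributes("MotorBike", ref, fake_external))
print(calls)
```

Because the first result is written back, the second lookup is served from the reference database and the external source is queried only once.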

이때 외부 데이터베이스(51)는 다음, 네이버 및 구글과 같은 포털 사이트 및 공공기관에서 무상으로 제공하는 웹 기반의 오픈소스 사전 및 API(Application program interface, 응용프로그램인터페이스)를 포함하며, 상기 단어에 대한 쿼리를 송신하여 리턴된 결과를 분석하는 방식으로 검색을 수행한다.At this time, the external database 51 includes a web-based open source dictionary and API (Application program interface) provided by portal sites such as Naver and Google and public institutions free of charge. Perform a search by sending a query and analyzing the returned results.

Next, based on the attribute extraction results, the data verification unit (61) determines whether the item name data can be identified, according to the part-of-speech combination, meaning, and related information of the words.

It is determined whether the word passed from the attribute extraction unit (30) consists of a single noun whose meaning can be identified (S16); if so, the item name data entered through the input unit (10) is judged to be normal (S19).

For example, if the item name data entered through the input unit (10) is the single word 'Animal', an error message is output, because 'Animal' is a broad term that refers collectively to four-legged beasts, or to everything other than plants and humans.

However, if the item name data entered through the input unit (10) consists of a single word whose meaning can be identified and which can be classified under an HS code, such as 'Avalon', the item name data is judged to be normal.

If the set of words passed from the attribute extraction unit (30) does not consist of a single noun whose meaning can be identified, it is determined whether the set satisfies the following two conditions (S17); if at least one of them is satisfied, the entered item name data is judged to be normal (S19).

Condition 1. Does the set include at least two nouns whose meaning can be identified?

Condition 2. Does the set include one noun whose meaning can be identified and at least one adjective?

If the set of words passed from the attribute extraction unit (30) satisfies neither Condition 1 nor Condition 2, an error message is output according to Table 1, based on the part-of-speech combination, meaning, and related information of the words (S18), and the result is stored in the result database.

For example, the item name data 'Riding MotorBike 250cc', divided as shown in Table 2, includes one noun whose meaning can be identified (MotorBike: motorcycle) and at least one adjective (Riding: for riding), so the entered item name data is judged to be normal.

If the entered item name data is 'Riding Vehicle', the item name data includes one adjective whose meaning can be identified (Riding: for riding), but for the noun that follows, Vehicle, the extracted related information indicates that it is a broad term referring collectively to vehicles that can be ridden, so an error message is output.
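The decision flow of steps S16 to S19 can be summarized in a short Python sketch. The names `WordAttr` and `is_item_name_identifiable` and the boolean `identifiable` flag are assumptions made for illustration; in particular, how a word is marked as a broad term, and the choice to treat '250cc' (which has no meaning entry in Table 2) as not identifiable, are not specified in the text. Following the claims, the sketch also requires the adjective in Condition 2 to have an identifiable meaning.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class WordAttr:
    text: str
    pos: str            # selected part of speech, e.g. "noun" or "adjective"
    identifiable: bool  # False for broad terms or words with no usable meaning


def is_item_name_identifiable(words: Sequence[WordAttr]) -> bool:
    """Return True if the item name is judged normal (S19), False otherwise (S18)."""
    nouns = [w for w in words if w.pos == "noun" and w.identifiable]
    adjectives = [w for w in words if w.pos == "adjective" and w.identifiable]

    # S16: the item name is a single noun whose meaning can be identified
    if len(words) == 1 and len(nouns) == 1:
        return True
    # S17, Condition 1: at least two identifiable nouns
    if len(nouns) >= 2:
        return True
    # S17, Condition 2: an identifiable noun together with at least one adjective
    if len(nouns) >= 1 and len(adjectives) >= 1:
        return True
    return False  # S18: the caller outputs an error message and stores the result


# The four examples from the text, with flags set by hand for this sketch.
riding = WordAttr("Riding", "adjective", True)
examples = {
    "Riding MotorBike 250cc": [riding,
                               WordAttr("MotorBike", "noun", True),
                               WordAttr("250cc", "noun", False)],
    "Riding Vehicle": [riding, WordAttr("Vehicle", "noun", False)],  # broad term
    "Animal": [WordAttr("Animal", "noun", False)],                   # broad term
    "Avalon": [WordAttr("Avalon", "noun", True)],
}
results = {name: is_item_name_identifiable(ws) for name, ws in examples.items()}
```

With these hand-set flags, `results` marks 'Riding MotorBike 250cc' and 'Avalon' as identifiable and 'Riding Vehicle' and 'Animal' as not identifiable, matching the outcomes described in the text.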

FIGS. 7 and 8 show various embodiments of the item name data pre-verification screen.

The item name data pre-verification screens of FIGS. 7 and 8 are described with reference to Table 1.

As shown in FIG. 7, when item name data such as 'DW' is entered, extracting the attributes of 'DW' yields the related information 'abbreviation of Dongwon (동원), brand'. The item name data is therefore judged to be data from which the item cannot be identified, and an error message reporting this result is output.

The system then either waits for the user to re-enter the item name data or, as in FIG. 7, lists the related information associated with the brand name 'DW (Dongwon)' stored in the reference database (the item name data imported and exported by Dongwon), prompting the user to select the correct item name data.

The data pre-verification may be performed at the stage where the carrier and the forwarder prepare the master bill of lading (Master B/L) and the house bill of lading (House B/L), or at the stage where the manifest collection system combines the Master B/L and the House B/L to generate the manifest and the stage where the manifest is transmitted to customs.

Although embodiments of the present invention have been described in detail above, the scope of rights of the present invention is not limited thereto. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within its scope, and in particular, software implementations of the methods proposed in the present invention likewise fall within its scope.

10: input unit 20: data processing unit 30: data analysis unit
40: reference database 41: meaning database
42: related-word database 43: HS code database
44: other databases 50: reference database update unit
51: external database 52: data search unit
53: data definition unit 60: data identification unit
61: data verification unit 62: approval determination unit
63: result database

Claims

A data pre-verification system comprising:
an input unit that receives import/export cargo item name data;
a data processing unit that divides the item name data into words;
a reference database that stores part-of-speech information, extracted by analyzing the morphemes and words of item name data taken from import/export declarations and manifest declarations filed over many years, together with related information;
an attribute extraction unit that extracts attribute information corresponding to each word from the reference database;
a data verification unit that determines, based on the extracted attribute information, whether the item name data can be identified according to the part-of-speech combination and meaning of the words; and
an approval determination unit that determines whether to approve or restrict input of the item name data according to the determination result.

The system of claim 1, wherein the attribute information
includes at least one of part-of-speech information, meaning information, and related information on synonyms, standard words, abbreviations, standards, and HS codes corresponding to the word.

The system of claim 1, further comprising
a reference database update unit that, when the attribute extraction unit cannot extract the attribute information of the word from the reference database, extracts the attribute information of the word from an external database including a web-based open-source dictionary and updates the reference database.

The system of claim 1, wherein the data verification unit
determines that the input item name data is identifiable if the set of words consists of only one noun whose meaning can be identified.

The system of claim 1, wherein the data verification unit
determines that the input item name data is identifiable if the set of words includes at least two nouns whose meaning can be identified.

The system of claim 1, wherein the data verification unit
determines that the input item name data is identifiable if the set of words includes at least one noun and at least one adjective whose meanings can be identified.

The system of claim 1, wherein the data verification unit
determines that the input item name data is unidentifiable if the set of words includes nouns and adjectives whose meanings cannot be identified.

A data pre-verification method comprising:
receiving, by an input unit, import/export cargo item name data;
dividing, by a data processing unit, the input item name data into words;
extracting, by an attribute extraction unit, attribute information corresponding to each word from a reference database; and
determining, by a data identification unit, whether the import/export cargo item name data, consisting of the set of words having the attribute information, can be identified.

The method of claim 8, wherein the attribute information
includes at least one of part-of-speech information, meaning information, and related information on synonyms, standard words, abbreviations, standards, and HS codes corresponding to the word.

The method of claim 8, wherein the reference database
stores part-of-speech information, extracted by analyzing the morphemes and words of item name data taken from import/export declarations and manifest declarations filed over many years, together with related information.

The method of claim 8, wherein the determining of whether the item name data can be identified comprises,
if the set of words consists of only one noun whose meaning can be identified, determining that the input item name data is identifiable and approving the input.

The method of claim 8, wherein the determining of whether the item name data can be identified comprises,
if the set of words includes at least two nouns whose meaning can be identified, determining that the input item name data is identifiable and approving the input.

The method of claim 8, wherein the determining of whether the item name data can be identified comprises,
if the set of words includes at least one noun and at least one adjective whose meanings can be identified, determining that the input item name data is identifiable and approving the input.

The method of claim 8, wherein the determining of whether the item name data can be identified comprises,
if the set of words includes nouns and adjectives whose meanings cannot be identified, determining that the input item name data is unidentifiable and restricting the input.