CN116776831B - Customs, customs code determination, decision tree construction method and medium - Google Patents

Publication number
CN116776831B
Authority
CN
China
Prior art keywords
decision tree
code
classification
customs
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310328600.XA
Other languages
Chinese (zh)
Other versions
CN116776831A (en
Inventor
叶伟杰
陈佳铭
蔡孟松
李其
张巍伟
车慧珍
马骎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Wodewei Digital Technology Service Co ltd
Original Assignee
Alibaba China Co Ltd
Application filed by Alibaba China Co Ltd
Priority to CN202310328600.XA
Publication of CN116776831A
Application granted
Publication of CN116776831B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a customs code determination method, a decision tree construction method, and a medium, which can improve the determination of classification codes for articles and ensure the accuracy of the classification codes. The customs code determination method comprises the following steps: acquiring customs clearance information of an object to be processed; acquiring a description text of the object to be processed according to the customs clearance information; identifying the content belonging to set dimensions in the description text to obtain an identification result; matching the identification result with a decision tree, and determining, in the decision tree, at least one matching node corresponding to the identification result, wherein the decision tree is generated according to the correspondence between the description information of a plurality of articles and classification codes, each node of the decision tree corresponds to one classification code and has mounted on it the description information of the articles corresponding to that classification code, and the nodes of the decision tree comprise the at least one matching node; and determining the customs code of the object to be processed according to a coding depth and the at least one matching node.

Description

Customs, customs code determination, decision tree construction method and medium
Technical Field
The application relates to the technical field of data processing, and in particular to methods for determining customs codes and customs clearance codes, a decision tree construction method, and a medium.
Background
With the development of the economy and of science and technology, the range, speed, and scale of commodity circulation have increased substantially. Commodities are bought and sold in large volumes within individual countries, and a large number of transactions also take place internationally, so that international trade has become an important channel of commodity trade. Meanwhile, transportation and freight technologies are developing rapidly, freight infrastructure is continuously being upgraded, and the global circulation of goods is also driving the development of the logistics and express delivery industries. When articles circulate between different countries, they need to be classified and the classification code corresponding to each article determined, so that every organization participating in the circulation process can manage the articles efficiently.
Disclosure of Invention
The embodiments of the application provide a customs code determination method, a decision tree construction method, and a medium, which can improve the determination of classification codes for articles and ensure the accuracy of the classification codes.
In a first aspect, an embodiment of the present application provides a customs code determining method, including: acquiring customs clearance information of an object to be processed; acquiring a description text of the object to be processed according to the customs clearance information; identifying the content belonging to set dimensions in the description text to obtain an identification result; matching the identification result with a decision tree, and determining, in the decision tree, at least one matching node corresponding to the identification result, wherein the decision tree is generated according to the correspondence between the description information of a plurality of articles and classification codes, each node of the decision tree corresponds to one classification code and has mounted on it the description information of the articles corresponding to that classification code, and the decision tree comprises the at least one matching node; and determining the customs code of the object to be processed according to a coding depth and the at least one matching node.
In a second aspect, an embodiment of the present application provides a method for determining a customs clearance code, including: obtaining a description text input by a declaring party for an object to be processed; identifying the content belonging to set dimensions in the description text to obtain an identification result; matching the identification result with a decision tree, and determining, in the decision tree, at least one matching node corresponding to the identification result, wherein the decision tree is generated according to the correspondence between the classification information and the classification codes of commodities, each node of the decision tree corresponds to one classification code, and the decision tree comprises the at least one matching node; determining a predicted customs clearance code of the object to be processed according to a coding depth and the at least one matching node; and determining the customs clearance code of the object to be processed according to received code rechecking information and the predicted customs clearance code.
In a third aspect, an embodiment of the present application provides a method for constructing a decision tree, including: acquiring sample data containing article classification rules; acquiring, according to the sample data, classification codes of a plurality of classifications of articles and the description information of the articles corresponding to each classification code; constructing a decision tree according to the relations between the classification codes, so that each descendant node of the decision tree corresponds to one classification code, and the child nodes of each parent node in the decision tree correspond to the sub-classifications under the classification code to which the parent node belongs; and mounting the description information of the articles corresponding to each classification code on the node corresponding to that classification code in the decision tree.
In a fourth aspect, embodiments of the present application provide an electronic device including a memory, a processor, and a computer program stored on the memory, the processor implementing the method of any one of the above when the computer program is executed.
In a fifth aspect, embodiments of the present application provide a computer-readable storage medium having a computer program stored therein, the computer program, when executed by a processor, implementing a method according to any one of the above.
Compared with the prior art, the application has the following advantages:
The customs code determining method provided by the embodiments of the application can be used to assist customs authorities in determining the customs code of an object to be processed according to its customs clearance information. The customs clearance information is processed by means such as text processing, and the content of the description text belonging to the set dimensions is extracted, which assists in processing the customs clearance information of the objects to be processed and allows a large number of declared articles to be managed uniformly and efficiently. Meanwhile, because the description text of the object to be processed is recognized in multiple dimensions, information about the object can be obtained comprehensively from those dimensions, which ensures the comprehensiveness of the information analysis. The dimensions can also be modified, deleted, and added as required, so that they can be adjusted promptly when official customs regulations change. The nodes of the decision tree correspond to the classification codes used for determining customs codes, and each node has the description information of the corresponding classification mounted on it, which makes it easy to modify and adjust the tree when official regulations change. In addition, with the customs code determining method provided by the embodiments of the application, the coding depth, that is, the number of digits of the code, can be set according to the level of detail required for customs codes published by the authorities, so that the matching node of the decision tree at that depth can be located efficiently and a code with the required number of digits obtained.
The foregoing is merely an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with this specification, and so that the above and other objects, features, and advantages of the present application are more readily apparent, specific embodiments of the present application are described in detail below.
Drawings
In the drawings, the same reference numerals refer to the same or similar parts or elements throughout the several views unless otherwise specified. The figures are not necessarily drawn to scale. It is appreciated that these drawings depict only some embodiments according to the application and are not to be considered limiting of its scope.
Fig. 1A is an application scenario schematic diagram of a customs code determining method provided in an embodiment of the present application;
Fig. 1B is an application scenario schematic diagram of a customs clearance code determining method provided in an embodiment of the present application;
Fig. 2 is a schematic flow chart of a customs code determining method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a classification process according to an example of the present application;
Fig. 4 is a diagram illustrating named entity recognition provided by examples of the present application;
Fig. 5 is a schematic diagram of a multi-dimensional analysis process provided in an embodiment of the present application;
Fig. 6 is a schematic diagram of rule mounting in a data identification and decision tree provided in an embodiment of the present application;
Fig. 7 is a schematic flow chart of a customs code determining apparatus according to an embodiment of the present application; and
Fig. 8 is a block diagram of an electronic device used to implement an embodiment of the present application.
Detailed Description
Hereinafter, only certain exemplary embodiments are briefly described. As will be recognized by those of skill in the pertinent art, the described embodiments may be modified in various different ways without departing from the spirit or scope of the present application. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
In order to facilitate understanding of the technical solutions of the embodiments of the present application, the following describes related technologies of the embodiments of the present application. The following related technologies may be optionally combined with the technical solutions of the embodiments of the present application, which all belong to the protection scope of the embodiments of the present application.
Fig. 1A is a schematic diagram of an application scenario of a customs code determining method according to an embodiment of the present application. As shown in Fig. 1A, the customs code determining method provided in the embodiment of the present application may be applied to a system including a customs declaration terminal 101, a customs terminal 102, and a customs server 103. The customs declaration terminal 101 may include at least one of a terminal of a general cargo mailer, an international e-commerce terminal, an international cargo carrier terminal, and the like. The customs terminal 102 may include a terminal used by customs, which may be the national authority that exercises supervision over imports and exports. The customs terminal 102 may receive customs declaration data transmitted from the customs declaration terminal 101 and, based on that data, determine the tax that customs levies on the declaring party and the management measures to be taken for the declared goods, or perform data processing operations such as compiling statistics on the goods involved in the declaration. The customs terminal 102 may send the customs declaration data to the customs server 103 for data processing; the customs server 103 determines the classification codes of the items involved in the data and returns them to the customs terminal 102, so that customs can determine the measures to be taken for those items. In the embodiment of the present application, the tax that customs levies on the declaring party may be a tariff, that is, a tax levied on imported goods passing through a country's customs territory. Tariff rates are generally set by the highest administrative authority of each country, and for countries with well-developed foreign trade, tariffs can be a major source of national fiscal revenue. The customs code determining method provided by the embodiment of the application is applied to assist customs in determining the tariffs on articles, which helps improve the accuracy of the tariffs and the efficiency of tariff calculation and verification.
Fig. 1B is a schematic diagram of an application scenario of a customs clearance code determining method according to an embodiment of the present application. As shown in Fig. 1B, the customs clearance code determining method provided in the embodiment of the present application may be applied to a system including an item information submitting end 104, a code service end 105, and an item information receiving end 106. The item information submitting end 104 may include at least one of a terminal of a general goods mailer, an international e-commerce terminal, a tax information end, an international e-commerce server, and an international goods carrier terminal. The code service end 105 can analyze the item information submitted by the item information submitting end 104 to obtain classification code prediction information for the item; the user of the item information submitting end 104 can then process this prediction information to obtain the classification code of the item and submit it to the item information receiving end 106. The item information receiving end 106 may include a terminal used by at least one official department among customs, financial, and tax departments. Where the item information submitting end 104 includes a tax information end, the item information receiving end 106 may include a terminal used by a tax department. The tax department may be the authority that collects value-added tax in the general circulation of goods. In the embodiment of the application, value-added tax may be a turnover tax levied on the value added during the circulation of commodities (including taxable services). In practice, the value added to a commodity during production and circulation is difficult to calculate accurately. Therefore, many countries use the tax deduction method commonly adopted internationally: the output tax is calculated from the sales amount of the commodity or service, and the value-added tax already paid when the commodity or service was acquired, that is, the input tax, is then deducted; the difference is the tax payable on the value-added portion. In the collection of cross-border import taxes, value-added tax may be levied at a statutory discount. With the customs clearance code determining method provided by the embodiment of the application, an accurate customs clearance code can be submitted when an article is declared to the relevant administrative authority, which improves clearance efficiency.
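As a worked illustration of the tax deduction method described above, the following minimal Python sketch computes the tax payable as output tax minus input tax. The function name, the 13% rate, and the amounts are hypothetical values chosen only for the example; none of them comes from the application.

```python
from decimal import Decimal


def vat_payable(sales_amount: Decimal, vat_rate: Decimal, input_tax: Decimal) -> Decimal:
    """Deduction method: output tax on sales minus input tax already paid."""
    output_tax = sales_amount * vat_rate
    return output_tax - input_tax


# Hypothetical figures: sales of 1000 at an assumed 13% rate, with 50 of input tax.
payable = vat_payable(Decimal("1000"), Decimal("0.13"), Decimal("50"))
print(payable)
```

Decimal arithmetic is used here only to avoid binary floating-point rounding in monetary amounts.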
The classification code in the embodiment of the present application may refer to the HS Code (Harmonized System Code). HS is short for the Harmonized Commodity Description and Coding System, which was formulated by the Customs Co-operation Council and is used to classify the various products entering and leaving a territory for purposes such as quantitative management, tariff rates, and customs duties. The Harmonized Commodity Description and Coding System, that is, the Harmonized System or HS mentioned above, is one of the most successful tools developed by the World Customs Organization (WCO). HS is a multi-purpose commodity nomenclature adopted by the WCO, and is used by more than 200 countries and customs or economic unions as a basis for tariffs and for compiling international trade statistics. The HS code is the common identifier of import and export commodities, and is the basic element used by customs and entry-exit commodity management institutions in various countries to confirm commodity categories, carry out classified management of commodities, check tariff rates, and inspect commodity quality indicators. The HS code system is the category system by which customs manages commodities, and each commodity is classified under one HS code. An HS code may have different numbers of digits and may be, for example, a 6-digit, 8-digit, or 10-digit HS code. Each country may define its own HS codes: the first 6 digits follow the international standard, and the later digits may be defined according to the actual conditions of each country.
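The digit structure described above can be sketched in Python as follows. The split into a 2-digit chapter, 4-digit heading, and 6-digit subheading follows the general HS convention; the function name, the dictionary keys, and the trailing national digits in the usage example are hypothetical and serve only as an illustration.

```python
def split_hs_code(code: str) -> dict:
    """Split an HS-style code into its internationally standardized prefix
    (chapter, heading, subheading) and any nationally defined suffix."""
    digits = "".join(ch for ch in code if ch.isdigit())
    if len(digits) < 6:
        raise ValueError("expected at least 6 digits")
    return {
        "chapter": digits[:2],         # first 2 digits
        "heading": digits[:4],         # first 4 digits
        "subheading": digits[:6],      # first 6 digits, the international part
        "national_suffix": digits[6:], # digits 7 onward, defined per country
    }


# "4202.11.10" is a made-up 8-digit code; only its first 6 digits are standardized.
parts = split_hs_code("4202.11.10")
print(parts)
```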
Articles in embodiments of the present application may include goods (which may also be referred to as commodities or merchandise) and/or services. In the customs field, goods are commonly understood as trade goods that need to be transported into or out of a territory; the tax on goods entering or leaving the territory is determined by first classifying the goods according to HS codes and then determining the tax rate according to the classification. In the customs field, "articles" also has a narrower meaning: it can refer to items carried or mailed into the territory by individuals, with the emphasis on reasonable personal use, and such inbound articles are subject to simplified procedures and tax rates. The tax on articles in the customs field may also be referred to as the luggage and mail tax, which is short for the import tax on luggage and mailed articles, and is an import tax imposed by customs on the luggage of inbound passengers and on personal mailed articles. The articles in the embodiments of the present application may include both goods and articles in the above senses; unless otherwise specified in the following embodiments, "articles" may include goods.
The customs code determining method provided by the embodiment of the application, as shown in fig. 2, includes steps S201 to S205. In the embodiment of the present application, the following steps S201 to S205 may be performed by a terminal used by a customs side.
In step S201, customs clearance information of the object to be processed is obtained.
In this embodiment, the object to be processed may be an object for which a user of the execution subject of steps S201 to S205 (such as customs) needs to determine a processing manner according to the customs clearance information. It may include an article for trade, or an article carried or mailed into a country by an individual.
The customs clearance information of the object to be processed may be information submitted by a declaring party so that the object to be processed can enter the territory and clear customs. The declaring party may be the mailer of the article, the seller of the article, and so on.
In one embodiment, customs can obtain the customs clearance information of the object to be processed through the corresponding terminal. For example, after identity authentication information input by a customs officer is received, a processing number of data to be processed is acquired according to an allocation rule; the data to be processed related to the object to be processed is obtained according to the processing number; and the customs clearance information of the object to be processed is acquired from the data to be processed.
In another embodiment, if no customs clearance information has been proactively filled in for the object to be processed, a designated officer can obtain information about the object in some other way. For example, the officer can fill in the customs clearance information according to the appearance, name, description, and packaging of the object, and the customs clearance information filled in by the officer when the object enters customs can then be obtained through a terminal with a query function.
In another embodiment, the customs clearance information may include a customs declaration form, such as an electronic customs declaration form, or information obtained by image recognition of a customs declaration form.
In step S202, a description text of the object to be processed is obtained according to the customs clearance information.
The description text of the object to be processed can be text, symbols, characters and/or other information describing the shape, property, name, attribute, material, purpose, specification and the like of the object to be processed.
In this embodiment, a text analysis method may be used to analyze the customs information and extract a description text describing the object to be processed therein. Or extracting items of the customs clearance information according to a set template to obtain a description text of the object to be processed.
In another embodiment, customs clearance information that is not provided in a standard form can be converted and processed. For example, the commodity number in the customs clearance information can be used to obtain a more complete description text from the producer or seller of the object to be processed.
Where the customs clearance information includes a customs declaration form, acquiring the description text of the object to be processed according to the customs clearance information may include extracting specified items (entries) from the customs declaration form and obtaining the description text from the contents of those items. For example, the specified items may be table entries in the customs declaration form that contain information such as the item number, commodity code, commodity name, and specification or model.
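The extraction of specified items can be sketched as follows, assuming the customs declaration form has already been parsed into a dictionary. The field names in `SPECIFIED_ITEMS`, the function name, and the sample form are hypothetical; a real form would use whatever entry names the declaration format defines.

```python
# Hypothetical entry names for the specified items of a declaration form.
SPECIFIED_ITEMS = ("item_number", "commodity_code", "commodity_name", "specification_model")


def build_description_text(declaration: dict, items=SPECIFIED_ITEMS) -> str:
    """Concatenate the non-empty specified entries into one description text."""
    parts = []
    for key in items:
        value = str(declaration.get(key, "")).strip()
        if value:  # skip entries that are missing or left blank
            parts.append(f"{key}: {value}")
    return "; ".join(parts)


form = {
    "item_number": 1,
    "commodity_name": "leather suitcase",
    "specification_model": "",
    "consignee": "not a specified item, so it is ignored",
}
print(build_description_text(form))
```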
In step S203, the content belonging to the set dimension in the description text is identified, and an identification result is obtained.
In this embodiment, a set dimension may be a dimension corresponding to information required for determining the customs code of the object to be processed, and may be a type of information, an attribute of the information, or the like. In the method provided by the embodiment of the application, the set dimensions are used to extract information from the description text and to match it against the nodes of the decision tree. There may be multiple set dimensions.
For example, the set dimensions may include name, category, material, and specification. The recognition result may then include: name: xxx bulb; category: electrical product; material: metal and glass; specification: xW.
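A minimal sketch of dimension-based recognition is shown below. It uses a plain vocabulary lookup as a stand-in for whatever recognizer an implementation actually uses (the figures mention named entity recognition); the vocabulary, the dimension names, and the function name are all assumptions made for illustration.

```python
# Hypothetical vocabulary: each set dimension lists the terms it recognizes.
DIMENSION_VOCABULARY = {
    "category": ["suitcase", "handbag", "bulb"],
    "material": ["leather", "plastic", "glass", "metal"],
}


def recognize_dimensions(text: str, vocabulary=DIMENSION_VOCABULARY) -> dict:
    """Return, for each set dimension, the vocabulary terms found in the text."""
    lowered = text.lower()
    result = {}
    for dimension, terms in vocabulary.items():
        hits = [term for term in terms if term in lowered]
        if hits:
            result[dimension] = hits
    return result


print(recognize_dimensions("Leather suitcase with metal lock"))
```

Because the dimensions live in a plain mapping, adding, removing, or modifying a dimension only requires editing that mapping, which mirrors the adjustability of set dimensions described in the text.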
In step S204, the recognition result is matched with a decision tree, and at least one matching node corresponding to the recognition result is determined in the decision tree. The decision tree is generated according to the correspondence between the description information of a plurality of articles and classification codes; each node of the decision tree corresponds to one classification code and has mounted on it the description information of the articles corresponding to that classification code; and the nodes of the decision tree include the at least one matching node.
In this embodiment, the correspondence between the description information of the plurality of articles and the classification codes may be a rule for determining classification codes, and the classification code and the customs code in this embodiment may be the same kind of code. For example, both the classification code and the customs code may be codes determined according to the HS rules, in which case the decision tree may be generated according to the HS code classification rules. Under the HS code classification rules, articles are divided into a number of major classes, each major class contains its own sub-classifications, and the sub-classifications may span multiple levels. For example, where the customs code is an HS code, luggage products mainly correspond, under the HS classification standard, to "trunks, handbags, and similar containers", whose HS code is 4202. The classification with HS code 4202 further includes three first-level sub-classifications: "suitcases, carrying cases, vanity cases, briefcases, school bags, and similar containers"; "handbags, whether or not with shoulder straps, including those without handles"; and "articles normally carried in a pocket or handbag". Each of these sub-classifications is further divided into several second-level sub-classifications according to material. In the decision tree, a sub-classification at one level may be the parent node of the sub-classifications at the next level; that is, the child nodes of each node A in the decision tree correspond to the sub-classifications of the classification to which node A belongs.
In the embodiment of the application, each node of the decision tree may correspond to one code. For example, node a of the decision tree corresponds to HS code 4202 and represents the classification "trunks, handbags, and similar containers". The child nodes a1, a2, and a3 of node a correspond to HS codes 42021, 42022, and 42023 respectively, that is, to the sub-classifications of the classification corresponding to HS code 4202. Thus, the first N digits of the code of each child node in the decision tree may be identical to the code of its parent node.
Further, each node of the decision tree may have mounted on it the description information of its corresponding code. In this embodiment, mounting may refer to storing in a certain manner.
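The prefix relationship between parent and child codes suggests one simple way to build such a tree, sketched below. The `Node` class, the `build_tree` function, and the shortened descriptions are hypothetical; the sketch assumes every code is a non-empty digit string and attaches each code to the longest existing code that is a proper prefix of it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    code: str
    description: str = ""  # description information mounted on the node
    children: list = field(default_factory=list)


def build_tree(descriptions: dict) -> Node:
    """Build a tree in which each node's parent is its longest proper prefix."""
    root = Node(code="")  # artificial root above all top-level codes
    nodes = {"": root}
    # Shorter codes first, so a parent always exists before its children.
    for code in sorted(descriptions, key=len):
        node = Node(code=code, description=descriptions[code])
        parent_code = max(
            (c for c in nodes if c != code and code.startswith(c)), key=len
        )
        nodes[parent_code].children.append(node)
        nodes[code] = node
    return root


tree = build_tree({
    "4202": "trunks, handbags, and similar containers",
    "42021": "suitcases, briefcases, school bags, and similar containers",
    "420211": "outer surface of leather",
    "42022": "handbags",
})
print([child.code for child in tree.children[0].children])
```

Storing the description directly on the node is one possible reading of "mounting"; an implementation could equally keep descriptions in a separate store keyed by code.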
Matching the description text of the object to be processed with the nodes in the decision tree may mean matching the identification result for each set dimension in the description text with the description information mounted on each node of the decision tree.
In this embodiment, the nodes of the decision tree may have description information of multiple dimensions mounted on them, and the description information mounted on a node may be information that describes or defines the articles belonging to the code of that node. For example, according to the 2022 version of the HS code classification rules, the description information of the node corresponding to code 4202 may include "suitcases, pockets, briefcases, book cases, glasses cases, telescope cases, camera cases, musical instrument cases, holsters, and similar containers; travel bags, insulated food or beverage bags, cosmetic bags, canvas bags, handbags, shopping bags, wallets, purses, map cases, cigarette cases, pouches, tool bags, sports bags, bottle cases, jewelry boxes, powder boxes, cutlery cases, and similar containers, made of leather or regenerated leather, plastic sheeting, textile materials, steel paper, or cardboard, or wholly or mainly covered with such materials or with paper". Meanwhile, the description information of the node corresponding to code 42021 may include "suitcases, small handbags, briefcases, school bags, and similar containers", and the description information of the node corresponding to code 420211 may include "outer surface of leather or synthetic leather". Further, if the information extracted for the set dimensions from the description text of the object to be processed includes category: suitcase and material: leather, then the decision tree nodes corresponding to 4202, 42021, and 420211 may be used as matching nodes.
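The suitcase example above can be reproduced with a small matching sketch. It assumes, as a simplification, that each level of the tree adds exactly one digit to the code (as in the 4202, 42021, 420211 example) and that the mounted description information has been reduced to keyword sets; the keyword sets, the code 420212, and the function name are hypothetical.

```python
# Hypothetical keyword sets standing in for the description information
# mounted on each node.
NODE_DESCRIPTIONS = {
    "4202": {"suitcase", "handbag", "briefcase"},
    "42021": {"suitcase", "briefcase"},
    "420211": {"leather"},
    "420212": {"plastic"},
    "42022": {"handbag"},
}


def find_matching_nodes(recognized_terms: set, node_descriptions=NODE_DESCRIPTIONS) -> list:
    """Walk the codes from shallow to deep and keep a node only if its parent
    matched (or it is a top-level node) and it shares a term with the input."""
    matched = []
    for code in sorted(node_descriptions, key=lambda c: (len(c), c)):
        parent = code[:-1]  # assumption: one extra digit per tree level
        parent_ok = parent not in node_descriptions or parent in matched
        if parent_ok and recognized_terms & node_descriptions[code]:
            matched.append(code)
    return matched


print(find_matching_nodes({"suitcase", "leather"}))
```

Requiring the parent to match before a child is considered keeps the result on a single root-to-leaf path in this example; a real matcher might instead score nodes and keep several candidate paths.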
In step S205, a customs code of the item to be processed is determined based on the coded depth and the at least one matching node.
In this embodiment, the coding depth may be set in advance as needed. The preset coding depth may correspond to the maximum node depth of the decision tree nodes that should be obtained. In one embodiment, the coding depth may correspond to the number of code digits that need to be obtained, such as 6 digits, 8 digits, or 10 digits. In another embodiment, the coding depth can be set to 8-digit codes according to the customs clearance requirements of the customs department; if the information extracted for the set dimensions from the description text of the object to be processed includes category: suitcase and material: leather, the node depth corresponding to the coding depth can then be located among the descendant nodes of the decision tree node corresponding to code 420211.
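One way to apply a coding depth to a set of matching nodes is sketched below: choose the deepest matching code that does not exceed the required number of digits, and report whether that depth was actually reached. The function name and the returned flag are assumptions for illustration; the application does not specify this interface.

```python
from typing import List, Optional, Tuple


def select_code(matching_codes: List[str], coding_depth: int) -> Tuple[Optional[str], bool]:
    """Return the deepest matching code within the coding depth, plus a flag
    telling whether the required number of digits was reached."""
    eligible = [code for code in matching_codes if len(code) <= coding_depth]
    if not eligible:
        return None, False
    best = max(eligible, key=len)
    return best, len(best) == coding_depth


matches = ["4202", "42021", "420211"]
print(select_code(matches, 6))  # depth of 6 digits
print(select_code(matches, 8))  # depth of 8 digits: deeper matching is still needed
```

When the flag is false, an implementation would continue matching among the descendants of the returned node, or hand the partial code to a reviewer.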
In this embodiment, the customs code determined in step S205 may include at least one. The customs code may be a classification code for determining management operations related to the clearance of the item to be processed.
The customs code determining method provided by the embodiments of the application can assist customs authorities in determining the customs code of an article to be processed according to its customs information. The customs information is subjected to text processing and the like, and the content of the set dimensions is extracted from the description text of the article to be processed, which assists the customs department in processing the description information of the article and facilitates unified and efficient management of a large number of articles passing through customs.
Meanwhile, the embodiments of the application perform multi-dimensional recognition on the description text of the commodity to be processed, so that information about the commodity can be obtained comprehensively from a plurality of dimensions, ensuring the comprehensiveness of the information analysis. The set dimensions can be modified, deleted, and added as needed, so the dimensions can be adjusted promptly when official customs regulations change. For example, suppose the original customs classification rules do not distinguish the material of article L corresponding to code XX, while the updated rules do, with the next-level codes XX1, XX2, ... corresponding to article L of different materials; a new set dimension "material" can then be added, and the updated set dimensions are used to recognize the description text. The nodes of the decision tree in the embodiments of the application correspond to the classification codes used to determine customs codes, and each node mounts the description information of the corresponding classification, which makes it easy to modify and adjust the tree as official regulations change. In addition, the coding depth, that is, the number of code digits, can be set according to the level of detail officially required for customs codes, so that the exact matching nodes of the decision tree can be located efficiently and a code with the required number of digits can be obtained.
In the embodiments of the present application, both customs codes and classification codes may be HS Codes (Harmonized System codes) determined according to HS rules. Owing to its multi-purpose structure and nature, the HS Code is also used for many other purposes, such as trade policy, rules of origin, monitoring of controlled goods, domestic taxes, tariffs, transport statistics, quota controls, and economic research and analysis, and serves as a true "language of international trade". Thus, the HS Code is an important tool not only for the World Customs Organization (WCO) but also for all public or private institutions participating in world trade.
The World Trade Organization (WTO) and some countries use the HS Code as a common language for trade negotiations. The WTO tariff concession schedules of most countries have been drawn up on the basis of the HS Code, and the process of converting the remaining WTO tariff concession schedules to the Harmonized System is continuing. The HS Code also provides a basis for the new internationally recognized rules of origin jointly developed by the World Customs Organization and the World Trade Organization. Another example of the increasingly widespread use of the HS Code is that some countries use the system as a basis for collecting consumption and sales taxes.
In general, the most important uses of the HS Code include: as a basis for customs tariffs; as a basis for collecting international trade statistics; as a basis for rules of origin; for collecting domestic taxes; as a basis for trade negotiations (e.g., drawing up WTO tariff concession schedules and negotiating on the basis of them); for transport tariffs and statistics; for monitoring controlled items (e.g., waste, narcotics, ozone-depleting substances, endangered species); and as an important element of the core customs process areas of customs control and procedures, including risk assessment, information technology, and compliance. The HS is thus a multi-purpose classification system used for many different purposes, the primary one of which may be the collection of import duties and taxes.
The 2022 version of the HS Code classification rules has 21 sections and 97 chapters. The HS Code follows the structure of a multi-way tree, its first 6 digits are internationally common, and a number of section notes and chapter notes supplement the rules where logical conflicts exist. In general, however, the most critical feature of the HS Code is its two-tuple structure [HS Code: Description], i.e., each HS Code has corresponding description information. In most cases, an article may be pre-classified under an HS Code as long as the article meets the description of that HS Code.
For example, the HS Code classification rules include the following:
| - 4202: suitcase, small handbag, briefcase, book case, glasses case, telescope case, camera case, musical instrument case, holster, and similar containers; travel bags, food or beverage thermal packs, cosmetic bags, canvas bags, handbags, shopping bags, wallets, purses, map boxes, cigarette boxes, pouches, tool bags, sports bags, bottle boxes, jewel boxes, powder boxes, cutlery cases and similar containers, made of leather or composition leather, plastic sheets, textile materials, vulcanized fibre or cardboard, or wholly or mainly covered with such materials or with paper;
|| - 42021: suitcase, small handbag, briefcase, school bag, and similar containers;
||| - 420211: outer surface of leather or composition leather;
||| - 420212: outer surface of plastic or textile material;
||| - 420219: others;
|| - 42022: handbags, whether or not with a shoulder strap, including those without handles;
||| - 420221: outer surface of leather or composition leather;
||| - 420222: outer surface of plastic sheet or textile material;
||| - 420229: others;
|| - 42023: items typically carried in a pocket or handbag;
||| - 420231: outer surface of leather or composition leather;
||| - 420232: outer surface of plastic sheet or textile material;
||| - 420239: others;
|| - 42029: others;
||| - 420291: outer surface of leather or composition leather;
||| - 420292: outer surface of plastic sheet or textile material;
||| - 420299: others.
Based on the above excerpt from the 2022 HS Code classification rules, taking heading 4202 as an example, if a commodity is a leather briefcase, then the 6-digit HS Code of that commodity is 420211. In a few scenarios, one commodity may conform to the description information of several HS Codes, so classification principles come into play, such as first considering the main body of the commodity, then its purpose, whether the material matches, and so on, and even consulting chapter notes and section notes to resolve logical conflicts. Therefore, pre-classification for customs is generally performed by classification specialists certified by the relevant institutions; automatic classification is quite challenging and is a natural language recognition task with a high threshold and a high degree of specialization. Although customs departments have classification specialists, the manual work of these specialists is under enormous pressure as the economy grows and the scale of goods circulation expands.
In the embodiments of the application, the description text of the article to be processed is matched by introducing a decision tree, so that the matching tool has the same structure as the HS rules. The data structure of the decision tree, with its root node and multiple levels of descendant nodes, can fully express the classification levels of the HS rules, and the decision tree nodes mount the description information corresponding to the codes in the classification rules, so that the matching process becomes direct, queryable, and interpretable.
According to the customs code determining method provided by the embodiments of the application, the content of the description text of the article to be processed in multiple dimensions can be obtained from the customs information submitted at declaration or actively collected by other personnel, and the matching nodes are determined in the preset decision tree according to that content. The whole process of generating a customs code for the article is therefore transparent (white-boxed): when a customs code is identified incorrectly, the cause of the matching error can be determined from the wrong code and traced to the corresponding decision tree node. The determination of customs codes is thus highly interpretable, which improves the efficiency and accuracy of pre-classification, allows errors to be corrected promptly, and further facilitates standardized and intelligent declaration.
Meanwhile, the customs code determining method provided by the embodiments of the application can reduce the dependence of the customs code determination process on professional classification specialists, reduce the dependence on manual labor with a high professional threshold, reduce the economic cost of standardized declaration for the enterprises concerned, and produce an accurate customs code as output. The tariff schedule structure of the customs code is used to carry the recognized coding rules, and hierarchical, progressive, concurrent judgment and multi-dimensional matching are performed on the multi-dimensional deconstruction result. By using the decision tree to match customs information and determine the customs code, the embodiments of the application make the determination process of the customs code a white box.
In one embodiment of the present application, identifying content corresponding to a set dimension in a description text to obtain an identification result includes: carrying out named entity recognition on the description text according to a plurality of set dimensions to obtain a named entity recognition result; identifying the description text according to a preset template item to obtain a template item identification result; and obtaining a recognition result according to the named entity recognition result and the template item recognition result.
In this embodiment, named entity recognition and template item recognition together yield comprehensive information from the description text, which facilitates accurate classification of the article to be processed from multiple angles.
In one embodiment of the present application, performing named entity recognition on the description text according to a plurality of set dimensions to obtain a named entity recognition result includes: determining the dimensions hit by the description text among the plurality of set dimensions; determining at least one dimension value of the description text in each hit dimension; and taking the hit dimension and the at least one dimension value in the hit dimension as the named entity recognition result.
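The combination of the two recognition passes can be sketched as follows. This is a minimal illustration, not the method of the application: a keyword lexicon stands in for the named entity recognition model, regular expressions stand in for the preset template items, and the names `LEXICON`, `TEMPLATES`, and `recognize` are all hypothetical.

```python
import re

# Hypothetical keyword lexicon standing in for a trained NER model:
# set dimension -> {keyword found in the text: dimension value}.
LEXICON = {
    "material": {"glass": "glass", "leather": "leather"},
    "name": {"fuse": "fuse", "suitcase": "suitcase"},
}

# Hypothetical template items for values that are hard to extract as entities.
TEMPLATES = {
    "specification": re.compile(
        r"\d+\s*[x×]\s*\d+\s*mm|\d+(?:\.\d+)?\s*[av]\b", re.IGNORECASE
    ),
}


def recognize_entities(text):
    """Return {dimension: sorted values} for the dimensions hit by the text."""
    lowered = text.lower()
    result = {}
    for dimension, keywords in LEXICON.items():
        values = sorted({value for key, value in keywords.items() if key in lowered})
        if values:
            result[dimension] = values
    return result


def recognize_templates(text):
    """Return {template item: matched strings} for templates found in the text."""
    result = {}
    for item, pattern in TEMPLATES.items():
        values = [match.group(0) for match in pattern.finditer(text)]
        if values:
            result[item] = values
    return result


def recognize(text):
    """Merge the entity result and the template item result into one result."""
    merged = recognize_entities(text)
    for item, values in recognize_templates(text).items():
        merged.setdefault(item, []).extend(values)
    return merged


result = recognize("fast-fusing glass fuse 5x20MM glass tube fuse 2A 250V")
```

Keeping the two passes separate means a template item can be added or changed without touching the entity recognizer.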
In this embodiment of the present application, the dimension value may be content corresponding to the dimension. For example, the corresponding dimension values in the material dimension may include leather, plastic, cloth fiber, and the like. The dimension value may be at least one of a numerical value, a character, and text.
According to the embodiments of the application, the description text can be analyzed in multiple dimensions, so that a comprehensive named entity recognition result is obtained. For example, for the material ABS (acrylonitrile-butadiene-styrene copolymer), the multi-dimensional named entity recognition approach employed in the embodiments of the present application can identify it as both plastic and ABS.
In one embodiment of the present application, taking the hit dimension and the at least one dimension value in the hit dimension as the named entity recognition result includes: in the case where the dimension is consistent with the tax label of a node in the decision tree, or the dimension value is consistent with the tax label of a node in the decision tree, taking the tax label, the hit dimension, and the at least one dimension value in the hit dimension as the named entity recognition result; where the tax label is determined according to a set dimension in the case where the description information corresponding to different classification codes belongs to that same set dimension among the plurality of set dimensions.
In general, the core of most of the better-performing automated pre-classification systems is an end-to-end deep learning model, whose training relies on a large amount of manually labeled data, historical customs declaration data, filing records, and the like, usually organized as two-tuples that pair a title with its HS Code. For each HS Code, a corresponding two-tuple can be obtained. For example, the HS Code of a polypropylene lunch box for holding food is 392410, giving the two-tuple [392410: polypropylene lunch box for holding food]. Most of these data are obtained by manual labeling, and a deep learning model based on manually labeled data has difficulty solving the "knowledge island" problem. As shown in fig. 3, PU (polyurethane) and TPE (thermoplastic elastomer) are both plastic materials, but in the manually labeled data the products made of the two materials belong to different HS Codes, so a pure deep learning model cannot laterally "understand" that both PU and TPE are plastics, and a TPE dinner plate may be wrongly classified as a non-plastic dinner plate or may fail to be classified at all.
In this embodiment, setting tax labels laterally expands the dimension values that belong to the same set dimension, so that if the content corresponding to a dimension hits a tax label, the matching node can be determined according to that tax label. Even where it is difficult, given the personal knowledge of a classification specialist or the knowledge limits of a deep learning model, to know that one dimension value and another actually belong to the same category, the tax labels of the decision tree nodes in this embodiment clearly express the connections and common attributes between different dimension values. For example, when "TPE dinner plate" appears in the description text, the corresponding tax label can be determined to be "plastic material" from the TPE material, so the TPE dinner plate is classified as a plastic dinner plate rather than a non-plastic one, which solves the knowledge island problem.
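The lateral expansion through tax labels can be sketched as follows. This is only an illustration under stated assumptions: the table `TAX_LABELS` and the function `expand_with_tax_labels` are hypothetical, and a real system would derive the labels from the classification rules rather than hard-code them.

```python
# Hypothetical table: set dimension -> {specific dimension value: shared tax label}.
TAX_LABELS = {
    "material": {
        "PU": "plastic",
        "TPE": "plastic",
        "ABS": "plastic",
        "polypropylene": "plastic",
    }
}


def expand_with_tax_labels(recognition):
    """Return a copy of the recognition result with tax labels appended.

    Each recognized value keeps its place; the tax label it belongs to is
    added once per dimension, so a node that mounts only the label can match.
    """
    expanded = {}
    for dimension, values in recognition.items():
        labels = TAX_LABELS.get(dimension, {})
        out = list(values)
        for value in values:
            label = labels.get(value)
            if label is not None and label not in out:
                out.append(label)
        expanded[dimension] = out
    return expanded


expanded = expand_with_tax_labels({"material": ["TPE"], "name": ["dinner plate"]})
```

With the label attached, a node whose description mentions only "plastic" can still be matched by a text that mentions only "TPE".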
In one embodiment of the present application, matching the recognition result with the decision tree and determining at least one matching node corresponding to the recognition result in the decision tree includes: in the case where the recognition result matches the description information corresponding to a node of the decision tree, taking the node corresponding to the matched description information as a matching node; and also taking the parent node of that matching node as a matching node.
Through the embodiment, the nodes with different depths can be determined as the matching nodes, so that customs codes with different digits can be determined according to the requirements.
In one embodiment of the present application, determining the customs code of the article to be processed according to the preset coding depth and the at least one matching node includes: screening the at least one matching node according to the preset coding depth and the depth of the at least one matching node to obtain a target matching node; and taking the classification code corresponding to the target matching node as the customs code of the article to be processed.
In this embodiment, the at least one matching node is screened according to the preset coding depth, where the coding depth may be the node depth of the nodes to be retained in the decision tree. For example, if a 6-digit code is needed, the set coding depth is the node depth of the nodes corresponding to 6-digit codes, and the ancestor nodes of the node corresponding to the 6-digit code are filtered out during screening.
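The two steps above, collecting matching nodes together with their parents and then screening by coding depth, can be sketched as follows. This is a simplified illustration: the node table `NODES` is hypothetical, a node counts as matched when every dimension it mounts is hit, and the number of code digits is used as a stand-in for node depth.

```python
# Hypothetical node table: classification code -> mounted description values.
NODES = {
    "4202": {"name": {"suitcase", "briefcase", "handbag"}},
    "42021": {"name": {"suitcase", "briefcase"}},
    "420211": {"material": {"leather"}},
    "420212": {"material": {"plastic", "textile"}},
    "42022": {"name": {"handbag"}},
}


def parent_of(code):
    """Return the closest ancestor code present in NODES, or None."""
    for end in range(len(code) - 1, 0, -1):
        if code[:end] in NODES:
            return code[:end]
    return None


def matching_nodes(recognition):
    """Collect directly matched nodes plus all of their ancestors."""
    result = set()
    for code, description in NODES.items():
        hit = all(
            description[dim] & set(recognition.get(dim, ()))
            for dim in description
        )
        if not hit:
            continue
        # The parent of a matching node is also taken as a matching node.
        while code is not None:
            result.add(code)
            code = parent_of(code)
    return sorted(result)


def screen_by_depth(codes, digits):
    """Keep only the matching nodes whose code has the requested digit count."""
    return [code for code in codes if len(code) == digits]


matched = matching_nodes({"name": ["suitcase"], "material": ["leather"]})
target = screen_by_depth(matched, 6)
```

Because the ancestors are kept until the screening step, the same set of matching nodes can serve requests for codes of different lengths.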
In one embodiment of the present application, the customs code determination method further includes: determining classification information of at least one level of at least one article according to the description information and the labeling information of the at least one article; aiming at single classification information, determining the node depth in the decision tree according to the classification level to which the classification information belongs; based on the node depth and the classification information, nodes corresponding to the classification information are generated in the decision tree.
In general, when a deep learning model is used to determine customs codes from customs declaration information, the model must be trained with labeled data. The volume of labeled data used for training is not necessarily proportional to its coverage, and it is difficult to estimate how many labeled samples are needed to cover the commonly used HS Codes. The coverage and accuracy of the classification system can be effectively increased by expanding the manually labeled data according to a certain pattern and logic. At the same time, manual intervention is costly, the cycle for releasing changes online is long, and the model must be retrained.
According to the embodiments of the application, the nodes of the decision tree are updated according to the description information and labeling information of articles, so a single piece of data is enough for the corresponding node to be updated accurately, efficiently, and quickly, without requiring many different data samples for a model to learn the classification rules. When manual intervention is needed, the node concerned can be located directly and the intervention applied to it, so the cycle for releasing changes online is short. The decision tree can be released quickly when coding rules are issued and updated quickly when they change, and neither a rule update nor the correction of a node error requires rebuilding the decision tree.
In one embodiment of the present application, the customs code determination method further includes: determining new description information according to the new classification rule; determining new description information corresponding to at least one customs code according to the new description information; and updating the corresponding nodes of the customs codes in the decision tree according to the new description information.
According to this customs code determining method, the nodes affected by an updated classification rule can be updated promptly and accurately according to the new rule, so that the method stays essentially in step with the latest official regulations.
The embodiments of the application also provide a customs clearance code determining method, which includes: obtaining a description text input by a customs declarant for an article to be processed; identifying the content belonging to the set dimensions in the description text to obtain a recognition result; matching the recognition result with a decision tree, and determining at least one matching node corresponding to the recognition result in the decision tree, where the decision tree is generated according to the correspondence between the classification information and the classification codes of commodities, each node of the decision tree corresponds to one classification code, and the nodes of the decision tree include the at least one matching node; determining a predicted customs clearance code of the article to be processed according to the coding depth and the at least one matching node; and determining the customs clearance code of the article to be processed according to received code review information and the predicted customs clearance code.
In this embodiment, the customs declarant may be a party who needs to carry, transport, or mail the article through customs, for example, an e-commerce collection point in the country from which the article is mailed.
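A node-level update of this kind can be sketched as follows, assuming for illustration that the tree is indexed by classification code in a plain dictionary. The function name `apply_rule_update` and the sample descriptions are hypothetical.

```python
def apply_rule_update(tree, new_rules):
    """Update or add only the nodes named in new_rules; return the changed codes.

    tree and new_rules both map a classification code to its description
    information. Nodes that the new rules do not mention are left untouched,
    so nothing has to be rebuilt or retrained.
    """
    changed = []
    for code, description in new_rules.items():
        if tree.get(code) != description:
            tree[code] = dict(description)
            changed.append(code)
    return changed


tree = {
    "4202": {"name": ["suitcase", "briefcase"]},
    "420211": {"material": ["leather"]},
}
new_rules = {
    "420211": {"material": ["leather", "composition leather"]},
    "420219": {"material": ["other"]},
}
changed = apply_rule_update(tree, new_rules)
```

Returning the list of changed codes gives a reviewer a direct record of which nodes a rule release touched.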
The classification code and the customs code in the customs code determining method may be codes determined according to HS rules. The HS Code is an important tool not only for the WCO but also for all government institutions (e.g., customs, finance, and auditing departments) and commercial institutions (e.g., import-export trade, e-commerce, logistics, and data analysis departments) participating in world trade. With the development of world trade, the compliance requirements that countries around the world impose on e-commerce platforms are becoming stricter. For example, the European Union abolished the EUR 22 value-added tax (VAT) exemption threshold for e-commerce platforms from July 1, 2021, and requires e-commerce platforms to collect VAT in some transaction scenarios. E-commerce platforms have successively designed their own VAT calculation schemes. Pre-classifying a commodity to obtain its HS Code and then associating the VAT rate is the most direct way to determine the VAT. However, pre-classifying all the commodities on an e-commerce platform is extremely costly: the commodities on a platform can number in the millions or even hundreds of millions, and the total keeps growing as merchants continually list new commodities. In the embodiments of the application, pre-classification service personnel, also called pre-classifiers, are persons who have passed the pre-classification qualification examination of the China Customs Brokers Association, have obtained the pre-classification service qualification, and are engaged in pre-classification services for import and export goods. In short, they are professional technicians who perform commodity pre-classification for enterprises clearing customs.
If an e-commerce platform relied solely on pre-classifiers to classify manually, the cost would be too high: the price of classifying a single commodity ranges from a few tenths of a yuan to tens of yuan (RMB), and a task of this order of magnitude cannot be completed manually. The e-commerce logistics and international parcel links also need HS Codes for customs clearance. It is therefore important to build an automated pre-classification system and the related basic capabilities.
According to this customs code determining method, there is no need to rely on highly specialized pre-classifiers (or classification specialists) for manual classification: the customs declarant only needs to input the description text of the article to be cleared to obtain its customs code, and finally only needs to check the output code. Compared with the manual work of traditional classification specialists or pre-classifiers, the labor intensity and the demands on professionals are greatly reduced. Meanwhile, the embodiments of the application can process the declarant's description text quickly on a computing device, and the introduced decision tree allows multiple threads to process the description texts of several different articles in parallel. As the scale of cross-border e-commerce transactions grows day by day, the embodiments of the application can therefore both handle the required number of tasks and meet the accuracy requirements for customs clearance codes, which benefits the development of cross-border e-commerce trade.
In one embodiment, the customs code determination method further includes: acquiring, according to demand setting information, the description information of the corresponding classification rule newly issued by the authorities; determining new description information and/or new labeling information of at least one category of commodities according to the description information of the newly issued classification rule; and updating at least one node of the decision tree according to the new description information and/or the new labeling information.
In this embodiment, the nodes of the decision tree can be updated as soon as a newly issued official classification rule is acquired, which keeps the decision basis of the decision tree timely and valid.
The embodiment of the application also provides a classification code determining method, which comprises the following steps: identifying the content corresponding to the set dimension in the description text of the object to be processed to obtain an identification result; matching the identification result with a decision tree, and determining at least one matching node corresponding to the identification result in the decision tree; the decision tree is generated according to the corresponding relation between the classification information and the classification codes of a plurality of commodities, and each node of the decision tree corresponds to one classification code; the nodes of the decision tree comprise at least one matching node; and determining the classified codes of the objects to be processed according to the coded depth and at least one matching node.
The classification code determining method provided by the embodiment of the application can be applied to a terminal or a system of a user who needs to classify the articles. For example, tax departments need to collect value-added tax for the traded commodity according to commodity category; the financial department needs to determine relevant tax policies according to the classification of commodities; the statistics department needs to make statistics on commodity transaction data according to commodity classification. The user who needs to classify the articles can determine the article classification by using the classification code determining method provided by the embodiment of the application.
The embodiment of the application also provides a decision tree construction method, which comprises the following steps: acquiring sample data containing article classification rules; according to the sample data, obtaining classification codes of a plurality of classifications of the articles and description information of the articles corresponding to each classification code; constructing a decision tree according to the relation between the classification codes, so that each offspring node of the decision tree corresponds to one classification code, and the child node of each father node in the decision tree corresponds to the child classification under the classification code to which the father node belongs; and mounting the description information of the article corresponding to each classification code on the corresponding node of each classification code in the decision tree.
In one embodiment, under the condition that the description information corresponding to different classification codes belongs to the same set dimension in the plurality of set dimensions, obtaining the tax label according to the same set dimension, and mounting the tax label on all nodes corresponding to the same set dimension.
The decision tree used in any of the embodiments of the present application may be generated by the decision tree construction method provided in the foregoing embodiment.
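The construction method can be sketched as follows, assuming for illustration that the sample data has already been reduced to pairs of classification code and description, and that a child code extends its parent code by one or more digits. The class `TreeNode`, the function `build_tree`, and the shortened descriptions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    code: str
    description: str = ""
    children: list = field(default_factory=list)


def build_tree(rules):
    """Build a multi-way tree from {classification code: description}.

    Returns the synthetic root (empty code) and an index from code to node.
    """
    root = TreeNode(code="")
    index = {"": root}
    # Shorter codes first, so a parent always exists before its children.
    for code in sorted(rules, key=lambda c: (len(c), c)):
        node = TreeNode(code=code, description=rules[code])
        parent_code = code[:-1]
        # Walk back to the longest prefix that is already a node.
        while parent_code not in index:
            parent_code = parent_code[:-1]
        index[parent_code].children.append(node)
        index[code] = node
    return root, index


rules = {
    "4202": "suitcases, briefcases, handbags and similar containers",
    "42021": "suitcases, briefcases, school bags and similar containers",
    "420211": "outer surface of leather or composition leather",
    "420212": "outer surface of plastic or textile material",
    "42022": "handbags",
}
root, index = build_tree(rules)
```

The index gives direct access to any node by its code, which is what makes the node-level lookups and updates described in the other embodiments cheap.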
In one example of the present application, as shown in fig. 4, for the commodity to be cleared, the description text "fast-fusing glass fuse high-quality 5×20MM glass tube fuse 2a 250v" extracted from the customs information can be recognized by the NER (Named Entity Recognition) method. If part of the information in the description text is difficult to extract, it can be supplemented by template matching. Recognition results for a plurality of set dimensions, such as the material, class, column name, application object, purpose, and specification of the commodity to be processed, are obtained from the description text. The obtained dimension values (i.e., the recognition results of the set dimensions) are searched for in the decision tree and checked for matches to obtain at least one matching node. A matching score is calculated for each matching node and used for recommendation. The decision search process can be sharded and accelerated in parallel, and assertion information can be added for intervention. In this example, an assertion may refer to a first-order logic expression in a program, typically a logical predicate that evaluates to true or false.
In fig. 4, named entity recognition results for three set dimensions, namely name, scene, and material, are obtained by performing named entity recognition on the description text; by performing template item recognition (template matching) on the description text, a recognition result for the set template item concerning the specification is obtained. The named entity recognition result and the template item recognition result are taken as the recognition result, a MapReduce job is executed, and the result is matched against the decision tree. The decision process for node 8536, the parent node of 853610 and 853669, is omitted from the embodiment shown in fig. 4. NER in the embodiments of the present application is a key basic task in NLP (Natural Language Processing): it identifies entities with specific meanings in text, mainly including person names, place names, organization names, proper nouns, and the like.
For the commodity to be processed, the dimensions of the description text to be recognized are denoted DIM and the dimension weights DIMW; the i-th dimension is DIMi and the i-th dimension weight is DIMWi. A dimension may have n values, and the j-th value of the i-th dimension is denoted DIMiVj. The process by which NER analyzes the commodity in multiple dimensions is shown in fig. 5; the number of dimension values recognized in each dimension is not fixed, and the embodiments of the present application place no limit on it. In this example, a dimension and a value in that dimension may correspond to a class and one of its sub-classes, and thus also to a node and one of its child nodes in the decision tree.
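The sharded, parallel search can be sketched with a thread pool, as a map step over shards followed by a merge step. This is an illustration only and does not reproduce the MapReduce job of the example: the node table `NODES`, the shard count, and the matching condition are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical node table: classification code -> mounted description values.
NODES = {
    "8536": {"name": ["fuse", "switch", "plug", "socket"]},
    "853610": {"name": ["fuse"]},
    "853669": {"name": ["plug", "socket"]},
}


def match_shard(shard, recognition):
    """Map step: return the codes in one shard whose description is hit."""
    return [
        code
        for code, description in shard
        if all(
            set(description[dim]) & set(recognition.get(dim, ()))
            for dim in description
        )
    ]


def parallel_match(nodes, recognition, shards=2):
    """Split the nodes into shards, match each shard in a thread, then merge."""
    items = sorted(nodes.items())
    parts = [items[i::shards] for i in range(shards)]
    with ThreadPoolExecutor(max_workers=shards) as pool:
        results = list(pool.map(lambda part: match_shard(part, recognition), parts))
    # Reduce step: merge the per-shard results into one sorted list.
    return sorted(code for part in results for code in part)


matched = parallel_match(NODES, {"name": ["fuse"], "material": ["glass"]})
```

Each shard is matched independently, so the result does not depend on how many shards are used.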
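The text does not give a formula for the matching score, so the following is only one possible reading, stated as an assumption: the score of a node is the total weight DIMWi of the dimensions in which a recognized value hits a value mounted on the node, divided by the total weight of all dimensions mounted on the node. The function name and the default weight of 1.0 are hypothetical.

```python
def matching_score(node_description, recognition, weights):
    """Weighted share of the node's dimensions that the recognition result hits.

    node_description: {dimension: values mounted on the node}
    recognition:      {dimension: values recognized in the description text}
    weights:          {dimension: weight}, with 1.0 assumed for missing entries
    """
    total = sum(weights.get(dim, 1.0) for dim in node_description)
    if total == 0:
        return 0.0
    hit = sum(
        weights.get(dim, 1.0)
        for dim, values in node_description.items()
        if set(values) & set(recognition.get(dim, ()))
    )
    return hit / total


weights = {"name": 2.0, "material": 1.0}
node = {"name": ["suitcase", "briefcase"], "material": ["leather"]}

full = matching_score(node, {"name": ["suitcase"], "material": ["leather"]}, weights)
partial = matching_score(node, {"name": ["suitcase"], "material": ["plastic"]}, weights)
```

Under this reading a node that is hit in every mounted dimension scores 1.0, and a miss in a lightly weighted dimension costs less than a miss in a heavily weighted one.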
In one example of the present application, a multi-way decision tree may be constructed by NER, and specific steps may include the following steps S1-S3.
In step S1, sample data is identified and extracted in the NER mode, so as to obtain description information corresponding to each classification code, and the description information is mounted on nodes of the decision tree. The sample data can comprise labeling data, and the assertion rules of the classification codes corresponding to the corresponding nodes can be configured through the labeling data. Under the condition that the classification codes are HS codes, the structure of the decision tree is the same as the structure of the HS Code classification rule, the node corresponds to the HS Code, and the information of node mounting can comprise the description information of all the child nodes, namely the content obtained by analyzing the child dimension solutions of the dimension corresponding to the node in the HS Code classification rule. The decision tree node can also be provided with an assertion rule (manually configured assertion), as shown in fig. 6, named entity recognition is performed according to a plurality of labeling data, namely labeling data 1-n, so as to obtain n dimension recognition results, wherein each dimension recognition result comprises an indefinite number of dimension values. And according to the recognition results of each dimension and the nodes of the corresponding HS Code generation decision tree (tax tree in fig. 6), mounting the NER recognition result aggregation results (NER result aggregation in fig. 6) corresponding to each node and the manual configuration assertion (assertion rule) on each node.
In step S2, the description information corresponding to each node is sorted out. In this example, the description information of the HS Code classification rule is also identified and extracted by NER. The description information extracted for different HS Codes may overlap or be identical, and this shared description information may be referred to as a tax label. Thus, among the tax labels of the nodes extracted according to the HS Code classification rule, some different HS Codes may have the same tax label. If only one tax label exists in a dimension corresponding to an HS Code, all values in that dimension for the HS Code are mounted under the tax label when step S1 is executed. If multiple tax labels exist, the description information of the child nodes under the HS Code may need to be sorted out and assertion rules configured. If the cost of sorting out a label is too high, the label can be ignored. Sorting out labels makes it possible to expand the DIM values horizontally across different HS Codes, which helps alleviate the knowledge-island problem.
In step S3, the matching decisions of the decision tree are set. The matching logic that may be provided includes exact matching of dimension values, similarity matching of dimension values, and/or synonym matching of dimension values, as well as satisfying an assertion rule or belonging to a tax label.
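The dimension-value matching logic of step S3 can be illustrated with a minimal Python sketch. It is not the implementation of this application: the function name `match_value`, the synonym table, the 0.8 similarity threshold and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, and assertion judgment and tax label attribution are left out.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional

# Hypothetical synonym table mapping a surface form to a canonical form.
SYNONYMS = {"tee": "t-shirt", "tshirt": "t-shirt"}


def _norm(text: str) -> str:
    """Normalize a dimension value before comparison."""
    return text.strip().lower()


def match_value(candidate: str, mounted: Iterable[str],
                threshold: float = 0.8) -> Optional[str]:
    """Return 'exact', 'synonym', 'similar', or None when nothing matches.

    `candidate` is a dimension value recognized in the description text;
    `mounted` holds the values mounted on a node for the same dimension.
    """
    c = _norm(candidate)
    values = [_norm(v) for v in mounted]
    # 1) Exact match of dimension values.
    if c in values:
        return "exact"
    # 2) Synonym match: both sides map to the same canonical form.
    canon = SYNONYMS.get(c, c)
    if any(canon == SYNONYMS.get(v, v) for v in values):
        return "synonym"
    # 3) Similarity match (assumed measure: SequenceMatcher ratio).
    if any(SequenceMatcher(None, c, v).ratio() >= threshold for v in values):
        return "similar"
    return None
```

Checking exact matches first and similarity last keeps the cheapest and most reliable rule in front; a real system would likely tune the threshold per dimension.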
The decision tree provided by this example may be applied to the Electronic World Trade Platform (eWTP). eWTP is a private-sector-led, multi-stakeholder initiative that aims to develop electronic trade rules through public-private collaboration and to create a more efficient and effective policy and business environment for the development of cross-border electronic trade (eTrade). The combination of new technology and globalization has created a new form of trade, and new trade needs new rules; the eWTP initiative arose against this background and has been accepted by the international community. The eWTP initiative aims to create a freer, more innovative and more inclusive international trade environment by facilitating public-private dialogue and sharing best practices.
In this example, once the decision tree has been constructed, the process of pre-classifying the HS Code of an article is converted into decision logic within the decision tree. The structure of the decision tree also naturally supports concurrent processing of branches, which makes classification more efficient. For example, after the information of an article is input, the description text of the article is extracted and NER is performed on it. If the content of a preset dimension of the description text matches a dimension value of the corresponding dimension of a node in the decision tree, or satisfies an assertion rule preset for that dimension of the node, the description text of the article is considered to match the node. Decisions then continue layer by layer deeper into the decision tree until the traversal exits at some child node. A weighted score is calculated according to the dimension weights, and the recommended HS Codes are obtained by ranking the nodes at which the traversal finally exited in descending order of matching score.
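The layer-by-layer decision and weighted scoring described above can be sketched as follows. The text does not give the exact scoring formula, so this sketch assumes that a node's score is the sum of the weights DIMWi of the dimensions whose recognized values intersect the values mounted on the node, and that scores accumulate along the path. The `Node` class, the function names and the sample tree are hypothetical, and assertion rules are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Node:
    code: str  # classification code; "" for the root
    dim_values: Dict[str, Set[str]] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def node_score(node: Node, recognized: Dict[str, Set[str]],
               weights: Dict[str, float]) -> float:
    """Sum the weights of dimensions whose recognized values hit the node."""
    return sum(weights.get(dim, 0.0)
               for dim, values in recognized.items()
               if values & node.dim_values.get(dim, set()))


def classify(root: Node, recognized: Dict[str, Set[str]],
             weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Descend through matching children; rank exit nodes by score, descending."""
    results: List[Tuple[str, float]] = []

    def descend(node: Node, acc: float) -> None:
        scored = [(c, node_score(c, recognized, weights)) for c in node.children]
        matched = [(c, s) for c, s in scored if s > 0]
        if not matched:
            # No child matches: the traversal exits at this node.
            if node is not root:
                results.append((node.code, acc))
            return
        for child, s in matched:
            descend(child, acc + s)

    descend(root, 0.0)
    return sorted(results, key=lambda r: r[1], reverse=True)


# Illustrative tree and input; the codes and values are sample data only.
tree = Node("", children=[
    Node("09", {"material": {"coffee"}}, [
        Node("0901", {"material": {"coffee"}}, [
            Node("090121", {"process": {"roasted"}}),
            Node("090111", {"process": {"raw"}}),
        ]),
    ]),
    Node("61", {"material": {"cotton"}}),
])
recognized = {"material": {"coffee"}, "process": {"roasted"}}
weights = {"material": 1.0, "process": 2.0}
ranking = classify(tree, recognized, weights)
```

Because each matched child is explored independently, the branches could be processed concurrently, which is the property the text attributes to the tree structure.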
In other embodiments of the present application, the assertion rule may be a preset dimension, that is, the preset dimension in other embodiments may include the assertion rule, or the dimension value under the preset dimension includes the assertion rule.
It can be seen that the decision tree construction method provided by this example helps realize a general-purpose, low-cost and practically deployable way of pre-classifying and recommending customs codes. Compared with existing approaches, this example decomposes the HS Code and the labeling data along multiple dimensions using a named entity recognition (Named Entity Recognition, NER) algorithm and builds a decision tree query engine on the resulting data. Dimension weights and custom assertions are added to the decision tree, and after the decision tree is constructed the HS Code pre-classification process is converted into decision logic within the decision tree.
In this example, the tax descriptions are deconstructed by NER, and some information-island problems are alleviated by sorting out horizontal tax labels. Whether labeled samples need to be supplemented is judged by analyzing the importance of each HS Code and the number of dimension values mounted on it, so the number of labels to be supplemented can be estimated effectively. The related labels also give a better handle for querying search engines and knowledge bases to supplement data. Because the description text of an article is matched against the nodes of the decision tree, the classification process is interpretable at low cost: only the decision log and the calculation process in the decision tree need to be recorded. The cost of manual intervention is low, customizing dimension data or assertion rules is convenient, and changes can target a specific HS Code with a short release cycle.
The embodiment of the application also provides a customs code determining device, as shown in fig. 7, including:
a customs information acquisition module 501, configured to acquire customs information of an object to be processed;
a description text acquisition module 502, configured to acquire a description text of the object to be processed according to the customs declaration information;
a recognition result obtaining module 503, configured to identify content belonging to a set dimension in the description text to obtain a recognition result;
a matching module 504, configured to match the recognition result with a decision tree and determine at least one matching node corresponding to the recognition result in the decision tree, wherein the decision tree is generated according to the correspondence between the description information of a plurality of articles and classification codes, each node of the decision tree corresponds to a classification code and has mounted on it the description information of the article corresponding to that classification code, and the nodes of the decision tree include the at least one matching node; and
a depth module 505, configured to determine a customs code for the object to be processed based on the coded depth and the at least one matching node.
In one embodiment, the recognition result acquisition module includes: the first recognition unit is used for carrying out named entity recognition on the descriptive text according to a plurality of set dimensions to obtain a named entity recognition result; the second recognition unit is used for recognizing the descriptive text according to a preset template item to obtain a template item recognition result; and the recognition result unit is used for obtaining a recognition result according to the named entity recognition result and the template item recognition result.
In an embodiment, the recognition result unit is further configured to: determine the dimension that the description text hits among the plurality of set dimensions; determine at least one dimension value of the description text in the hit dimension; and take the hit dimension and the at least one dimension value in the hit dimension as the named entity recognition result.
In an embodiment, the recognition result unit is further configured to: when the dimension is consistent with the tax label of a node in the decision tree, or when the dimension value is consistent with the tax label of a node in the decision tree, take the tax label, the hit dimension and the at least one dimension value in the hit dimension as the named entity recognition result; the tax label is determined according to the same set dimension when the description information corresponding to different classification codes belongs to the same set dimension among the plurality of set dimensions.
In one embodiment, the matching module includes: the first matching unit is used for taking the node corresponding to the description information matched with the identification result as a matching node when the identification result is matched with the description information corresponding to the node of the decision tree; and the second matching unit is used for taking the parent node of the matching node as the matching node.
In one embodiment, the depth module includes: the screening unit is used for screening the at least one matching node according to the preset coding depth and the depth of the at least one matching node to obtain a target matching node; and the target matching node processing unit is used for taking the classification code corresponding to the target matching node as the customs code of the object to be processed.
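The parent expansion and depth screening performed by the matching and depth modules can be sketched in Python as below. This is an illustration under stated assumptions, not the method of this application: nodes are represented only by their codes, the tree levels are assumed to correspond to code lengths 2, 4 and 6, parent expansion is applied repeatedly up to the top level, and the names `LEVELS`, `expand_with_parents` and `screen` are hypothetical.

```python
from typing import Iterable, List, Set

# Assumed code lengths at tree depths 1, 2 and 3.
LEVELS = (2, 4, 6)


def depth_of(code: str) -> int:
    """Tree depth of a node, derived from its code length."""
    return LEVELS.index(len(code)) + 1


def expand_with_parents(codes: Iterable[str]) -> Set[str]:
    """Treat every ancestor of a matching node as a matching node too."""
    expanded: Set[str] = set()
    for code in codes:
        for length in LEVELS:
            if length <= len(code):
                expanded.add(code[:length])
    return expanded


def screen(codes: Iterable[str], coded_depth: int) -> List[str]:
    """Keep only the matching nodes that sit at the preset coded depth."""
    return sorted(c for c in expand_with_parents(codes)
                  if depth_of(c) == coded_depth)
```

For instance, with a preset coded depth of 2, a match on the illustrative code "090121" yields the target node "0901", whose classification code would then be taken as the customs code.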
In one embodiment, the customs code determination apparatus is further configured to: determine classification information of at least one level of at least one article according to the description information and the labeling information of the at least one article; for a single piece of classification information, determine the node depth in the decision tree according to the classification level to which the classification information belongs; and generate the node corresponding to the classification information in the decision tree based on the node depth and the classification information.
In one embodiment, the customs code determination apparatus further includes: the descriptive information acquisition unit is used for determining new descriptive information according to the new classification rule; the description information updating unit is used for determining new description information corresponding to at least one customs code according to the new description information; and updating the corresponding nodes of the customs codes in the decision tree according to the new description information.
The embodiment of the application also provides a customs clearance code determining device, which comprises: obtaining a description text input by a customs declaration aiming at an object to be processed; identifying the content belonging to the set dimension in the description text to obtain an identification result; matching the identification result with a decision tree, and determining at least one matching node corresponding to the identification result in the decision tree; the decision tree is generated according to the corresponding relation between the classification information and the classification codes of the commodities, and each node of the decision tree corresponds to one classification code; the nodes of the decision tree comprise at least one matching node; according to the coding depth and at least one node, determining a predictive customs clearance code of the object to be processed; and determining the customs clearance code of the object to be processed according to the received code rechecking information and the predicted customs clearance code.
In the embodiment of the present application, the customs code determining apparatus further includes: the acquisition rule acquisition module is used for acquiring description information of the corresponding classification rule newly issued by the authorities according to the demand setting information; the description information updating module is used for determining new description information of at least one category of commodities according to the description information of the newly issued classification rule; and the node updating module is used for updating at least one node of the decision tree according to the new description information.
The embodiment of the application also provides a classification code determining device, which comprises: the identification result module is used for identifying the content corresponding to the set dimension in the description text of the object to be processed to obtain an identification result; the matching module is used for matching the identification result with a decision tree, and determining at least one matching node corresponding to the identification result in the decision tree; the decision tree is generated according to the corresponding relation between the classification information and the classification codes of a plurality of commodities, and each node of the decision tree corresponds to one classification code; the nodes of the decision tree comprise at least one matching node; and the classification coding module is used for determining classification codes of the objects to be processed according to the coding depth and the at least one matching node.
The embodiment of the application also provides a decision tree construction device, which comprises: the sample data acquisition module is used for acquiring sample data containing article classification rules; the sample data processing module is used for acquiring classification codes of a plurality of classifications of the articles and description information of the articles corresponding to each classification code according to the sample data; the decision tree construction module is used for constructing a decision tree according to the relation between the classification codes, so that each offspring node of the decision tree corresponds to one classification code, and the child node of each father node in the decision tree corresponds to the child classification under the classification code to which the father node belongs; and the mounting module is used for mounting the description information of the article corresponding to each classification code on the corresponding node of each classification code in the decision tree.
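The construction and mounting steps described above can be sketched in Python as follows. This is a minimal sketch under assumptions: codes are nested by prefix with assumed lengths of 2, 4 and 6 digits, description information is represented as a mapping from dimension to a set of values, and the names `Node`, `mount` and `build_tree` as well as the sample data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Assumed code lengths per tree level.
PREFIX_LENGTHS = (2, 4, 6)


@dataclass
class Node:
    code: str  # classification code; "" for the root
    dim_values: Dict[str, Set[str]] = field(default_factory=dict)
    children: Dict[str, "Node"] = field(default_factory=dict)


def mount(root: Node, code: str, description: Dict[str, Set[str]]) -> Node:
    """Create the path for `code` if needed and mount its description on it."""
    node = root
    for length in PREFIX_LENGTHS:
        if length > len(code):
            break
        prefix = code[:length]
        # Each child corresponds to a sub-classification of its parent's code.
        node = node.children.setdefault(prefix, Node(prefix))
    for dim, values in description.items():
        node.dim_values.setdefault(dim, set()).update(values)
    return node


def build_tree(samples: Dict[str, Dict[str, Set[str]]]) -> Node:
    """Build a tree from {classification code: description information}."""
    root = Node("")
    for code, description in samples.items():
        mount(root, code, description)
    return root


# Illustrative sample data, not taken from the source.
samples = {
    "090121": {"material": {"coffee"}, "process": {"roasted"}},
    "090111": {"material": {"coffee"}, "process": {"raw"}},
    "610910": {"material": {"cotton"}},
}
tree = build_tree(samples)
```

Keeping children in a dictionary keyed by code makes it cheap to mount further description information on an existing node when new sample data arrives.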
In one embodiment, the decision tree construction device further includes a tax label module, configured to, when the description information corresponding to different classification codes belongs to the same set dimension among the plurality of set dimensions, obtain a tax label according to that same set dimension and mount the tax label on all nodes corresponding to that same set dimension.
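One way to read the tax label step is sketched below in Python. The representation is an assumption: a tax label is modeled as a (dimension, value) pair shared by the description information of at least two different classification codes, and mounting is modeled as a mapping from code to its set of labels. The function name `mount_tax_labels` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Label = Tuple[str, str]  # assumed form of a tax label: (dimension, value)


def mount_tax_labels(
    descriptions: Dict[str, Dict[str, Set[str]]]
) -> Dict[str, Set[Label]]:
    """Return {classification code: tax labels shared with other codes}."""
    # Record which codes carry each (dimension, value) pair.
    owners: Dict[Label, Set[str]] = defaultdict(set)
    for code, dims in descriptions.items():
        for dim, values in dims.items():
            for value in values:
                owners[(dim, value)].add(code)
    # A pair shared by two or more codes becomes a tax label on each of them.
    labels: Dict[str, Set[Label]] = defaultdict(set)
    for label, codes in owners.items():
        if len(codes) > 1:
            for code in codes:
                labels[code].add(label)
    return dict(labels)


# Illustrative sample data, not taken from the source.
descriptions = {
    "090121": {"material": {"coffee"}, "process": {"roasted"}},
    "090111": {"material": {"coffee"}, "process": {"raw"}},
    "610910": {"material": {"cotton"}},
}
tax_labels = mount_tax_labels(descriptions)
```

Here the two illustrative coffee codes share the label ("material", "coffee"), while the third code receives no label because none of its values recur elsewhere.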
Fig. 8 is a block diagram of an electronic device used to implement an embodiment of the present application. As shown in fig. 8, the electronic device includes: a memory 610 and a processor 620, the memory 610 storing a computer program executable on the processor 620. The processor 620, when executing the computer program, implements the methods of the above-described embodiments. The number of memory 610 and processors 620 may be one or more.
The electronic device further includes:
the communication interface 630 is used for communicating with external devices for data interactive transmission.
If the memory 610, the processor 620, and the communication interface 630 are implemented independently, they may be connected to each other and communicate with each other through a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in fig. 8, but this does not mean that there is only one bus or only one type of bus.
Alternatively, in a specific implementation, if the memory 610, the processor 620, and the communication interface 630 are integrated on a chip, the memory 610, the processor 620, and the communication interface 630 may communicate with each other through internal interfaces.
The present embodiments provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the methods provided in the embodiments of the present application.
The embodiment of the application also provides a chip, which comprises a processor configured to call instructions stored in a memory and run them, so that a communication device on which the chip is installed executes the method provided by the embodiment of the application.
The embodiment of the application also provides a chip, which comprises: the input interface, the output interface, the processor and the memory are connected through an internal connection path, the processor is used for executing codes in the memory, and when the codes are executed, the processor is used for executing the method provided by the application embodiment.
It should be appreciated that the processor may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or any conventional processor or the like. It is noted that the processor may be a processor supporting an advanced reduced instruction set machine (Advanced RISC Machines, ARM) architecture.
Further alternatively, the memory may include a read-only memory and a random access memory. The memory may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may include Read-Only Memory (ROM), Programmable ROM (PROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory, among others. Volatile memory can include Random Access Memory (RAM), which may act as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate Synchronous DRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions in accordance with the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the different embodiments or examples described in this specification, and the features of the different embodiments or examples, may be combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
Any process or method described in flow charts or otherwise herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order, depending on the functions involved.
Logic and/or steps described in the flowcharts or otherwise described herein may, for example, be considered an ordered listing of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. All or part of the steps of the methods of the embodiments described above may be completed by a program instructing the associated hardware; the program, when executed, performs one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules described above, if implemented in the form of software functional modules and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
The foregoing is merely exemplary embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think of various changes or substitutions within the technical scope of the present application, which should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (14)

1. A customs code determination method, comprising:
acquiring customs information of an object to be processed;
acquiring a description text of the object to be processed according to the customs clearance information;
identifying the content belonging to the set dimension in the description text to obtain an identification result;
matching the identification result with a decision tree, and determining at least one matching node corresponding to the identification result in the decision tree;
the decision tree is generated according to the corresponding relation between the description information of the plurality of articles and the classification codes; the nodes of the decision tree correspond to a classification code, and the description information of the article corresponding to the classification code is mounted; the decision tree comprises the at least one matching node;
and determining the customs code of the object to be processed according to the coded depth and the at least one matching node.
2. The method according to claim 1, wherein the identifying the content corresponding to the set dimension in the description text to obtain the identification result includes:
carrying out named entity recognition on the description text according to a plurality of set dimensions to obtain a named entity recognition result;
identifying the description text according to a preset template item to obtain a template item identification result;
and obtaining the recognition result according to the named entity recognition result and the template item recognition result.
3. The method according to claim 2, wherein performing named entity recognition on the description text with multiple set dimensions to obtain a named entity recognition result includes:
determining a dimension in which the descriptive text hits in the plurality of set dimensions;
determining at least one dimension value of the descriptive text in the hit dimension;
and taking the hit dimension and the at least one dimension value in the hit dimension as the named entity recognition result.
4. A method according to claim 3, wherein said taking the hit dimension and the at least one dimension value in the hit dimension as said named entity recognition result comprises:
taking the tax label, the hit dimension and the at least one dimension value in the hit dimension as the named entity recognition result under the condition that the dimension is consistent with the tax label of any node in the decision tree or the dimension value is consistent with the tax label of any node in the decision tree;
the tax label is determined as follows: under the condition that the description information corresponding to different classification codes belongs to the same set dimension among a plurality of set dimensions, the tax label is determined according to that same set dimension.
5. The method of claim 1, wherein matching the recognition result with a decision tree, determining at least one matching node in the decision tree corresponding to the recognition result, comprises:
under the condition that the identification result is matched with the description information corresponding to the nodes of the decision tree, taking the nodes corresponding to the description information matched with the identification result as matched nodes;
and taking the parent node of the matching node as the matching node.
6. The method of claim 1, wherein said determining a customs code for said item to be processed based on said coded depth and said at least one node comprises:
screening the at least one matching node according to the coding depth and the depth of the at least one matching node to obtain a target matching node;
and taking the classification code corresponding to the target matching node as the customs code of the object to be processed.
7. The method according to claim 1, wherein the method further comprises:
determining classification information of at least one level of at least one article according to the description information and the labeling information of the at least one article;
aiming at the single classification information, determining the node depth in the decision tree according to the classification level to which the classification information belongs;
and generating nodes corresponding to the classification information in the decision tree according to the node depth and the classification information.
8. The method of claim 7, wherein the method further comprises:
determining new description information according to the new classification rule;
determining new description information corresponding to at least one customs code according to the new description information;
and updating the corresponding nodes of the customs codes in the decision tree according to the new description information.
9. A customs clearance code determination method, comprising:
obtaining a description text input by a customs declaration aiming at an object to be processed;
identifying the content belonging to the set dimension in the description text to obtain an identification result;
matching the identification result with a decision tree, and determining at least one matching node corresponding to the identification result in the decision tree; the decision tree is generated according to the corresponding relation between the classification information and the classification codes of a plurality of commodities, and each node of the decision tree corresponds to one classification code; the decision tree comprises the at least one matching node;
determining a predicted customs clearance code of the object to be processed according to the code depth and the at least one node;
and determining the customs clearance code of the object to be processed according to the received code rechecking information and the predicted customs clearance code.
10. The method according to claim 9, wherein the method further comprises:
acquiring corresponding naming rule description information newly issued by authorities according to the demand setting information;
determining new description information of at least one category of commodities according to the newly issued naming rule description information;
and updating at least one node of the decision tree according to the new description information.
11. A method of determining a classification code, comprising:
identifying the content corresponding to the set dimension in the description text of the object to be processed to obtain an identification result;
matching the identification result with a decision tree, and determining at least one matching node corresponding to the identification result in the decision tree; the decision tree is generated according to the corresponding relation between the classification information and the classification codes of a plurality of commodities, and each node of the decision tree corresponds to one classification code; the decision tree comprises the at least one matching node;
and determining the classification codes of the objects to be processed according to the coded depth and the at least one matching node.
12. A method of decision tree construction, comprising:
obtaining sample data comprising article classification rules;
acquiring classification codes of a plurality of classifications of the articles and description information of the articles corresponding to each classification code according to the sample data;
constructing a decision tree according to the relation between the classification codes, so that each offspring node of the decision tree corresponds to one classification code, and the child node of each father node in the decision tree corresponds to the child classification under the classification code to which the father node belongs;
and mounting the description information of the article corresponding to each classification code on the corresponding node of each classification code in the decision tree.
13. The method as recited in claim 12, further comprising: and under the condition that the description information corresponding to different classification codes belongs to the same set dimension in a plurality of set dimensions, obtaining a tax label according to the same set dimension, and mounting the tax label on all nodes corresponding to the same set dimension.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory, the processor implementing the method of any one of claims 1-13 when the computer program is executed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310328600.XA CN116776831B (en) 2023-03-27 2023-03-27 Customs, customs code determination, decision tree construction method and medium

Publications (2)

Publication Number Publication Date
CN116776831A CN116776831A (en) 2023-09-19
CN116776831B CN116776831B (en) 2023-12-22

Family

ID=87993782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310328600.XA Active CN116776831B (en) 2023-03-27 2023-03-27 Customs, customs code determination, decision tree construction method and medium

Country Status (1)

Country Link
CN (1) CN116776831B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109639283A (en) * 2018-11-26 2019-04-16 南京理工大学 Workpiece coding method based on decision tree
CN110858219A (en) * 2018-08-17 2020-03-03 菜鸟智能物流控股有限公司 Logistics object information processing method and device and computer system
WO2020068421A1 (en) * 2018-09-28 2020-04-02 Dow Global Technologies Llc Hybrid machine learning model for code classification
CN111753928A (en) * 2020-07-29 2020-10-09 北京人人云图信息技术有限公司 Customs inspection rule generation method based on knowledge graph and tree model construction
CN112529420A (en) * 2020-12-14 2021-03-19 深圳市钛师傅云有限公司 Intelligent classification method and system for customs commodity codes
CN113779933A (en) * 2021-09-03 2021-12-10 深圳市朗华供应链服务有限公司 Commodity encoding method, electronic device and computer-readable storage medium
CN114548041A (en) * 2020-11-27 2022-05-27 华晨宝马汽车有限公司 Method, electronic device and medium for recommending HS codes for goods
WO2022266013A1 (en) * 2021-06-15 2022-12-22 Avalara, Inc. System for assisting searches for codes corresponding to items using decision trees

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030163447A1 (en) * 2002-02-28 2003-08-28 Monika Sandman Method and tool for assignment of item number by mapping of classification and generation of a decision tree
WO2019055385A1 (en) * 2017-09-12 2019-03-21 Walmart Apollo, Llc Systems and methods for automated harmonized (hs) code assignment
US11531447B1 (en) * 2021-06-15 2022-12-20 Avalara, Inc. System for assisting searches for codes corresponding to items using decision trees

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110858219A (en) * 2018-08-17 2020-03-03 菜鸟智能物流控股有限公司 Logistics object information processing method and device and computer system
WO2020068421A1 (en) * 2018-09-28 2020-04-02 Dow Global Technologies Llc Hybrid machine learning model for code classification
CA3114096A1 (en) * 2018-09-28 2020-04-02 Dow Global Technologies Llc Hybrid machine learning model for code classification
CN112997200A (en) * 2018-09-28 2021-06-18 陶氏环球技术有限责任公司 Hybrid machine learning model for code classification
CN109639283A (en) * 2018-11-26 2019-04-16 南京理工大学 Workpiece coding method based on decision tree
CN111753928A (en) * 2020-07-29 2020-10-09 北京人人云图信息技术有限公司 Customs inspection rule generation method based on knowledge graph and tree model construction
CN114548041A (en) * 2020-11-27 2022-05-27 华晨宝马汽车有限公司 Method, electronic device and medium for recommending HS codes for goods
CN112529420A (en) * 2020-12-14 2021-03-19 深圳市钛师傅云有限公司 Intelligent classification method and system for customs commodity codes
WO2022266013A1 (en) * 2021-06-15 2022-12-22 Avalara, Inc. System for assisting searches for codes corresponding to items using decision trees
CN113779933A (en) * 2021-09-03 2021-12-10 深圳市朗华供应链服务有限公司 Commodity encoding method, electronic device and computer-readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
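Research on a customs risk classification prediction model based on data mining; Zhou Xin (周欣) et al.; Journal of Customs and Trade (《海关与经贸研究》); Vol. 38 (No. 02); 22-31 *
Application of machine learning classification algorithms to matching the Chinese industrial enterprise database with the customs database; Yin Yuxing (尹宇星); China Master's Theses Full-text Database, Social Sciences II (《中国优秀硕士学位论文全文数据库 (社会科学Ⅱ辑)》); (No. 202208); H123-432 *
Research on and system implementation of intelligent classification algorithms for customs commodities; Wang Tao (王涛); China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》); (No. 202103); I140-55 *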

Also Published As

Publication number Publication date
CN116776831A (en) 2023-09-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 310052 room 508, 5th floor, building 4, No. 699 Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee after: Alibaba (China) Co.,Ltd.

Address before: Room 554, 5 / F, building 3, 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Patentee before: Alibaba (China) Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20240103

Address after: 310018 Room 608, floor 6, building 1, Baiyang Street Comprehensive Bonded Zone, Qiantang new area, Hangzhou, Zhejiang Province

Patentee after: Zhejiang wodewei Digital Technology Service Co.,Ltd.

Address before: 310052 room 508, 5th floor, building 4, No. 699 Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee before: Alibaba (China) Co.,Ltd.

REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40100538

Country of ref document: HK