WO2015027837A1 - Device and method for mailing address completion - Google Patents

Device and method for mailing address completion

Info

Publication number
WO2015027837A1
Authority
WO
WIPO (PCT)
Prior art keywords
address
text
completion
unit
sequence
Prior art date
Application number
PCT/CN2014/084610
Other languages
French (fr)
Chinese (zh)
Inventor
王国印
贾西贝
Original Assignee
深圳市华傲数据技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市华傲数据技术有限公司
Publication of WO2015027837A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/48 Message addressing, e.g. address format or anonymous messages, aliases

Definitions

  • The present invention relates to the field of mailing addresses, and in particular to a device and method for mailing address completion.
  • The e-commerce and logistics industries both depend on mailing addresses (also called correspondence addresses, or simply addresses) and postal codes, and this data must be supplied by the user. In practice, however, users often cannot remember the complete mailing address, or enter only an abbreviated part of it. To address this, some e-commerce websites and logistics companies provide drop-down menus at address entry. These fixed choices generally extend only to the prefecture-level city; the rest of the address must still be typed in manually by the user.
  • This prompting approach is cumbersome, its suggestions are incomplete, and it cannot accommodate free-form user input. A method is therefore needed for completing the address text a user enters, turning an arbitrarily entered address into a standardized mailing address, so that input is convenient and the result is accurate.
  • For addresses located by road, the common construction rule is: provincial-level administrative district + prefecture-level administrative district + county-level administrative district + road + house number + building name + room number. For example: Room 2208, International Student Venture Building, No. 29 Gaoxin South Ring Road, Nanshan District, Shenzhen, Guangdong Province.
  • The present invention is intended to overcome one of the above drawbacks. It provides a device and method for mailing address completion: the input address text is preprocessed, then segmented and labeled; a Query statement is generated for address parsing; and the most similar standard address is retrieved and used to complete the address. This yields an accurate, standardized result after completion, accommodates free-form user input, spares the user the tedious task of manually entering the complete mailing address, and improves the user experience.
  • An embodiment of the present invention provides a device for mailing address completion, the device comprising: an address text preprocessing unit, configured to:
  • preprocess the input address text, including deleting redundant spaces and converting full-width digits or letters to half-width characters;
  • an address segmentation and labeling unit, configured to segment the address text processed by the address text preprocessing unit into an address sequence, and to label the address sequence with the corresponding address category;
  • an address completion unit, configured to obtain the most similar standard address according to an address index file, and thereby complete the address text.
  • In one embodiment, the address segmentation and labeling unit pre-builds an address metadata database and obtains the address text processed by the address text preprocessing unit in order to segment it;
  • the address category with which the address sequence is labeled is the place-name level value corresponding to the place name.
  • Preferably, the device further labels the segmented address sequence with all of its possible address levels.
  • Preferably, the address completion unit includes an address parser.
  • In one embodiment, the address completion unit generates a Query statement from the labeled address text;
  • the address parser obtains and parses the Query statement, and retrieves the most similar standard address by searching the address index file.
  • Preferably, when generating the Query statement, the address completion unit encloses the address metadata in the address sequence in half-width quotation marks.
  • Another embodiment of the present invention provides a method for mailing address completion, the method comprising the following steps: preprocessing the input address text, including deleting redundant spaces and converting full-width digits or letters to half-width characters;
  • segmenting the address text to form an address sequence, and labeling the address sequence with the corresponding address category; and obtaining the most similar standard address according to the address index file, and thereby completing the address text.
  • In one embodiment, address completion includes address parsing: address completion generates a Query statement from the labeled address, and address parsing obtains and parses the Query statement, then retrieves the most similar standard address by searching the address index file.
  • By preprocessing the input address text, segmenting and labeling it, generating a Query statement for address parsing, and retrieving the most similar standard address to complete the address, the invention yields an accurate, standardized result after completion, accommodates free-form user input, spares the user the tedious task of manually entering the complete mailing address, and improves the user experience.
  • FIG. 1 is a schematic diagram of a device for mailing address completion according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a method for mailing address completion according to an embodiment of the present invention.
  • The invention provides a device and method for mailing address completion, which preprocesses the input address text, segments and labels it, generates a Query statement for address parsing, retrieves the most similar standard address, and performs address completion.
  • The device includes an address text preprocessing unit 10, an address segmentation and labeling unit 20, an address parser 30, and an address completion unit 40.
  • The address text preprocessing unit 10 obtains the input address text and preprocesses it; preprocessing includes deleting redundant spaces and converting digits or letters to half-width characters.
  • The address segmentation and labeling unit 20 obtains the address text processed by the address text preprocessing unit 10.
  • The address segmentation and labeling unit 20 pre-builds an address metadata database and segments the address text according to the address metadata; the result is an address sequence whose elements correspond to the address metadata.
  • The address segmentation and labeling unit 20 uses the place-name category definitions shown in Table 1:
  • Table 1: Place-name category definitions.
  • According to the definitions in Table 1, the address segmentation and labeling unit 20 labels the place-name sequence with the corresponding place-name categories; the categories it assigns are all the place-name levels the sequence could possibly have.
  • The address segmentation and labeling unit 20 encloses the address metadata in the address sequence in half-width quotation marks, generates a Query statement, and sends it to the address completion unit 40.
  • The address completion unit 40 forwards the Query statement received from the address segmentation and labeling unit 20 to the address parser 30. The address parser 30 receives and parses the Query statement; it pre-builds an address index file and searches that file with the parsed place-name sequence to obtain the most similar standard address, which it sends to the address completion unit 40. The address completion unit 40 then completes the address text using the standard address sent by the address parser 30.
  • FIG. 2 is a schematic flowchart of the method, whose steps are as follows:
  • Step S110: Preprocess the input address text, including deleting redundant spaces and converting full-width digits or letters to half-width characters.
  • Step S120: Segment the address text to form an address sequence, and label the address sequence with the corresponding address category.
  • Step S120 takes the address text processed in step S110 and segments it according to the pre-built address metadata database; the result is an address sequence corresponding to the address metadata.
  • In one embodiment, address labeling labels the address sequence according to the place-name categories defined in Table 1; the result is the place-name category of each element of the address sequence, covering all place-name levels it could possibly have.
  • Step S130: Obtain the most similar standard address according to the address index file, and thereby complete the address text.
  • In one embodiment, address completion includes an address parsing step. Address completion generates a Query statement from the labeled address and sends it to the address parsing step, which parses the Query statement, retrieves the most similar standard address by searching the address index file, and returns that standard address to the address completion step. The address completion step then completes the address text according to the standard address.

Abstract

The present invention provides a device for mailing address completion, said device comprising: an address text preprocessing unit, an address segmentation and labeling unit; and an address completion unit. The address completion unit includes an address parser. The present invention further provides a method for mailing address completion, said method comprising: preprocessing an inputted address text, comprising deleting redundant spaces and converting full-width character numbers or letters into half-width characters; performing address segmentation on the address text to form an address sequence, and labeling the address sequence with the corresponding address class; obtaining, according to an address index file, the most similar standardized address, and thus completing the address text. The present invention achieves precise standardized address completion results, allowing for arbitrary input by a user and eliminating the tedious process of manually inputting complete mailing addresses, improving user experience.

Description

Device and method for mailing address completion
Technical Field
The present invention relates to the field of mailing addresses, and in particular to a device and method for mailing address completion.
Background
With the rapid growth of e-commerce and the informatization of the logistics industry, people can shop and mail items without leaving home, which greatly saves time and money. The e-commerce and logistics industries both depend on mailing addresses (also called correspondence addresses, or simply addresses) and postal codes, and this data must be supplied by the user. In practice, however, users often cannot remember the complete mailing address, or enter only an abbreviated part of it. To address this, some e-commerce websites and logistics companies provide drop-down menus at address entry. These fixed choices generally extend only to the prefecture-level city; the rest of the address must still be typed in manually by the user.
This prompting approach is cumbersome, its suggestions are incomplete, and it cannot accommodate free-form user input. A method is therefore needed for completing the address text a user enters, turning an arbitrarily entered address into a standardized mailing address, so that input is convenient and the result is accurate.
Two address patterns are in common use today. The first locates an address by road. Its usual construction rule is: provincial-level administrative district + prefecture-level administrative district + county-level administrative district + road + house number + building name + room number. Example: 广东省深圳市南山区高新南环路29号留学生创业大厦2208室 (Room 2208, International Student Venture Building, No. 29 Gaoxin South Ring Road, Nanshan District, Shenzhen, Guangdong Province). This pattern is common in electronic maps such as Baidu Maps and Google Maps.
The second locates an address by administrative division. Its usual construction rule is: provincial-level administrative district + prefecture-level administrative district + county-level administrative district + township/town/subdistrict + residents' (villagers') committee + residential community/natural village. Example: 广东省深圳市宝安区西乡街道流塘居委会宝民花园 (Baomin Garden, Liutang Residents' Committee, Xixiang Subdistrict, Bao'an District, Shenzhen, Guangdong Province). This pattern is common in government departments such as civil affairs bureaus.
Summary of the Invention
The present invention is intended to overcome one of the above drawbacks. It provides a device and method for mailing address completion: the input address text is preprocessed, then segmented and labeled; a Query statement is generated for address parsing; and the most similar standard address is retrieved and used to complete the address. This yields an accurate, standardized result after completion, accommodates free-form user input, spares the user the tedious task of manually entering the complete mailing address, and improves the user experience.
Accordingly, an embodiment of the present invention provides a device for mailing address completion, the device comprising: an address text preprocessing unit, configured to:
preprocess the input address text, including deleting redundant spaces and converting full-width digits or letters to half-width characters;
an address segmentation and labeling unit, configured to:
segment the address text processed by the address text preprocessing unit into an address sequence, and label the address sequence with the corresponding address category;
an address completion unit, configured to:
obtain the most similar standard address according to an address index file, and thereby complete the address text.
In one embodiment of the present invention, the address segmentation and labeling unit pre-builds an address metadata database and obtains the address text processed by the address text preprocessing unit in order to segment it; the address category with which the address sequence is labeled is the place-name level value corresponding to the place name.
Preferably, the device further labels the segmented address sequence with all of its possible address levels. Preferably, the address completion unit includes an address parser.
In one embodiment of the present invention, the address completion unit generates a Query statement from the labeled address text;
the address parser obtains and parses the Query statement, and retrieves the most similar standard address by searching the address index file.
Preferably, when generating the Query statement, the address completion unit encloses the address metadata in the address sequence in half-width quotation marks.
Another embodiment of the present invention provides a method for mailing address completion, the method comprising the following steps: preprocessing the input address text, including deleting redundant spaces and converting full-width digits or letters to half-width characters;
segmenting the address text to form an address sequence, and labeling the address sequence with the corresponding address category; and obtaining the most similar standard address according to the address index file, and thereby completing the address text.
In one embodiment of the present invention, address completion includes address parsing: address completion generates a Query statement from the labeled address, and address parsing obtains and parses the Query statement, then retrieves the most similar standard address by searching the address index file. By preprocessing the input address text, segmenting and labeling it, generating a Query statement for address parsing, and retrieving the most similar standard address to complete the address, the invention yields an accurate, standardized result after completion, accommodates free-form user input, spares the user the tedious task of manually entering the complete mailing address, and improves the user experience.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a device for mailing address completion according to an embodiment of the present invention.
FIG. 2 is a schematic flowchart of a method for mailing address completion according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
The present invention provides a device and method for mailing address completion: the input address text is preprocessed, then segmented and labeled; a Query statement is generated for address parsing; and the most similar standard address is retrieved and used to complete the address. This yields an accurate, standardized result after completion, accommodates free-form user input, spares the user the tedious task of manually entering the complete mailing address, and improves the user experience.
FIG. 1 is a schematic diagram of a device for mailing address completion according to an embodiment of the present invention. The device includes an address text preprocessing unit 10, an address segmentation and labeling unit 20, an address parser 30, and an address completion unit 40. In one embodiment, the address text preprocessing unit 10 obtains the input address text and preprocesses it; preprocessing includes deleting redundant spaces and converting digits or letters to half-width characters.
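The preprocessing performed by unit 10 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `preprocess_address` is hypothetical, and "deleting redundant spaces" is assumed here to mean trimming the ends and collapsing each run of whitespace to a single half-width space.

```python
import re

# Full-width digits and Latin letters sit at a fixed offset from their ASCII forms.
FULLWIDTH_OFFSET = 0xFEE0

def preprocess_address(text: str) -> str:
    """Convert full-width digits/letters to half-width and remove redundant spaces."""
    converted = []
    for ch in text:
        code = ord(ch)
        is_fullwidth_digit = 0xFF10 <= code <= 0xFF19
        is_fullwidth_upper = 0xFF21 <= code <= 0xFF3A
        is_fullwidth_lower = 0xFF41 <= code <= 0xFF5A
        if is_fullwidth_digit or is_fullwidth_upper or is_fullwidth_lower:
            converted.append(chr(code - FULLWIDTH_OFFSET))
        elif ch == "\u3000":  # ideographic (full-width) space
            converted.append(" ")
        else:
            converted.append(ch)
    # Assumption: "redundant" spaces are leading/trailing ones and repeats.
    return re.sub(r"\s+", " ", "".join(converted)).strip()
```

Only digits and letters are converted, as the text specifies; full-width punctuation is deliberately left untouched in this sketch.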
In one embodiment of the present invention, the address segmentation and labeling unit 20 obtains the address text processed by the address text preprocessing unit 10. The address segmentation and labeling unit 20 pre-builds an address metadata database and segments the address text according to the address metadata; the result is an address sequence whose elements correspond to the address metadata.
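The text does not say which segmentation algorithm unit 20 uses. One plausible reading, sketched below in Python, is dictionary-based forward maximum matching against the address metadata; this choice of algorithm is an assumption, and the name `segment_address` and the sample metadata set are hypothetical. Characters that match no metadata entry are grouped into residual tokens.

```python
def segment_address(text, address_metadata):
    """Split text into an address sequence by forward maximum matching.

    address_metadata: a set of known address elements (hypothetical stand-in
    for the address metadata database).
    """
    max_len = max((len(e) for e in address_metadata), default=0)
    tokens = []
    unmatched = ""  # characters not covered by any metadata entry
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in address_metadata:
                match = candidate
                break
        if match is None:
            unmatched += text[i]
            i += 1
            continue
        if unmatched:
            tokens.append(unmatched)
            unmatched = ""
        tokens.append(match)
        i += len(match)
    if unmatched:
        tokens.append(unmatched)
    return tokens
```

For example, with the metadata set `{"广东省", "深圳市", "南山区", "高新南环路"}`, the input `深圳市南山区高新南环路29号` is split into `["深圳市", "南山区", "高新南环路", "29号"]`.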
In one embodiment of the present invention, the address segmentation and labeling unit 20 uses the place-name category definitions shown in Table 1 below:
[Figure imgf000007_0001]
Table 1: Place-name category definitions.
According to the definitions in Table 1, the address segmentation and labeling unit 20 labels the place-name sequence with the corresponding place-name categories; the categories it assigns are all the place-name levels the sequence could possibly have. The address segmentation and labeling unit 20 encloses the address metadata in the address sequence in half-width quotation marks, generates a Query statement, and sends it to the address completion unit 40. The address completion unit 40 forwards the Query statement received from the address segmentation and labeling unit 20 to the address parser 30. The address parser 30 receives and parses the Query statement; it pre-builds an address index file and searches that file with the parsed place-name sequence to obtain the most similar standard address, which it sends to the address completion unit 40. The address completion unit 40 then completes the address text using the standard address sent by the address parser 30.
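The Query statement syntax is not specified beyond the use of half-width quotation marks. The Python sketch below assumes a hypothetical format in which each address element is wrapped in half-width double quotes and elements are separated by single spaces; `build_query` and `parse_query` are illustrative names, not part of the patent.

```python
import re

def build_query(elements):
    """Wrap each address element in half-width double quotes (assumed format)."""
    # Drop embedded quotes so the statement stays unambiguous to parse.
    cleaned = [e.replace('"', "") for e in elements]
    return " ".join(f'"{e}"' for e in cleaned if e)

def parse_query(query):
    """Recover the quoted address elements, in order, from a Query statement."""
    return re.findall(r'"([^"]+)"', query)
```

Quoting each element keeps multi-character place names intact when the statement is parsed, so the parser receives the same sequence the segmentation step produced.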
Another embodiment of the present invention provides a method for mailing address completion. FIG. 2 is a schematic flowchart of the method, whose steps are as follows:
Step S110: Preprocess the input address text, including deleting redundant spaces and converting full-width digits or letters to half-width characters.
Step S120: Segment the address text to form an address sequence, and label the address sequence with the corresponding address category. Step S120 takes the address text processed in step S110 and segments it according to the pre-built address metadata database; the result is an address sequence corresponding to the address metadata.
In one embodiment of the present invention, address labeling labels the address sequence according to the place-name categories defined in Table 1 above; the result is the place-name category of each element of the address sequence, covering all place-name levels it could possibly have.
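Labeling with all possible levels matters when one name exists at several administrative levels. A minimal Python sketch follows; the level names and the lookup table are hypothetical, because the contents of Table 1 are not reproduced in this text.

```python
# Hypothetical level names and entries; the real categories come from Table 1.
ELEMENT_LEVELS = {
    "吉林": {"province", "prefecture"},  # short name shared by 吉林省 and 吉林市
    "长春": {"prefecture"},
    "朝阳区": {"county"},
}

def label_sequence(tokens, element_levels=ELEMENT_LEVELS):
    """Attach every candidate level to each token.

    Tokens missing from the table (house numbers, for instance) get an
    empty list instead of a guessed level.
    """
    return [(tok, sorted(element_levels.get(tok, ()))) for tok in tokens]
```

Here `吉林` keeps both candidate levels. The sketch does not resolve the ambiguity; under this reading, that is left to the later retrieval step.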
Step S130: Obtain the most similar standard address according to the address index file, and thereby complete the address text.
In one embodiment of the present invention, address completion includes an address parsing step. Address completion generates a Query statement from the labeled address and sends it to the address parsing step, which parses the Query statement, retrieves the most similar standard address by searching the address index file, and returns that standard address to the address completion step. The address completion step then completes the address text according to the standard address. By preprocessing the input address text, segmenting and labeling it, generating a Query statement for address parsing, and retrieving the most similar standard address to complete the address, the invention yields an accurate, standardized result after completion, accommodates free-form user input, spares the user the tedious task of manually entering the complete mailing address, and improves the user experience.
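The similarity measure and the layout of the address index file are not described. As a rough Python illustration only, the sketch below stands in for the index with an in-memory list of standard addresses and scores each one by how many parsed elements it contains. Both the scoring rule and the name `most_similar_address` are assumptions.

```python
def most_similar_address(elements, standard_addresses):
    """Return the standard address containing the most query elements.

    elements: address elements parsed from the Query statement.
    standard_addresses: list of standard address strings (hypothetical
    stand-in for the address index file).
    Returns None when no standard address shares any element.
    """
    def score(address):
        return sum(1 for e in elements if e and e in address)

    best = max(standard_addresses, key=score, default=None)
    if best is None or score(best) == 0:
        return None
    return best
```

With the two example addresses from the Background as the standard set, the elements `["深圳", "南山", "留学生创业大厦"]` select the Nanshan District address, which then serves as the completed form of the user's partial input. Ties are resolved in favor of the earlier list entry, since `max` returns the first maximal item.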
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make simple deductions or substitutions without departing from the inventive concept.

Claims

1. 一种通信地址补全的装置, 其特征在于, 该装置包括: A device for complementing a communication address, the device comprising:
地址文本预处理单元, 用于: Address text preprocessing unit for:
将输入的地址文本进行预处理, 包括删除多余的空格、 将数字或字母的全角字符转 换为半角字符; Pre-processing the input address text, including deleting extra spaces, converting full-width characters of numbers or letters to half-width characters;
地址切分与标注单元, 用于: Address segmentation and labeling unit for:
将经过地址文本预处理单元处理后的地址文本切分成地址序列, 并将地址序列标注 上对应的地址类别; The address text processed by the address text preprocessing unit is divided into address sequences, and the address sequence is marked with the corresponding address category;
地址补全单元, 用于: Address completion unit, used to:
根据地址索引文件, 获得最相似的标准地址, 进而将地址文本进行补全。 According to the address index file, the most similar standard address is obtained, and the address text is complemented.
2. 根据权利要求 1所述的装置, 其特征在于, 所述装置包括:  2. The device according to claim 1, wherein the device comprises:
所述地址切分与标注单元预先建立地址元数据库, 获取地址文本预处理单元处理后 的地址文本进行地址切分; The address segmentation and labeling unit pre-establishes an address metabase, and obtains an address text processed by the address text preprocessing unit for address segmentation;
所述地址序列标注上对应的地址类别为地名所对应的地名等级值。 The address sequence is marked with a corresponding address category as a place name level value corresponding to the place name.
3. 根据权利要求 1或 2所述的装置, 其特征在于, 所述装置还包括:  The device according to claim 1 or 2, wherein the device further comprises:
将切分好的地址序列标注上其所有可能的地址等级。 Label the segmented address sequence with all possible address levels.
4. 根据权利要求 1所述的装置, 其特征在于, 所述地址补全单元包括一个地址解析 器。  4. The apparatus according to claim 1, wherein the address completion unit comprises an address resolver.
5. 根据权利要求 1或 4所述的装置, 其特征在于, 所述地址补全单元包括: 所述地址补全单元将标注好的地址文本生成 Query语句;  The device according to claim 1 or 4, wherein the address completion unit comprises: the address completion unit generating a Query statement by using the marked address text;
所述地址解析器获得 Query语句并进行解析, 根据地址索引文件检索获得最相似的 标准地址。 The address resolver obtains and parses the Query statement and retrieves the most similar standard address based on the address index file search.
6. 根据权利要求 4或 5所述的装置, 其特征在于, 所述地址补全单元还包括: 所述地址补全单元生成 Query语句以半角引号把地址序列中的地址元数据括起来。 The device according to claim 4 or 5, wherein the address completion unit further comprises: the address completion unit generating a Query statement enclosing the address metadata in the address sequence with a half-quotation mark.
7. A method for completing a communication address, comprising the following steps:
preprocessing the input address text, including deleting redundant spaces and converting full-width digits or letters to half-width characters;
segmenting the address text to form an address sequence, and labeling the address sequence with the corresponding address categories;
obtaining the most similar standard address according to the address index file, and thereby completing the address text.
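The preprocessing step of claim 7 can be illustrated with the sketch below. It relies on the fact that full-width digits and Latin letters (U+FF10 to U+FF19, U+FF21 to U+FF3A, U+FF41 to U+FF5A) sit exactly 0xFEE0 above their ASCII counterparts. Treating "redundant spaces" as runs of whitespace that are collapsed to one space and trimmed at both ends is an assumption, as are the function names.

```python
import re

# Code point ranges for full-width digits, upper-case and lower-case letters.
_FULLWIDTH_RANGES = ((0xFF10, 0xFF19), (0xFF21, 0xFF3A), (0xFF41, 0xFF5A))
_OFFSET = 0xFEE0  # distance between a full-width character and its ASCII form


def to_halfwidth(text):
    """Convert full-width digits and letters to half-width; leave the rest."""
    out = []
    for ch in text:
        code = ord(ch)
        if any(lo <= code <= hi for lo, hi in _FULLWIDTH_RANGES):
            out.append(chr(code - _OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def preprocess(text):
    """Collapse whitespace runs to one space, trim the ends, then convert width.

    In Python 3 str patterns, \\s also matches the ideographic space U+3000.
    """
    text = re.sub(r"\s+", " ", text).strip()
    return to_halfwidth(text)
```

Only digits and letters are converted, matching the wording of the claim; full-width punctuation is left untouched in this sketch.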
8. The method according to claim 7, wherein the address completion includes address resolution;
the address completion generates a Query statement from the labeled address;
the address resolution obtains the Query statement, parses it, and then searches the address index file to obtain the most similar standard address.
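The parse-and-retrieve step of claim 8 might look like the sketch below. The claims specify neither the format of the address index file nor the similarity measure, so this example assumes an in-memory inverted index and scores each standard address by the number of query terms it contains. The sample addresses and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical standard address library.
STANDARD_ADDRESSES = [
    "广东省深圳市南山区科技园",
    "广东省深圳市福田区华强北",
    "广东省广州市天河区科技园",
]


def build_index(addresses, elements):
    """Inverted index: address element -> ids of standard addresses containing it."""
    index = defaultdict(set)
    for addr_id, address in enumerate(addresses):
        for element in elements:
            if element in address:
                index[element].add(addr_id)
    return index


def parse_query(query):
    """Extract the terms enclosed in ASCII double quotes."""
    return re.findall(r'"([^"]+)"', query)


def most_similar(query, addresses, index):
    """Return the standard address sharing the most terms with the query, or None."""
    scores = defaultdict(int)
    for term in parse_query(query):
        for addr_id in index.get(term, ()):
            scores[addr_id] += 1
    if not scores:
        return None
    # Highest score wins; ties go to the address listed first.
    best = max(scores, key=lambda i: (scores[i], -i))
    return addresses[best]
```

With this library, a query containing only the city and the area name is completed to the full standard address that includes the missing province and district.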
PCT/CN2014/084610 2013-08-30 2014-08-18 Device and method for mailing address completion WO2015027837A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310386689.1 2013-08-30
CN2013103866891A CN103473289A (en) 2013-08-30 2013-08-30 Device and method for completing communication addresses

Publications (1)

Publication Number Publication Date
WO2015027837A1 true WO2015027837A1 (en) 2015-03-05

Family

ID=49798137

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/084610 WO2015027837A1 (en) 2013-08-30 2014-08-18 Device and method for mailing address completion

Country Status (2)

Country Link
CN (1) CN103473289A (en)
WO (1) WO2015027837A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145095A (en) * 2017-06-16 2019-01-04 贵州小爱机器人科技有限公司 Information of place names matching process, information matching method, device and computer equipment
US10373103B2 (en) 2015-11-11 2019-08-06 International Business Machines Corporation Decision-tree based address-station matching
CN111522901A (en) * 2020-03-18 2020-08-11 大箴(杭州)科技有限公司 Method and device for processing address information in text

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473289A (en) * 2013-08-30 2013-12-25 深圳市华傲数据技术有限公司 Device and method for completing communication addresses
CN105988988A (en) * 2015-02-13 2016-10-05 阿里巴巴集团控股有限公司 Method and device for processing text address
CN106033460A (en) * 2015-03-19 2016-10-19 阿里巴巴集团控股有限公司 Address data processing method and apparatus
CN106156145A (en) * 2015-04-13 2016-11-23 阿里巴巴集团控股有限公司 The management method of a kind of address date and device
CN105468791B (en) * 2016-01-05 2019-11-15 北京信息科技大学 A kind of integrality expression for the geographical location entity known based on interacting Question-Answer community-Baidu
CN107025232A (en) * 2016-01-29 2017-08-08 阿里巴巴集团控股有限公司 The processing method and processing device of address information in logistics system
CN106777300A (en) * 2016-12-30 2017-05-31 深圳市华傲数据技术有限公司 Base address base construction method and system
CN106709065B (en) * 2017-01-19 2020-08-04 国家电网公司 Address information standardization processing method and device
CN107609406A (en) * 2017-08-09 2018-01-19 南京邮电大学 A kind of express delivery address encryption method based on geocoding
CN110826318A (en) * 2019-10-14 2020-02-21 浙江数链科技有限公司 Method, device, computer device and storage medium for logistics information identification
CN113569564B (en) * 2021-07-30 2024-03-19 拉扎斯网络科技(上海)有限公司 Address information processing and displaying method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101276327A (en) * 2007-03-27 2008-10-01 富士通株式会社 Address recognition device
CN102073724A (en) * 2011-01-11 2011-05-25 深圳市络道科技有限公司 System and method for automatically identifying Chinese address subscribers
CN102298585A (en) * 2010-06-24 2011-12-28 高德软件有限公司 Address splitting and level marking method and device
CN102750351A (en) * 2012-06-11 2012-10-24 迪尔码国际营销服务(北京)有限公司 Matching method of address information based on rules
CN103473289A (en) * 2013-08-30 2013-12-25 深圳市华傲数据技术有限公司 Device and method for completing communication addresses

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7930430B2 (en) * 2009-07-08 2011-04-19 Xobni Corporation Systems and methods to provide assistance during address input
CN102955833B (en) * 2011-08-31 2015-11-25 深圳市华傲数据技术有限公司 A kind of address identification, standardized method
CN103440312B (en) * 2013-08-27 2019-01-22 深圳市华傲数据技术有限公司 A kind of system and terminal of mailing address inquiry postcode

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10373103B2 (en) 2015-11-11 2019-08-06 International Business Machines Corporation Decision-tree based address-station matching
CN109145095A (en) * 2017-06-16 2019-01-04 贵州小爱机器人科技有限公司 Information of place names matching process, information matching method, device and computer equipment
CN109145095B (en) * 2017-06-16 2024-03-29 贵州小爱机器人科技有限公司 Place name information matching method, information matching device and computer equipment
CN111522901A (en) * 2020-03-18 2020-08-11 大箴(杭州)科技有限公司 Method and device for processing address information in text
CN111522901B (en) * 2020-03-18 2023-10-20 大箴(杭州)科技有限公司 Method and device for processing address information in text

Also Published As

Publication number Publication date
CN103473289A (en) 2013-12-25

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 14841063; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 14841063; Country of ref document: EP; Kind code of ref document: A1)