CN109947893A - Address Recognition method and device - Google Patents

Publication number
CN109947893A
Authority
CN
China
Prior art keywords
address
character
factor
unrecognizable
chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711311003.7A
Other languages
Chinese (zh)
Inventor
孙科武
林文辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aisino Corp
Original Assignee
Aisino Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aisino Corp
Priority to CN201711311003.7A
Publication of CN109947893A
Legal status: Pending

Abstract

The embodiments of the present application provide an address recognition method and device. The method includes: segmenting the address so that the address can be recognized against a hash table, and determining the unrecognizable address elements in the address; and, according to an address element similarity model, performing a similarity judgment on the unrecognizable address elements in the Chinese address so as to recognize them. In this way, unrecognizable address elements in a Chinese address can be recognized.

Description

Address recognition method and device
Technical field
The present invention relates to the technical field of data processing, and in particular to an address recognition method and device.
Background
With the rapid advance of urbanization and the rapid development of the mobile Internet of Things, large amounts of address and location information are needed. Road planning changes as urban planning changes, so how to recognize an address quickly and match it to an administrative region is one of the problems to be solved.
Because Chinese address information is mostly written according to personal habit, it is described in individual ways and no unified standard has formed. Existing address recognition methods generally perform simple word segmentation and apply a forward or reverse maximum matching strategy to the address. However, such methods cannot handle the effect of adjacent districts, especially adjacent streets and townships, on an address. That is, address elements that cannot be recognized often appear during Chinese address recognition. For example, when different addresses on one street may belong to two different administrative divisions, a Chinese address containing that street cannot be recognized successfully.
Summary of the invention
In view of this, one of the technical problems solved by the embodiments of the present invention is to provide an address recognition method and device that overcome the above technical defects in the prior art.
The embodiments of the present application provide an address recognition method, which includes:
segmenting the address so that the address can be recognized against a hash table, and determining the unrecognizable address elements in the address;
according to an address element similarity model, performing a similarity judgment on the unrecognizable address elements in the Chinese address so as to recognize them.
Optionally, in any embodiment of the present application, segmenting the address so that the address can be recognized against a hash table, and determining the unrecognizable address elements in the address, includes:
splitting the address into characters, and performing word segmentation on the address according to the hash table and the characters obtained by the split;
recognizing the address according to the result of the word segmentation, and determining the unrecognizable address elements in the address.
Optionally, in any embodiment of the present application, splitting the address into characters, and performing word segmentation on the address according to the hash table and the characters obtained by the split, includes:
splitting the address into characters and, according to the hash table and the characters obtained by the split, determining the character state of each character and its node relationship in the hash table;
performing word segmentation on the address according to the character states and the node relationships in the hash table.
Optionally, in any embodiment of the present application, determining the character state of the characters obtained by the split and their node relationships in the hash table includes: splitting the address into characters and, according to the hash table and the current character and next character obtained by the split, determining the character state of the next character and the node relationship between the current character and the next character in the hash table.
Optionally, in any embodiment of the present application, determining the unrecognizable address elements in the address includes: determining the unrecognizable address elements in the address according to the length of the character string of the corresponding address element in the address and the index of the character string in the hash table.
Optionally, in any embodiment of the present application, before performing the similarity judgment on the unrecognizable address elements in the Chinese address according to the address element similarity model, the method further includes: establishing the address element similarity model from unrecognizable historical address elements.
Optionally, in any embodiment of the present application, establishing the address element similarity model from unrecognizable historical address elements includes: establishing the model according to the distribution probability of address elements over historical addresses and the distribution probability of historical addresses over administrative regions.
Optionally, in any embodiment of the present application, each historical address is abstracted as a document, and each word in the document corresponds to one address element; each administrative region is abstracted as a topic.
Accordingly, establishing the address element similarity model according to the distribution probability of address elements over historical addresses and the distribution probability of historical addresses over administrative regions includes:
determining the conditional probability of the topics according to the distribution probability of address elements over historical addresses;
determining the conditional probability of the words in the documents according to the distribution probability of historical addresses over administrative regions;
establishing the address element similarity model from the conditional probability of the topics and the conditional probability of the words in the documents.
Optionally, in any embodiment of the present application, performing the similarity judgment on the unrecognizable address elements in the Chinese address according to the address element similarity model, so as to recognize them, includes:
performing the similarity judgment on the unrecognizable address elements in the Chinese address according to the address element similarity model, to obtain the probability that each unrecognizable address element belongs to each of the different administrative regions;
recognizing the unrecognizable address elements in the Chinese address according to those probabilities.
The embodiments of the present application further provide an address recognition device, which includes:
a first unit, configured to segment the address so that the address can be recognized against a hash table, and to determine the unrecognizable address elements in the address;
a second unit, configured to perform, according to an address element similarity model, a similarity judgment on the unrecognizable address elements in the Chinese address so as to recognize them.
With the address recognition method and device provided by the embodiments of the present application, the address is segmented so that it can be recognized against a hash table, and the unrecognizable address elements in the address are determined; then, according to the address element similarity model, a similarity judgment is performed on the unrecognizable address elements in the Chinese address so as to recognize them. In this way, unrecognizable address elements in a Chinese address can be recognized.
Brief description of the drawings
Some specific embodiments of the present application are described in detail below, by way of example and not limitation, with reference to the accompanying drawings. The same reference numerals denote the same or similar components or parts in the drawings. Those skilled in the art should understand that the drawings are not necessarily drawn to scale. In the drawings:
Fig. 1 is a schematic flowchart of the address recognition method in Embodiment 1 of the present application;
Fig. 2 is a schematic structural diagram of the address recognition device in Embodiment 2 of the present application.
Detailed description of the embodiments
Implementing any technical solution of the embodiments of the present invention does not necessarily require achieving all of the above advantages at the same time.
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, the technical solutions are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention shall fall within the scope of protection of the embodiments of the present invention.
The following embodiments take the recognition of Chinese addresses as an example. A Chinese address consists of four components:
(1) Administrative division: the administrative regions at township level and above, ordered from largest to smallest. According to the "Codes for the Administrative Divisions of the People's Republic of China" (GB2260-1995), administrative divisions have four levels. The first level is provinces, autonomous regions, municipalities directly under the central government, and special administrative regions. The second level is cities, prefectures, autonomous prefectures, leagues, and the municipal districts and counties under the municipalities directly under the central government. The third level is counties, municipal districts, county-level cities, and banners. The fourth level is townships, towns, and villages.
(2) Street: mainly road names, street names, and the like.
(3) House number: mainly the number, building number, building name, room number, and the like.
(4) Supplementary information: organization names or words expressing spatial relationships that are appended after the house number.
The following embodiments of the present invention recognize Chinese addresses on the basis of rules similar to the above.
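The four-part structure above can be pictured as a small record type. The following is only an illustrative sketch: the class name `ChineseAddress`, its field names, and the sample address are assumptions made for this example and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChineseAddress:
    """Hypothetical container for the four address components."""
    # (1) administrative division, ordered from largest to smallest region
    administrative_division: List[str]
    # (2) road or street name
    street: Optional[str] = None
    # (3) house number, building number, room number, ...
    house_number: Optional[str] = None
    # (4) organization name or spatial-relationship words
    supplement: Optional[str] = None

    def full_text(self) -> str:
        """Join the components back into one address string."""
        parts = [*self.administrative_division, self.street,
                 self.house_number, self.supplement]
        return "".join(p for p in parts if p)


example = ChineseAddress(["北京市", "海淀区"], "中关村大街", "27号", "附近")
print(example.full_text())
```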
With the address recognition method and device provided by the embodiments of the present application, the address is segmented so that it can be recognized against a hash table, and the unrecognizable address elements in the address are determined; then, according to the address element similarity model, a similarity judgment is performed on the unrecognizable address elements in the Chinese address so as to recognize them. In this way, unrecognizable address elements in a Chinese address can be recognized.
The specific implementation of the embodiments of the present invention is further explained below with reference to the drawings.
Fig. 1 is a schematic flowchart of the address recognition method in Embodiment 1 of the present application. As shown in Fig. 1, the method includes:
S101: segmenting the Chinese address so that the Chinese address can be recognized against a hash table, and determining the unrecognizable address elements in the Chinese address.
In this embodiment, step S101 may specifically include the following.
First, the Chinese address is split into characters, and word segmentation is performed on the Chinese address according to the hash table and the characters obtained by the split.
Second, the Chinese address is recognized according to the result of the word segmentation, and the unrecognizable address elements in the Chinese address are determined.
Optionally, in this embodiment, splitting the Chinese address into characters in step S101 and performing word segmentation on the Chinese address according to the hash table and the characters obtained by the split may specifically include the following.
First, the Chinese address is split into characters and, according to the hash table and the characters obtained by the split, the character state of each character and its node relationship in the hash table are determined.
Second, word segmentation is performed on the Chinese address according to the character states and the node relationships in the hash table.
Optionally, in step S101, determining the character state of the characters obtained by the split and their node relationships in the hash table may specifically include: splitting the Chinese address into characters and, according to the hash table and the current character and next character obtained by the split, determining the character state of the next character and the node relationship between the current character and the next character in the hash table.
In a specific application scenario, step S101 is implemented with a forward maximum matching mechanism. The detailed process is as follows:
(1) Read the i-th character $C_i$ of the Chinese address.
(2) Look up the current character $C_i$ in the hash table that records the address elements and the relationships between them, and take the matching entry as the current node.
(3) Read the (i+1)-th character $C_{i+1}$ of the Chinese address and search for $C_{i+1}$ among the child nodes (next-level nodes) of the current node for $C_i$ in the hash table. If it is not found, segmentation ends; go to (5). Otherwise, take that child node as the new current node and read its character state; if the character state is not the final state, jump to (2); otherwise, go to (4).
(4) Extract the word and go to (1).
(5) Judge the character state of the current character $C_i$: if it is the extension state, add 1 to the index (that is, search the child nodes of the current node) and go to (1) to re-read the (i+1)-th character $C_{i+1}$ of the Chinese address.
The procedure (1)-(5) is one specific implementation of forward maximum matching; it splits the Chinese address into words that correspond to the longest address elements. If, after segmentation, the index into the address string equals the length of the address string, the Chinese address has been fully recognized. The process can then end directly, or step S102 below can still be executed to compute the address similarity.
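The matching procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: the hash table is modelled as nested Python dictionaries, the final state is marked with a hypothetical `END` key, the extension state is simply "node without `END`", and the sample vocabulary is invented. Characters that match no known element are collected as unrecognizable.

```python
END = "$end"  # hypothetical marker for the "final state" of a node


def build_table(elements):
    """Build a nested hash table (dict of dicts) from known address elements."""
    root = {}
    for element in elements:
        node = root
        for ch in element:
            node = node.setdefault(ch, {})  # child node for the next character
        node[END] = True                    # a complete address element ends here
    return root


def segment(address, table):
    """Forward maximum matching; returns (recognized, unrecognized) pieces."""
    recognized, unrecognized = [], []
    pending = ""  # run of characters that matched no element
    i = 0
    while i < len(address):
        node, j, longest = table, i, 0
        # Walk child nodes while the next character is found in the table.
        while j < len(address) and address[j] in node:
            node = node[address[j]]
            j += 1
            if END in node:          # final state: remember the longest match
                longest = j - i
        if longest:
            if pending:
                unrecognized.append(pending)
                pending = ""
            recognized.append(address[i:i + longest])
            i += longest
        else:
            pending += address[i]    # no element starts here
            i += 1
    if pending:
        unrecognized.append(pending)
    return recognized, unrecognized


table = build_table(["北京市", "北京", "海淀区", "中关村大街"])
print(segment("北京市海淀区某某路5号", table))
```

An empty `unrecognized` list plays the role of the check "index equals the length of the address string": every character was consumed by a known element.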
S102: according to the address element similarity model, performing a similarity judgment on the unrecognizable address elements in the Chinese address so as to recognize them.
Optionally, in this embodiment, determining the unrecognizable address elements in the Chinese address includes: determining them according to the length of the character string of the corresponding address element in the Chinese address and the index of the character string in the hash table.
Optionally, in this embodiment, before performing the similarity judgment on the unrecognizable address elements in the Chinese address according to the address element similarity model, the method further includes: establishing the address element similarity model from unrecognizable historical address elements.
Optionally, in this embodiment, establishing the address element similarity model from unrecognizable historical address elements includes: establishing the model according to the distribution probability of address elements over historical addresses and the distribution probability of historical addresses over administrative regions.
Optionally, in this embodiment, each historical address is abstracted as a document, and each word in the document corresponds to one address element; each administrative region is abstracted as a topic.
Accordingly, establishing the address element similarity model according to the distribution probability of address elements over historical addresses and the distribution probability of historical addresses over administrative regions includes:
determining the conditional probability of the topics according to the distribution probability of address elements over historical addresses;
determining the conditional probability of the words in the documents according to the distribution probability of historical addresses over administrative regions;
establishing the address element similarity model from the conditional probability of the topics and the conditional probability of the words in the documents.
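The text names these conditional probabilities without showing them. As a hedged reference only, and assuming the model is standard latent Dirichlet allocation with Dirichlet priors, the textbook forms are written out below. The symbols $\vec{\alpha}$, $\vec{\beta}$, $\vec{n}_m$, $\vec{n}_k$ and the normalizer $\Delta$ are conventional notation introduced here as assumptions; they are not taken from the source.

```latex
% Assumed notation (standard LDA, not from the source):
%   \vec{n}_m : number of words in document m assigned to each of the K topics
%   \vec{n}_k : number of times each vocabulary word is assigned to topic k
%   \vec{\alpha}, \vec{\beta} : Dirichlet prior parameters
\Delta(\vec{\alpha}) = \frac{\prod_{k=1}^{K}\Gamma(\alpha_k)}
                            {\Gamma\!\left(\sum_{k=1}^{K}\alpha_k\right)}

% Conditional probability of the topics in document m
p(\vec{z}_m \mid \vec{\alpha}) = \frac{\Delta(\vec{n}_m + \vec{\alpha})}{\Delta(\vec{\alpha})}

% Conditional probability of the words generated by topic k
p(\vec{w}_{(k)} \mid \vec{\beta}) = \frac{\Delta(\vec{n}_k + \vec{\beta})}{\Delta(\vec{\beta})}

% Joint distribution of all words and topics
p(\vec{w}, \vec{z} \mid \vec{\alpha}, \vec{\beta})
  = \prod_{k=1}^{K} \frac{\Delta(\vec{n}_k + \vec{\beta})}{\Delta(\vec{\beta})}
    \prod_{m=1}^{M} \frac{\Delta(\vec{n}_m + \vec{\alpha})}{\Delta(\vec{\alpha})}
```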
Optionally, in this embodiment, performing the similarity judgment on the unrecognizable address elements in the Chinese address according to the address element similarity model, so as to recognize them, includes:
performing the similarity judgment on the unrecognizable address elements in the Chinese address according to the address element similarity model, to obtain the probability that each unrecognizable address element belongs to each of the different administrative regions;
recognizing the unrecognizable address elements in the Chinese address according to those probabilities.
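Once a probability per administrative region is available, the recognition step reduces to ranking. The snippet below is a minimal sketch of that final step; the function names and the probability values are invented for illustration and are not from the patent.

```python
from typing import Dict, List, Tuple


def rank_regions(region_probs: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort candidate administrative regions by descending probability."""
    return sorted(region_probs.items(), key=lambda item: item[1], reverse=True)


def pick_region(region_probs: Dict[str, float]) -> str:
    """Return the region with the largest probability."""
    return rank_regions(region_probs)[0][0]


# Hypothetical model output for one unrecognizable address element.
probs = {"朝阳区": 0.2, "海淀区": 0.7, "西城区": 0.1}
print(pick_region(probs))
```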
The detailed implementation of step S102 in a specific application scenario is illustrated below.
Each historical address is abstracted as a document, and multiple historical addresses are abstracted as a document set; each word in a document corresponds to one address element; each administrative region is abstracted as a topic.
1. First, a document-topic distribution is drawn at random for the m-th document, and the n-th topic $z_{m,n}$ of the m-th document is then drawn from it.
2. Among the K topic-word distributions obtained from the training set, the one belonging to topic $z_{m,n}$ is selected, and the word $w_{m,n}$ is drawn from it.
3. The document-topic distribution follows a Dirichlet distribution, whose physical meaning is a random mixture over the latent topics and which has its own prior probability parameter. The topics within a document follow a multinomial distribution, whose physical meaning is the multinomial distribution of one latent topic. Together they form a Dirichlet-multinomial conjugate structure.
4. This yields the formula for the conditional probability of the topics (the formula is not reproduced in this text), in which a count vector records the words in the m-th document.
5. Likewise, the topic-word distribution follows a Dirichlet distribution, whose physical meaning is a random mixture over words and which has its own prior probability parameter, while the words follow a multinomial distribution, whose physical meaning is the multinomial distribution of words. This yields the formula for the conditional probability of the words (also not reproduced in this text), in which a count vector records the words generated by the k-th topic.
6. From these two distributions, the joint probability distribution of the words and topics in the documents is obtained. Once the joint distribution of topics and words is available, the model is trained with an MCMC algorithm using Gibbs sampling to obtain the values of the two distribution variables, which completes the LDA probability model.
7. For a new unrecognizable address element, substituting the values of these variables into the formula of step 6 gives the probability that the address belongs to each administrative division; the divisions are then sorted by probability.
Because the formula in step 6 gives a different probability value for each administrative division, the combination of unrecognizable address element and administrative division with the largest probability value is preferably taken, which yields the recognized Chinese address.
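To make the training step concrete, the following is a small collapsed Gibbs sampler for LDA using only the Python standard library. It is a sketch under stated assumptions: symmetric priors `alpha` and `beta`, the function names, the toy addresses, and in particular the scoring rule in `region_scores` (topic probability proportional to the word's weight in each topic, with a uniform topic prior) are choices made for this example, not the patent's formulas.

```python
import random


def train_lda(docs, n_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of word lists."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    wid = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), n_topics
    n_mk = [[0] * K for _ in docs]      # words in document m assigned to topic k
    n_kt = [[0] * V for _ in range(K)]  # times word t is assigned to topic k
    n_k = [0] * K                       # total words assigned to topic k
    z = []                              # topic assignment of every word
    for m, d in enumerate(docs):
        zs = []
        for w in d:
            k = rng.randrange(K)        # random initial topic
            zs.append(k)
            n_mk[m][k] += 1
            n_kt[k][wid[w]] += 1
            n_k[k] += 1
        z.append(zs)
    for _ in range(iters):
        for m, d in enumerate(docs):
            for n, w in enumerate(d):
                t, k = wid[w], z[m][n]
                # Remove the current assignment from the counts.
                n_mk[m][k] -= 1
                n_kt[k][t] -= 1
                n_k[k] -= 1
                # Full conditional p(z = j | all other assignments).
                weights = [(n_mk[m][j] + alpha) * (n_kt[j][t] + beta)
                           / (n_k[j] + V * beta) for j in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[m][n] = k
                n_mk[m][k] += 1
                n_kt[k][t] += 1
                n_k[k] += 1
    # Point estimates of the topic-word and document-topic distributions.
    phi = [[(n_kt[k][t] + beta) / (n_k[k] + V * beta) for t in range(V)]
           for k in range(K)]
    theta = [[(n_mk[m][k] + alpha) / (len(d) + K * alpha) for k in range(K)]
             for m, d in enumerate(docs)]
    return vocab, theta, phi


def region_scores(word, vocab, phi):
    """Assumed scoring rule: p(topic | word) under a uniform topic prior."""
    if word not in vocab:
        return None                     # element never seen in training
    t = vocab.index(word)
    column = [phi[k][t] for k in range(len(phi))]
    total = sum(column)
    return [c / total for c in column]


# Toy historical addresses, already segmented into address elements.
docs = [["海淀区", "中关村大街", "知春路"], ["朝阳区", "建国路", "三里屯路"],
        ["海淀区", "知春路"], ["朝阳区", "建国路"]]
vocab, theta, phi = train_lda(docs, n_topics=2, iters=50)
print(region_scores("知春路", vocab, phi))
```

Note that the topics learned this way are unlabeled; mapping each topic to a named administrative region, as the text assumes, would need an additional step that this sketch does not cover.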
Fig. 2 is a schematic structural diagram of the address recognition device in Embodiment 2 of the present application. As shown in Fig. 2, the device includes:
a first unit 201, configured to segment the Chinese address so that the Chinese address can be recognized against a hash table, and to determine the unrecognizable address elements in the Chinese address;
a second unit 202, configured to perform, according to the address element similarity model, a similarity judgment on the unrecognizable address elements in the Chinese address so as to recognize them.
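The two-unit structure can be sketched as a thin wrapper around two callables. This is only an illustration of how the units cooperate: the class name, the method name `recognize`, and both demo units (which use a fixed prefix and a fixed answer) are hypothetical stand-ins, not the patent's logic.

```python
from typing import Callable, Dict, List, Tuple


class AddressRecognitionDevice:
    """Hypothetical wiring of the first unit (201) and second unit (202)."""

    def __init__(self,
                 first_unit: Callable[[str], Tuple[List[str], List[str]]],
                 second_unit: Callable[[str], str]):
        self.first_unit = first_unit    # segmentation + hash-table recognition
        self.second_unit = second_unit  # similarity judgment for leftovers

    def recognize(self, address: str) -> Tuple[List[str], Dict[str, str]]:
        recognized, unrecognized = self.first_unit(address)
        # Map each unrecognizable element to the region chosen by unit 202.
        resolved = {element: self.second_unit(element) for element in unrecognized}
        return recognized, resolved


def demo_first_unit(address: str) -> Tuple[List[str], List[str]]:
    """Stand-in for unit 201: knows a single address element."""
    known = "北京市"
    if address.startswith(known):
        rest = address[len(known):]
        return [known], ([rest] if rest else [])
    return [], [address]


def demo_second_unit(element: str) -> str:
    """Stand-in for unit 202: always answers with one fixed region."""
    return "海淀区"


device = AddressRecognitionDevice(demo_first_unit, demo_second_unit)
print(device.recognize("北京市某某路"))
```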
Optionally, in any embodiment of the present application, the first unit is further configured to:
split the Chinese address into characters, and perform word segmentation on the Chinese address according to the hash table and the characters obtained by the split;
recognize the Chinese address according to the result of the word segmentation, and determine the unrecognizable address elements in the Chinese address.
Optionally, in any embodiment of the present application, the first unit is further configured to split the Chinese address into characters and, according to the hash table and the characters obtained by the split, determine the character state of each character and its node relationship in the hash table; and to perform word segmentation on the Chinese address according to the character states and the node relationships in the hash table.
Optionally, in any embodiment of the present application, the first unit is further configured to split the Chinese address into characters and, according to the hash table and the current character and next character obtained by the split, determine the character state of the next character and the node relationship between the current character and the next character in the hash table.
Optionally, in any embodiment of the present application, the first unit is further configured to determine the unrecognizable address elements in the Chinese address according to the length of the character string of the corresponding address element in the Chinese address and the index of the character string in the hash table.
Optionally, in any embodiment of the present application, the first unit is further configured to establish the address element similarity model from unrecognizable historical address elements.
Optionally, in any embodiment of the present application, the first unit is further configured to establish the address element similarity model according to the distribution probability of address elements over historical addresses and the distribution probability of historical addresses over administrative regions.
Optionally, in any embodiment of the present application, each historical address is abstracted as a document, and each word in the document corresponds to one address element; each administrative region is abstracted as a topic.
Optionally, in any embodiment of the present application, the first unit is further configured to: determine the conditional probability of the topics according to the distribution probability of address elements over historical addresses; determine the conditional probability of the words in the documents according to the distribution probability of historical addresses over administrative regions; and establish the address element similarity model from the conditional probability of the topics and the conditional probability of the words in the documents.
Optionally, in any embodiment of the present application, the second unit is further configured to perform the similarity judgment on the unrecognizable address elements in the Chinese address according to the address element similarity model, to obtain the probability that each unrecognizable address element belongs to each of the different administrative regions; and to recognize the unrecognizable address elements in the Chinese address according to those probabilities.
The device embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, which includes any mechanism for storing or transmitting information in a form readable by a computer. For example, machine-readable media include read-only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash media, and electrical, optical, acoustic, or other forms of propagated signals (for example, carrier waves, infrared signals, and digital signals). The computer software product includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the embodiments of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Those skilled in the art will understand that the embodiments of the present invention may be provided as a method, a device (apparatus), or a computer program product. Therefore, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, device (apparatus), and computer program product according to the embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

Claims (10)

1. An address recognition method, characterized by comprising:
performing segmentation processing on an address so as to recognize the address according to a hash table, and determining the unrecognizable address elements in the address;
performing, according to an address element similarity model, a similarity judgment on the unrecognizable address elements in the Chinese address, so as to recognize the unrecognizable address elements in the Chinese address.
2. The method according to claim 1, characterized in that performing segmentation processing on the address so as to recognize the address according to the hash table, and determining the unrecognizable address elements in the address, comprises:
performing character segmentation on the address, and performing word segmentation on the address according to the hash table and the characters obtained by the character segmentation;
recognizing the address according to the result of the word segmentation, and determining the unrecognizable address elements in the address.
3. The method according to claim 2, characterized in that performing character segmentation on the address, and performing word segmentation on the address according to the hash table and the characters obtained by the character segmentation, comprises:
performing character segmentation on the address, and determining, according to the hash table and the characters obtained by the character segmentation, the character pattern of each obtained character and its node relationship in the hash table;
performing word segmentation on the address according to the character pattern of each character and its node relationship in the hash table.
4. The method according to claim 3, characterized in that performing character segmentation on the address, and determining, according to the hash table and the characters obtained by the character segmentation, the character pattern of each obtained character and its node relationship in the hash table, comprises: performing character segmentation on the address, and determining, according to the hash table and the current character and next character obtained by the character segmentation, the character pattern of the next character and the node relationship between the current character and the next character in the hash table.
5. The method according to claim 3, characterized in that determining the unrecognizable address elements in the address comprises: determining the unrecognizable address elements in the address according to the length of the character string of the corresponding address element in the address and the index of the character string in the hash table.
6. The method according to claim 1, characterized in that, before performing, according to the address element similarity model, the similarity judgment on the unrecognizable address elements in the Chinese address so as to recognize the unrecognizable address elements in the Chinese address, the method further comprises: establishing the address element similarity model according to unrecognizable historical address elements.
7. The method according to claim 6, characterized in that establishing the address element similarity model according to unrecognizable historical address elements comprises: establishing the address element similarity model according to the distribution probability of address elements and historical addresses and the distribution probability of the historical addresses and administrative regions.
8. The method according to claim 7, characterized in that each historical address is abstracted as a document, each word in the document corresponding to an address element, and the administrative region is abstracted as a topic;
correspondingly, establishing the address element similarity model according to the distribution probability of address elements and historical addresses and the distribution probability of the historical addresses and administrative regions comprises:
determining the conditional probability of the topic according to the distribution probability of address elements and historical addresses;
determining the conditional probability of the words in the document according to the distribution probability of the historical addresses and administrative regions;
establishing the address element similarity model according to the conditional probability of the topic and the conditional probability of the words in the document.
9. The method according to claim 8, characterized in that performing, according to the address element similarity model, the similarity judgment on the unrecognizable address elements in the Chinese address so as to recognize the unrecognizable address elements in the Chinese address comprises:
performing, according to the address element similarity model, the similarity judgment on the unrecognizable address elements in the Chinese address, to obtain the probability that each unrecognizable address element belongs to different administrative regions;
recognizing the unrecognizable address elements in the Chinese address according to the probability that each unrecognizable address element belongs to different administrative regions.
10. An address recognition device, characterized by comprising:
a first unit, configured to perform segmentation processing on an address so as to recognize the address according to a hash table, and to determine the unrecognizable address elements in the address;
a second unit, configured to perform, according to an address element similarity model, a similarity judgment on the unrecognizable address elements in the Chinese address, so as to recognize the unrecognizable address elements in the Chinese address.
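For illustration only, the two claimed stages can be sketched in Python under stated assumptions. The claims do not fix a data layout or an estimator, so everything named here is hypothetical: `build_hash_table` stores known address elements as nested dictionaries keyed by character (one hash table per node), `segment` uses forward maximum matching and collects the characters that match no known element, and `region_probabilities` is a simple count-based stand-in with add-one smoothing rather than the topic model of claims 7 to 9. It treats each historical address as a document, its elements as words, and the administrative region as the topic label.

```python
from collections import defaultdict

END = object()  # hypothetical marker: a complete address element ends at this node


def build_hash_table(elements):
    """Store known address elements as nested dicts keyed by character."""
    root = {}
    for element in elements:
        node = root
        for ch in element:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def segment(address, table):
    """Forward maximum matching. Returns (recognized, unrecognized) elements."""
    recognized, unrecognized = [], []
    pending = ""  # characters that match no known element so far
    i = 0
    while i < len(address):
        node, longest = table, 0
        for j in range(i, len(address)):
            node = node.get(address[j])  # follow current character -> next character
            if node is None:
                break
            if END in node:
                longest = j - i + 1
        if longest:
            if pending:
                unrecognized.append(pending)
                pending = ""
            recognized.append(address[i:i + longest])
            i += longest
        else:
            pending += address[i]
            i += 1
    if pending:
        unrecognized.append(pending)
    return recognized, unrecognized


def region_probabilities(element, history):
    """Estimate P(region | element) from (region, [elements]) history records."""
    counts = defaultdict(lambda: defaultdict(int))
    for region, elements in history:
        for e in elements:
            counts[region][e] += 1
    vocab = {e for _, es in history for e in es} | {element}
    grand_total = sum(sum(c.values()) for c in counts.values())
    scores = {}
    for region, c in counts.items():
        total = sum(c.values())
        prior = total / grand_total
        likelihood = (c.get(element, 0) + 1) / (total + len(vocab))  # add-one smoothing
        scores[region] = prior * likelihood
    norm = sum(scores.values())
    return {region: s / norm for region, s in scores.items()}
```

In this sketch, an element left over by `segment` would be passed to `region_probabilities`, and the region with the highest probability taken as its recognition result. A real system would presumably replace the count-based estimate with the trained similarity model.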
CN201711311003.7A 2017-12-11 2017-12-11 Address Recognition method and device Pending CN109947893A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711311003.7A CN109947893A (en) 2017-12-11 2017-12-11 Address Recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711311003.7A CN109947893A (en) 2017-12-11 2017-12-11 Address Recognition method and device

Publications (1)

Publication Number Publication Date
CN109947893A true CN109947893A (en) 2019-06-28

Family

ID=67004188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711311003.7A Pending CN109947893A (en) 2017-12-11 2017-12-11 Address Recognition method and device

Country Status (1)

Country Link
CN (1) CN109947893A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080319974A1 (en) * 2007-06-21 2008-12-25 Microsoft Corporation Mining geographic knowledge using a location aware topic model
CN105630765A (en) * 2015-12-21 2016-06-01 浙江万里学院 Place name address identifying method
WO2017040632A4 (en) * 2015-08-31 2017-06-22 Omniscience Corporation Event categorization and key prospect identification from storylines
CN107423295A (en) * 2016-05-24 2017-12-01 张向利 A kind of magnanimity address date intelligence fast matching method

Similar Documents

Publication Publication Date Title
CN105144164B (en) Scoring concept terms using a deep network
KR101916798B1 (en) Method and system for providing recommendation query using search context
CN104298710B (en) Automatically welcome terrestrial reference is found
CN112329467B (en) Address recognition method and device, electronic equipment and storage medium
CN107168991B (en) Search result display method and device
CN102033880A (en) Marking method and device based on structured data acquisition
CN109255564A (en) Pick-up point address recommendation method and device
CN110069626A (en) Target address recognition method, classification model training method and device
CN111989665A (en) On-device image recognition
CN108399180A (en) A kind of knowledge mapping construction method, device and server
CN108831442A (en) Point of interest recognition methods, device, terminal device and storage medium
CN111563192A (en) Entity alignment method and device, electronic equipment and storage medium
WO2021189977A1 (en) Address coding method and apparatus, and computer device and computer-readable storage medium
CN107918657A (en) The matching process and device of a kind of data source
CN108875090A (en) A kind of song recommendations method, apparatus and storage medium
CN107665188A (en) A kind of semantic understanding method and device
CN108733810A (en) A kind of address date matching process and device
CN110019617B (en) Method and device for determining address identifier, storage medium and electronic device
Vishwakarma et al. A comparative study of K-means and K-medoid clustering for social media text mining
CN111966811A (en) Intention recognition and slot filling method and device, readable storage medium and terminal equipment
CN112652189B (en) Traffic distribution method, device and equipment based on policy flow and readable storage medium
CN108763221A (en) A kind of attribute-name characterizing method and device
CN105138684A (en) Information processing method and device
CN110990451B (en) Sentence embedding-based data mining method, device, equipment and storage device
CN103699568A (en) Method for extracting hyponymy relation of field terms from wikipedia

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190628