CN102651013B - Method and system for extracting area information from enterprise name data - Google Patents

Info

Publication number
CN102651013B
Authority
CN
China
Prior art keywords
area information
data
enterprise name
information
entry
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210085428.1A
Other languages
Chinese (zh)
Other versions
CN102651013A (en)
Inventor
陈扬
王绍虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI AGILESC INFORMATION SYSTEMS CO Ltd
Original Assignee
SHANGHAI AGILESC INFORMATION SYSTEMS CO Ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for extracting area information from enterprise name data. The method comprises the following steps: A, exactly matching an enterprise name against the enterprise names in an enterprise information database; for exactly matched data, extracting the corresponding area information from the enterprise information database as the area information of the enterprise name data, and performing step B on enterprise name data that is not matched; B, splitting the enterprise name into entries according to a preset classified dictionary database and classifying the split entries, performing step D on enterprise names whose entries can all be classified, and performing step C on enterprise names that are not completely classified; C, classifying the split but unclassified entries with manual assistance according to the preset classified dictionary database; and D, extracting the area information from the classified entries as the area information of the enterprise name data. The invention also discloses a system implementing the method. The efficiency of acquiring the area information of enterprises can thereby be improved.

Description

A method and system for extracting area information from enterprise name data
Technical field
The present invention relates to data processing techniques for enterprise data, and in particular to a method and system for extracting area information from enterprise name data.
Background art
In processing enterprise data, obtaining the area information of an enterprise is a very important step. If enterprise data lacks the area information of the enterprise, the data cannot be used effectively and the quality of analyses based on it is reduced, so the enterprises concerned urgently need enterprise data that includes area information.
In most cases, enterprise names at the provincial and city level contain the area information of the enterprise, for example: Shanghai First People's Hospital. In downstream sales channels, however, enterprise names are increasingly non-standard; in particular, for enterprises at the end of the marketing channel, names that contain no area information or only incomplete area information are common. When such enterprise names are entered into enterprise data for processing, the area information of the enterprise has to be completed.
At present, area information in enterprise data is usually completed either by having the sales staff concerned collect and fill in the area information during sales, or by purchasing the service from a professional information investigation company. Both approaches clearly require a large amount of manpower for the survey work, and because the work is purely manual the whole process takes a long time and costs the enterprise considerable resources.
Summary of the invention
In view of this, the object of the invention is to provide a method and system for extracting area information from enterprise name data, so as to improve the efficiency of obtaining area information.
To achieve the above object, in a first aspect the invention provides a method for extracting area information from enterprise name data, in which national administrative division information is stored in advance at five levels (province, city, county, township and village), an index over it is built with a search engine, and characters of similar shape or similar sound that may occur in area information are collected and stored in a conversion character library. The method comprises the following steps:
A. The enterprise name in the received enterprise name data is exactly matched against the enterprise names in a preset enterprise information database, which stores enterprise names and the corresponding complete area information. For exactly matched data, the corresponding area information is extracted from the enterprise information database as the area information of the enterprise name data and is then verified; if the verification passes, the area information is output; otherwise, step B is performed. Step B is also performed on enterprise name data that is not matched.
B. The enterprise name is split into entries according to a preset classified dictionary database, and the split entries are classified. Step D is performed on enterprise names whose entries can all be classified; step C is performed on enterprise names whose entries cannot all be classified.
C. According to the preset classified dictionary database, the split but unclassified entries are classified with manual assistance;
D. Area information is extracted from the classified entries as the area information of the enterprise name data. If there is more than one piece of area information, sales volumes are divided into a plurality of grades, each grade corresponding to a reference coefficient; a ratio value is calculated for each piece of area information by multiplication with the relevant dealer's reference coefficient and division by the distance between that area and the dealer; the area information corresponding to the highest of the ratio values is taken as the area information of the enterprise;
Step D comprises:
D1. Judge whether the classified entries contain region-category information; if so, perform step D2; otherwise, perform step D4;
D2. Extract the region-category information from the classified entries and complete it into complete area information according to the administrative divisions; when completing the area information, the region-category entries are searched one by one in the order of their positions, the level of each within the administrative divisions is looked up, and its higher-level divisions are collected;
D3. Verify the area information; if the verification passes, output the area information; otherwise, output the data as unprocessable data;
D4. Perform fuzzy matching between the enterprise name in the enterprise name data and the enterprise names in the enterprise information database; for matched data, extract the corresponding area information from the enterprise information database and perform step D3; for data identified as belonging to the administrative-division category for which no matching area is found, search again after similar-shape and/or similar-sound conversion; output unmatched data as unprocessable data.
To achieve the above object, in another aspect the invention provides a system for extracting area information from enterprise name data, in which national administrative division information is stored in advance at five levels (province, city, county, township and village), an index over it is built with a search engine, and characters of similar shape or similar sound that may occur in area information are collected and stored in a conversion character library. The system comprises:
A data matching unit, which receives enterprise name data and exactly matches the enterprise name therein against the enterprise names in a preset enterprise information database; for exactly matched data, it extracts the corresponding area information from the enterprise information database as the area information of the enterprise name data; it outputs unmatched enterprise name data to the entry splitting and classification unit.
The enterprise information database stores enterprise names and the corresponding complete area information.
An entry splitting and classification unit, which splits the enterprise name into entries according to a preset classified dictionary database and classifies the split entries; it outputs enterprise names whose entries can all be classified, together with the classification information, to the area information extraction unit, and outputs enterprise names whose entries cannot all be classified to the unclassified-entry classification workbench.
An unclassified-entry classification workbench, at which the split but unclassified entries are classified with manual assistance according to the preset classified dictionary database; the classified entries are output to the area information extraction unit.
An area information extraction unit, which extracts area information from the classified entries as the area information of the enterprise name data. If there is more than one piece of area information, sales volumes are divided into a plurality of grades, each grade corresponding to a reference coefficient; a ratio value is calculated for each piece of area information by multiplication with the relevant dealer's reference coefficient and division by the distance between that area and the dealer; the area information corresponding to the highest of the ratio values is taken as the area information of the enterprise;
The system further comprises an area information verification unit;
The data matching unit first outputs the area information extracted from the enterprise information database to the area information verification unit;
The area information verification unit verifies the area information extracted from the enterprise information database; if the verification passes, the area information is output as the area information of the enterprise name data; otherwise, the area information is output to the entry splitting and classification unit;
The area information extraction unit comprises: an area information judgment module, an area information extraction module, an area information completion module and a fuzzy matching module;
The area information judgment module judges whether the classified entries contain region-category information; if so, it outputs the region-category information to the area information extraction module; otherwise, it hands the data over to the fuzzy matching module;
The area information extraction module extracts the region-category information from the classified entries and outputs it to the area information completion module;
The area information completion module completes the received region-category information into complete area information according to the administrative divisions and outputs it to the area information verification unit; when completing the area information, the region-category entries are searched one by one in the order of their positions, the level of each within the administrative divisions is looked up, and its higher-level divisions are collected;
The fuzzy matching module performs fuzzy matching between the enterprise name in the enterprise name data and the enterprise names in the enterprise information database; for matched data, it extracts the corresponding area information from the enterprise information database and outputs it to the area information verification unit; for data identified as belonging to the administrative-division category for which no matching area is found, it searches again after similar-shape and/or similar-sound conversion; it outputs unmatched data as unprocessable data;
The area information verification unit further verifies the area information received from the area information extraction unit; if the verification passes, it outputs that area information; otherwise, it outputs the data as unprocessable data.
As can be seen from the above technical solution, the method and system provided by the invention match enterprise names against the enterprise information database and, for matched data, obtain the area information from that database; unmatched data is split and classified, and the area information is obtained from the classified information, which improves the efficiency of obtaining enterprise area information.
Brief description of the drawings
Fig. 1 is a flowchart of the method for extracting area information in a preferred embodiment of the invention;
Fig. 2 is a structural diagram of the system for extracting area information in a preferred embodiment of the invention;
Fig. 3 is a structural diagram of the area information extraction unit in the embodiment shown in Fig. 2.
Detailed description of the embodiments
The invention discloses a method and system for extracting area information from enterprise name data, which can improve the efficiency of obtaining enterprise area information.
The invention is described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the method for extracting area information in a preferred embodiment of the invention comprises the following steps:
Step 101: receive enterprise name data.
Step 102: exactly match the enterprise name in the received enterprise name data against the enterprise names in the preset enterprise information database; for exactly matched data, extract the corresponding area information from the enterprise information database and perform step 103; for unmatched enterprise name data, perform step 104.
In this embodiment, the enterprise information database stores enterprise names and the corresponding complete area information.
In this embodiment, exact matching works as follows: the enterprise name in the enterprise name data is looked up in the enterprise information database; if an identical enterprise name is found, the data is exactly matched; otherwise, it is unmatched.
Step 103: verify the area information extracted from the enterprise information database; if the verification passes, perform step 111 and output the complete area information; otherwise, perform step 104.
In practice, even exactly matched data is not necessarily unique, and even when it is unique it is not necessarily accurate. This embodiment therefore adds an area verification step to further improve accuracy.
Step 104: split the enterprise name into entries according to the preset classified dictionary database and classify the split entries; for enterprise names whose entries can all be classified, perform step 106 with the names and their classification information; for enterprise names whose entries cannot all be classified, perform step 105.
Step 105: according to the preset classified dictionary database, classify the split but unclassified entries with manual assistance, generate classification information, and store the classified entries in the classified dictionary database.
In this embodiment, the classified dictionary database is built by reading and storing specialized dictionaries separately by category.
In this embodiment, the classified dictionary database is mainly divided into three major categories: region, brand name and industry nature. The region category is further divided by administrative division into five levels: province, city, county, township and village. The enterprise type category is further divided into: chain, manufacturing enterprise, distribution-end enterprise, etc. The industry nature category is divided into: pharmaceuticals, fast-moving consumer goods, etc.
For instance, suppose the enterprise name is "Jiangmen City Xinhui District Shadui Dulian Mingshantang Pharmacy"; splitting it in step 104 gives the result shown in Table 1:
Entry | Classification
Jiangmen City | Region
Xinhui District | Region
Shadui | Unknown
Dulian | Unknown
Mingshantang | Brand name
Pharmacy | Industry nature
Table 1
In this table, "Jiangmen City", "Xinhui District", "Mingshantang" and "Pharmacy" are classified entries, while "Shadui" and "Dulian" are unclassified entries. Step 105 is therefore performed on this enterprise name for manual classification. Specifically, the missing information is completed by manual data verification, and the completed information is added to the relevant specialized dictionary. If inquiry shows that "Shadui" and "Dulian" in Table 1 are both area information, namely "Shadui Town" and "Dulian Village", the attributes in the segmentation result are corrected and the relevant information is added to the region-category information in the classified dictionary database.
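The splitting and classification of step 104 can be sketched as follows. The patent only states that splitting is done according to the classified dictionary database; forward maximum matching is an assumption made here, as are the dictionary contents, the category labels and the function names. The sample name is romanized and written without spaces to imitate unsegmented Chinese text.

```python
# Hypothetical classified dictionary: entry -> category.
CLASSIFIED_DICT = {
    "JiangmenCity": "region",
    "XinhuiDistrict": "region",
    "Mingshantang": "brand name",
    "Pharmacy": "industry nature",
}


def split_and_classify(name, dictionary=CLASSIFIED_DICT):
    """Split a name by forward maximum matching; unmatched runs become 'unknown'."""
    max_len = max(map(len, dictionary))
    entries, unknown, i = [], "", 0
    while i < len(name):
        # Try the longest dictionary entry that could start at position i.
        for size in range(min(max_len, len(name) - i), 0, -1):
            piece = name[i:i + size]
            if piece in dictionary:
                if unknown:
                    entries.append((unknown, "unknown"))
                    unknown = ""
                entries.append((piece, dictionary[piece]))
                i += size
                break
        else:
            # No entry starts here: extend the current unknown run.
            unknown += name[i]
            i += 1
    if unknown:
        entries.append((unknown, "unknown"))
    return entries


def fully_classified(entries):
    """True when no entry is left as 'unknown' (step 106 can follow directly)."""
    return all(category != "unknown" for _, category in entries)


result = split_and_classify(
    "JiangmenCityXinhuiDistrictShaduiDulianMingshantangPharmacy")
for entry, category in result:
    print(entry, "|", category)
print(fully_classified(result))
```

Unlike Table 1, this sketch reports the unmatched text as a single unknown run ("ShaduiDulian"); separating it into two entries is the kind of decision left to the manual step 105.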
Step 106: according to the classification information, judge whether the classified entries contain region-category information; if so, perform step 107; otherwise, perform step 109.
In this embodiment, each split word is marked with a different identifier as its classification information, so that the categories can be distinguished.
In practice, the entries split from an enterprise name may contain no region-category data at all, as in "Provincial Construction Workers' Hospital West Market Clinic" or "Guangrentang Grand Pharmacy". For such data the area information cannot be obtained by decomposing the name and has to be obtained by performing step 109.
Step 107: extract the region-category information from the classified entries.
Step 108: according to the administrative divisions, complete the extracted region-category information into complete area information, then perform step 110.
In step 107, each piece of area information is searched and matched against the names of the national administrative divisions to determine the division it belongs to. If the enterprise name carries little information, several results may be found. For example, in a name that begins with "Pengjiang District", more than one area nationwide matches "Pengjiang District", so the area cannot be determined uniquely from the decomposition result alone. In that case all of the completed results are kept and the choice is confirmed by the area information verification of step 110.
Step 109: perform fuzzy matching between the enterprise name in the enterprise name data and the enterprise names in the enterprise information database; for matched data, extract the corresponding area information from the enterprise information database and perform step 110; output unmatched data as unprocessable data.
In this step, the system performs a fuzzy similarity search of the enterprise name against the data in the enterprise information database, and the area information of data with a matching degree of 90% or more is taken as the result for the subsequent area information verification.
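A minimal sketch of the 90% fuzzy search. The patent does not say how the matching degree is measured; the ratio from the standard library's `difflib.SequenceMatcher` is used here purely as a stand-in, and the database contents and function name are hypothetical:

```python
from difflib import SequenceMatcher


def fuzzy_match(name, db, threshold=0.9):
    """Return (stored_name, area, ratio) for entries at or above the threshold, best first."""
    hits = []
    for stored_name, area in db.items():
        ratio = SequenceMatcher(None, name, stored_name).ratio()
        if ratio >= threshold:
            hits.append((stored_name, area, ratio))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical database entry; the query below has one letter missing.
db = {"Guangrentang Grand Pharmacy": ("Example Province", "Example City")}
for stored_name, area, ratio in fuzzy_match("Guangrentang Grand Pharmcy", db):
    print(stored_name, area, round(ratio, 3))
```

Every hit is only a candidate: according to the text, its area information still goes through the verification of step 110.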
Step 110: verify the area information; if the verification passes, perform step 111 and output the complete area information; otherwise, output the data as unprocessable data.
In this embodiment, area information is verified in steps 103 and 110 as follows: the distance between the area and the region of the dealer associated with the enterprise is calculated, and it is judged whether the area and the dealer's region lie in the same administrative division; if so, the verification passes; otherwise, the verification fails.
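The same-division test can be sketched as a prefix comparison of two division chains. Which level counts as "the same administrative division" is not stated in the text; the city level (`level=2`) is an assumption here, as are the tuple layout and the function name:

```python
def verify_area(candidate_area, dealer_area, level=2):
    """Pass when both chains agree down to `level` (1 = province, 2 = city, ...)."""
    if len(candidate_area) < level or len(dealer_area) < level:
        return False
    return tuple(candidate_area[:level]) == tuple(dealer_area[:level])


dealer = ("Guangdong", "Jiangmen", "Pengjiang District")
print(verify_area(("Guangdong", "Jiangmen", "Xinhui District"), dealer))
print(verify_area(("Jiangsu", "Nanjing", "Gaochun County"), dealer))
```

The distance calculation mentioned in the text is left out of this sketch; it only covers the division comparison.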
A dealer that is not a nationwide first-tier dealer generally sells mostly within its surrounding area, since selling farther afield would incur very high logistics costs. The invention builds on this: for each piece of area information found, the distance to the region of the relevant dealer is calculated and it is judged whether the dealer and the enterprise lie in the same administrative division; if they do, the verification passes; otherwise, the verification fails.
Because the area information obtaining before checking may not be one, now just need to be using between both sides the moon sales volume etc. as relevant reference coefficient, calculate the validity score of each result, using the highest data of score as enterprise region.For example: different sales volumes is divided into a plurality of grades, each grade corresponding one with reference to coefficient, by the regional information and relevant dealer's distance and this dealer's multiplication that obtain, then divided by this distance, calculate a ratio value.From a plurality of ratio values, take out area information corresponding to mxm., as the area information of enterprise.
Step 111: output the complete area information.
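The control flow of steps 101 to 111 can be summarized in one function. This is a sketch under stated assumptions: every stage is passed in as a callable, all of the parameter names are hypothetical, and the choice among several candidates is simplified to "first candidate that passes verification" rather than the score-based choice described above.

```python
UNPROCESSABLE = None  # marker for data that falls out of the automatic flow


def extract_area(name, *, exact_match, verify, split_and_classify,
                 classify_manually, complete_regions, fuzzy_match):
    # Steps 102-103: exact match, then verification of the stored area.
    area = exact_match(name)
    if area is not None and verify(area):
        return area
    # Steps 104-105: split into entries; unknown entries go to manual classification.
    entries = split_and_classify(name)
    if any(category == "unknown" for _, category in entries):
        entries = classify_manually(entries)
    # Step 106: do the classified entries contain region information?
    regions = [text for text, category in entries if category == "region"]
    if regions:
        candidates = complete_regions(regions)  # steps 107-108
    else:
        candidates = fuzzy_match(name)          # step 109
    # Step 110: verification; step 111: output.
    for candidate in candidates:
        if verify(candidate):
            return candidate
    return UNPROCESSABLE


stages = dict(
    exact_match=lambda name: None,
    verify=lambda area: area == ("P", "C"),
    split_and_classify=lambda name: [("C", "region")],
    classify_manually=lambda entries: entries,
    complete_regions=lambda regions: [("Q", "C"), ("P", "C")],
    fuzzy_match=lambda name: [],
)
print(extract_area("Some Enterprise", **stages))
```

Injecting the stages keeps the sketch independent of any particular dictionary, database or verification rule.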
In this embodiment, so that area information extracted from an enterprise name can be completed, national administrative division information is stored in advance at five levels (province, city, county, township and village), an index over it is built with a search engine, and characters of similar shape or similar sound that may occur in area information are collected and stored in a conversion character library.
In this embodiment, when completing area information, the region-category entries are searched one by one in the order of their positions, the level of each within the administrative divisions is looked up, and its higher-level divisions are collected. For example, when "Gaochun County" is searched, it is found to be a county-level division, so the two levels above it can be completed and the result is recorded as "Gaochun County, Nanjing". Similarly, when a township-level division is searched, the three levels of province, city and county are completed.
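Completing the higher levels can be sketched as a walk up a parent table. The table layout and names below are hypothetical, and each name maps to a single parent for simplicity, whereas the text notes that a name may match several divisions:

```python
# Hypothetical excerpt of the five-level table: name -> (level, parent name).
DIVISIONS = {
    "Jiangsu Province": (1, None),
    "Nanjing City": (2, "Jiangsu Province"),
    "Gaochun County": (3, "Nanjing City"),
}


def complete_area(entry, divisions=DIVISIONS):
    """Return the chain from province down to `entry`, or None if it is unknown."""
    if entry not in divisions:
        return None
    chain = []
    current = entry
    while current is not None:
        chain.append(current)
        current = divisions[current][1]  # move to the higher-level division
    return list(reversed(chain))


print(complete_area("Gaochun County"))
```

Supporting ambiguous names would mean storing a list of (level, parent) pairs per name and returning one chain per pair, to be settled later by verification.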
In addition, in this embodiment, data identified as belonging to the administrative-division category for which no matching area is found is searched again after similar-shape and/or similar-sound conversion. This is because the enterprise names provided are mostly entered by hand, so such errors occur very easily, for example "Bozhou" entered as "Bo continent", or "Binhu District" entered as "guest lake region" (in the translated examples, each pair differs by one similar character). For such data, the corresponding similar-shape or similar-sound characters are substituted for the original characters before matching.
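A minimal sketch of the retry with character conversion. The conversion table, the region set and the function name are hypothetical, and the Chinese characters are an assumed reconstruction of the translated examples ("Bo continent", "guest lake region"), not text taken from the patent:

```python
from itertools import product

# Hypothetical conversion library: mistyped character -> characters it may stand for.
CONVERSIONS = {"洲": ["州"], "宾": ["滨"]}
# Hypothetical set of known division names.
KNOWN_REGIONS = {"亳州", "滨湖区"}


def lookup_with_conversion(entry, regions=KNOWN_REGIONS, table=CONVERSIONS):
    """Look the entry up as written, then retry with similar characters substituted."""
    if entry in regions:
        return entry
    # For each position, allow the original character or any of its conversions.
    options = [[ch] + table.get(ch, []) for ch in entry]
    for combo in product(*options):
        candidate = "".join(combo)
        if candidate in regions:
            return candidate
    return None


print(lookup_with_conversion("亳洲"))
print(lookup_with_conversion("宾湖区"))
```

Division names are short, so trying every combination of substitutions stays cheap in this sketch.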
The invention also provides a system for extracting area information from enterprise name data, which implements the above flow. As shown in Fig. 2, the system of a preferred embodiment of the invention comprises: a data matching unit 201, an enterprise information database 202, an entry splitting and classification unit 203, a classified dictionary database 204, an unclassified-entry classification workbench 205, an area information extraction unit 206 and an area information verification unit 207.
The data matching unit 201 receives enterprise name data and exactly matches the enterprise name therein against the enterprise names in the preset enterprise information database 202; for exactly matched data, it extracts the corresponding area information from the enterprise information database 202 as the area information of the enterprise name data and outputs it to the area information verification unit 207; it outputs unmatched enterprise name data to the entry splitting and classification unit 203.
The enterprise information database 202 stores enterprise names and the corresponding complete area information.
The entry splitting and classification unit 203 splits the enterprise name into entries according to the preset classified dictionary database 204 and classifies the split entries; it outputs enterprise names whose entries can all be classified, together with the classification information, to the area information extraction unit 206, and outputs enterprise names whose entries cannot all be classified to the unclassified-entry classification workbench 205.
At the unclassified-entry classification workbench 205, the split but unclassified entries are classified with manual assistance according to the preset classified dictionary database 204; the classified entries are output to the area information extraction unit 206.
The area information extraction unit 206 extracts area information from the classified entries as the area information of the enterprise name data and outputs it to the area information verification unit 207.
The area information verification unit 207 verifies the area information extracted from the enterprise information database; if the verification passes, the area information is output as the area information of the enterprise name data; otherwise, the area information is output to the entry splitting and classification unit 203. In this embodiment the area information verification unit 207 also verifies the area information received from the area information extraction unit 206; if the verification passes, it outputs that area information; otherwise, it outputs the data as unprocessable data.
As shown in Fig. 3, the area information extraction unit in this embodiment comprises: an area information judgment module 301, an area information extraction module 302, an area information completion module 303 and a fuzzy matching module 304.
The area information judgment module 301 judges whether the classified entries contain region-category information; if so, it outputs the region-category information to the area information extraction module 302; otherwise, it hands the data over to the fuzzy matching module 304.
The area information extraction module 302 extracts the region-category information from the classified entries and outputs it to the area information completion module 303.
The area information completion module 303 completes the received region-category information into complete area information according to the administrative divisions and outputs it to the area information verification unit.
The fuzzy matching module 304 performs fuzzy matching between the enterprise name in the enterprise name data and the enterprise names in the enterprise information database; for matched data, it extracts the corresponding area information from the enterprise information database and outputs it to the area information verification unit; it outputs unmatched data as unprocessable data.
In this embodiment, the unprocessable data output by the system can be handled again entirely manually, with the area information of the enterprise collected and extracted by hand.
In addition, in actual use the specialized dictionaries and the enterprise information database of the system are continually updated and expanded, and all relevant information is indexed and optimized through the search engine, so the processing efficiency of the whole system keeps improving as large volumes of data are processed.
Meanwhile, the system can be deployed as multiple instances in a distributed architecture; by splitting large data sets and processing them with several instances at the same time, processing capacity can be increased several-fold to handle very large data volumes.
As the above embodiments show, the method and system of the invention for extracting area information from enterprise name data can improve the efficiency of obtaining enterprise area information.

Claims (6)

1. A method for extracting area information from enterprise name data, characterized in that national administrative division information is stored in advance at five levels (province, city, county, township and village), an index over it is built with a search engine, and characters of similar shape or similar sound that may occur in area information are collected and stored in a conversion character library; the method comprises the following steps:
A. The enterprise name in the received enterprise name data is exactly matched against the enterprise names in a preset enterprise information database, which stores enterprise names and the corresponding complete area information; for exactly matched data, the corresponding area information is extracted from the enterprise information database as the area information of the enterprise name data and is then verified; if the verification passes, the area information is output; otherwise, step B is performed; step B is also performed on enterprise name data that is not matched;
B. The enterprise name is split into entries according to a preset classified dictionary database, and the split entries are classified; step D is performed on enterprise names whose entries can all be classified; step C is performed on enterprise names whose entries cannot all be classified;
C. According to the preset classified dictionary database, the split but unclassified entries are classified with manual assistance;
D. Area information is extracted from the classified entries as the area information of the enterprise name data; if there is more than one piece of area information, sales volumes are divided into a plurality of grades, each grade corresponding to a reference coefficient; a ratio value is calculated for each piece of area information by multiplication with the relevant dealer's reference coefficient and division by the distance between that area and the dealer; the area information corresponding to the highest of the ratio values is taken as the area information of the enterprise;
Step D comprises:
D1. Judge whether the classified entries contain region category information; if so, perform step D2; otherwise, perform step D4;
D2. Extract the region category information from the classified entries and supplement it into complete area information according to the administrative divisions; when completing the area information, search the region category entries one by one in order of their position, look up the level of each in the administrative divisions, and collect its higher-level administrative divisions;
D3. Verify the area information; if the verification passes, output the area information; otherwise, output the data as unprocessable data;
D4. Perform fuzzy matching between the enterprise name in the enterprise name data and the enterprise names in the company information database; for matched data, extract the corresponding area information from the company information database and perform step D3; for data identified as belonging to the administrative division category but for which no matching area is found, search again after similar-shape and/or sound-alike character conversion; output the data that still do not match as unprocessable data.
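The completion described in step D2 can be illustrated with a minimal sketch. The table `DIVISIONS`, the list `LEVEL_ORDER` and the function `complete_area` are hypothetical names introduced here for illustration only; the claim does not specify a data structure, and the sketch assumes that every division name is unique, which real administrative data does not guarantee.

```python
# Hypothetical lookup table (not from the patent): name -> (level, parent name).
DIVISIONS = {
    "Jiangsu": ("province", None),
    "Nanjing": ("city", "Jiangsu"),
    "Suzhou": ("city", "Jiangsu"),
    "Gulou": ("county", "Nanjing"),
}

# Levels ordered from highest to lowest.
LEVEL_ORDER = ["province", "city", "county", "township", "village"]


def complete_area(region_entries):
    """Search region entries in order of position and collect higher-level divisions."""
    found = {}
    for entry in region_entries:
        name = entry
        # Walk upward from the entry to the top of the hierarchy.
        while name in DIVISIONS:
            level, parent = DIVISIONS[name]
            # Keep the first division seen at each level.
            found.setdefault(level, name)
            name = parent
    return [found[level] for level in LEVEL_ORDER if level in found]
```

Under these assumptions, an enterprise name that only contains the entry "Gulou" is completed upward through its city and province, while an entry that is not a known division contributes nothing.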
2. the method for claim 1, it is characterized in that: described area information is verified as: calculate the distance between the dealer region that this region He Yugai enterprise is relevant, judge that whether the dealer region of this region Yu Gai enterprise is in same administrative division, if, be verified, otherwise checking is not passed through.
3. The method of any one of claims 1-2, characterized in that the exact matching in step A is: searching the company information database for the enterprise name in the enterprise name data; if an identical enterprise name is found, the match is exact; otherwise, there is no match.
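The routing of step A can be sketched as a dictionary lookup followed by verification. The sketch assumes the company information database is available as an in-memory mapping from enterprise name to area information; `step_a`, `company_db` and `verify` are hypothetical names, and the returned labels are illustrative only.

```python
def step_a(name, company_db, verify):
    """Exact match, then verification; anything else falls through to step B."""
    area = company_db.get(name)  # an exact match means an identical name is stored
    if area is not None and verify(area):
        return ("output", area)
    # No identical name, or the stored area failed verification.
    return ("step_b", None)
```

Both failure paths lead to the same place, which mirrors the claim: unmatched data and data whose stored area does not verify are both handed to step B.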
4. The method of any one of claims 1-2, characterized in that the classified dictionary database is built by reading in terminology dictionaries and storing them separately by category.
5. The method of claim 4, characterized in that in step C, the entries classified with manual assistance are further stored in the classified dictionary database according to their categories.
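The dictionary construction of claim 4, the splitting of step B and the write-back of claim 5 can be sketched together. The claims do not name a splitting algorithm; forward maximum matching is used here as a stand-in, and the function names, category labels and sample enterprise name are all hypothetical.

```python
def build_classified_dictionary(term_lists):
    """term_lists maps a category to its terms; returns a term -> category mapping."""
    dictionary = {}
    for category, terms in term_lists.items():
        for term in terms:
            dictionary[term] = category
    return dictionary


def split_and_classify(name, dictionary):
    """Forward maximum matching: take the longest dictionary term at each position.

    Characters not covered by any term are grouped into entries with category None.
    """
    max_len = max(map(len, dictionary), default=1)
    entries, unknown, i = [], "", 0
    while i < len(name):
        for size in range(min(max_len, len(name) - i), 0, -1):
            piece = name[i:i + size]
            if piece in dictionary:
                if unknown:  # flush pending unclassified characters first
                    entries.append((unknown, None))
                    unknown = ""
                entries.append((piece, dictionary[piece]))
                i += size
                break
        else:
            unknown += name[i]  # no term starts here
            i += 1
    if unknown:
        entries.append((unknown, None))
    return entries


# Usage with made-up sample data.
dictionary = build_classified_dictionary({
    "region": ["上海", "浦东"],
    "industry": ["信息系统"],
    "form": ["有限公司"],
})
entries = split_and_classify("上海锐特信息系统有限公司", dictionary)
# The entry "锐特" is unclassified; after manual classification it is stored back.
dictionary["锐特"] = "trade_name"
```

Once the manually classified entry has been stored, splitting the same name again yields only classified entries, so the name would proceed to step D without further manual work.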
6. A system for extracting area information from enterprise name data, characterized in that national administrative division information is stored in advance at five levels (province, city, county, township and village), a corresponding index is built by a search engine, and the similar-shape and sound-alike characters that may occur in area information are collected and stored in a corresponding conversion character library; the system comprises:
A data matching unit, which receives enterprise name data and exactly matches the enterprise name therein against the enterprise names in a preset company information database; for data that match exactly, it extracts the corresponding area information from the company information database as the area information of the enterprise name data; it outputs enterprise name data that do not match to the entry splitting and classification unit;
The company information database stores enterprise names and their corresponding complete area information;
An entry splitting and classification unit, which splits the enterprise name into entries according to a preset classified dictionary database and classifies the resulting entries; it outputs enterprise names whose entries can all be classified, together with the classification information, to the area information extraction unit, and outputs enterprise names whose entries cannot all be classified to the unclassified-entry classification workbench;
An unclassified-entry classification workbench, on which the unclassified entries are classified with manual assistance according to the preset classified dictionary database; the classified entries are output to the area information extraction unit;
An area information extraction unit, which extracts area information from the classified entries as the area information of the enterprise name data; if more than one piece of area information is found, the different sales volumes are divided into a plurality of grades, each grade corresponding to a reference coefficient; for each piece of area information obtained, a ratio value is calculated from the distance between that area and the relevant dealer and that dealer's reference coefficient, dividing by the distance; the area information corresponding to the highest of the ratio values is taken as the area information of the enterprise;
The system further comprises an area information verification unit;
The data matching unit first outputs the area information extracted from the company information database to the area information verification unit;
The area information verification unit verifies the area information extracted from the company information database; if the verification passes, the area information is output as the area information of the enterprise name data; otherwise, the area information is output to the entry splitting and classification unit;
The area information extraction unit comprises an area information judgment module, an area information extraction module, an area information supplement module and a fuzzy matching module;
The area information judgment module judges whether the classified entries contain region category information; if so, it outputs the region category information to the area information extraction module; otherwise, it outputs this information to the fuzzy matching module;
The area information extraction module extracts the region category information from the classified entries and outputs it to the area information supplement module;
The area information supplement module supplements the received region category information into complete area information according to the administrative divisions and outputs it to the area information verification unit; when completing the area information, it searches the region category entries one by one in order of their position, looks up the level of each in the administrative divisions, and collects its higher-level administrative divisions;
The fuzzy matching module performs fuzzy matching between the enterprise name in the enterprise name data and the enterprise names in the company information database; for matched data, it extracts the corresponding area information from the company information database and outputs it to the area information verification unit; for data identified as belonging to the administrative division category but for which no matching area is found, it searches again after similar-shape and/or sound-alike character conversion, and outputs the data that still do not match as unprocessable data;
The area information verification unit further verifies the area information received from the area information extraction unit; if the verification passes, it outputs the area information received from the area information extraction unit; otherwise, it outputs the data as unprocessable data.
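The selection among several candidate areas performed by the area information extraction unit can be sketched as follows. The translated claim wording is ambiguous about the exact formula; this sketch assumes one possible reading, in which the ratio value is the dealer's reference coefficient divided by the distance between the candidate area and that dealer. The grade thresholds, coefficients and names (`sales_coefficient`, `pick_area`) are hypothetical.

```python
def sales_coefficient(sales, grades):
    """grades: list of (minimum_sales, coefficient), sorted by ascending minimum_sales."""
    coefficient = grades[0][1]
    for minimum, value in grades:
        if sales >= minimum:
            coefficient = value  # highest grade whose threshold is reached
    return coefficient


def pick_area(candidates, grades):
    """candidates: list of (area, dealer_sales, distance); the highest ratio wins.

    Assumed reading of the claim: ratio = reference coefficient / distance.
    """
    def ratio(candidate):
        _, sales, distance = candidate
        # Guard against a zero distance when the dealer is in the candidate area.
        return sales_coefficient(sales, grades) / max(distance, 1e-9)

    return max(candidates, key=ratio)[0]
```

Under this reading, a nearby dealer with modest sales can outweigh a distant dealer with high sales, and the grade coefficients control how strongly sales volume offsets distance.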
CN201210085428.1A 2012-03-23 2012-03-23 Method and system for extracting area information from enterprise name data Active CN102651013B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210085428.1A CN102651013B (en) 2012-03-23 2012-03-23 Method and system for extracting area information from enterprise name data

Publications (2)

Publication Number Publication Date
CN102651013A CN102651013A (en) 2012-08-29
CN102651013B true CN102651013B (en) 2014-04-16

Family

ID=46693021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210085428.1A Active CN102651013B (en) 2012-03-23 2012-03-23 Method and system for extracting area information from enterprise name data

Country Status (1)

Country Link
CN (1) CN102651013B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103279523A (en) * 2013-05-29 2013-09-04 北京京东尚科信息技术有限公司 Method and device for processing address information
CN104036344A (en) * 2014-05-16 2014-09-10 上海倍通医药科技咨询有限公司 Method for standardizing enterprise names
CN107463583A (en) * 2016-06-06 2017-12-12 广州泰尔智信科技有限公司 Application developer region determines method and apparatus
CN106126614A (en) * 2016-06-21 2016-11-16 山东合天智汇信息技术有限公司 Method and system for tracing multi-layer association paths between two enterprises
CN109408561A (en) * 2018-10-17 2019-03-01 杭州骑轻尘信息技术有限公司 Business Name matching process and device
CN109637671A (en) * 2018-11-13 2019-04-16 郭金荣 A kind of Adverse reaction monitoring management analysis method
CN109902148B (en) * 2019-02-21 2023-05-26 陈包容 Automatic enterprise name completion method for address book contacts
CN110990427B (en) * 2019-12-16 2024-05-10 北京智游网安科技有限公司 Method, system and storage medium for counting application program affiliated area
CN112597284B (en) * 2021-03-08 2021-06-15 中邮消费金融有限公司 Company name matching method and device, computer equipment and storage medium
CN113268986B (en) * 2021-05-24 2024-05-24 交通银行股份有限公司 Unit name matching and searching method and device based on fuzzy matching algorithm

Citations (3)

Publication number Priority date Publication date Assignee Title
CN101127050A (en) * 2007-07-03 2008-02-20 北京大学 Method for automatically extracting website owner administrative apanage information from web page
CN101206121A (en) * 2006-09-20 2008-06-25 高德软件有限公司 Placename retrieval device
CN101388023A (en) * 2008-09-12 2009-03-18 北京搜狗科技发展有限公司 Electronic map interest point data redundant detecting method and system

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US9311649B2 (en) * 2009-12-20 2016-04-12 Iheartmedia Management Services, Inc. System and method for managing media advertising enterprise data

Also Published As

Publication number Publication date
CN102651013A (en) 2012-08-29

Similar Documents

Publication Publication Date Title
CN102651013B (en) Method and system for extracting area information from enterprise name data
CN102156740B (en) SQL (structured query language) statement processing method and system
CN103064970B (en) Optimize the search method of interpreter
CN102622396B (en) A kind of web services clustering method based on label
CN102402615B (en) Method for tracking source information based on structured query language (SQL) sentences
CN111127068B (en) Automatic pricing method and device for engineering quantity list
CN103823896A (en) Subject characteristic value algorithm and subject characteristic value algorithm-based project evaluation expert recommendation algorithm
CN103793422A (en) Methods for generating cube metadata and query statements on basis of enhanced star schema
CN104679827A (en) Big data-based public information association method and mining engine
CN104866576A (en) Method and apparatus for automatically constructing Data Vault-modeled data warehouse
CN106294498A (en) A kind of data processing method and equipment
CN106407394A (en) A patent database management analysis method
CN105095436A (en) Automatic modeling method for data of data sources
CN108345689B (en) Trademark registration success rate query method and device, and trademark registration method and device
CN103294820A (en) WEB page classifying method and system based on semantic extension
CN106980639B (en) Short text data aggregation system and method
CN101256594A (en) Method and system for measuring graph structure similarity
CN116881430A (en) Industrial chain identification method and device, electronic equipment and readable storage medium
CN107305555A (en) Data processing method and device
CN116739626A (en) Commodity data mining processing method and device, electronic equipment and readable medium
EP2518668A1 (en) Apparatus and method for visualizing technology transition
CN104573098B (en) Extensive object identifying method based on Spark systems
Caron et al. Identification of organization name variants in large databases using rule-based scoring and clustering: With a case study on the web of science database
CN116561345A (en) Information knowledge graph construction method based on multi-mode data company
CN111090630A (en) Data fusion processing method based on multi-source spatial point data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant