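201040752
VI. Description of the Invention:

[Technical Field of the Invention]
The present invention relates to the field of Internet technology, and in particular to a method and system for providing regionalized information.

[Prior Art]
The Internet can provide information in many forms, such as news, sports and entertainment, together with the rich content offered by forums (BBS), blogs, photo albums, video sites and similar websites.
At present, this information is provided by each website on its own initiative. Even where Internet information can be searched or classified by keyword, for example the keyword search results that some portal sites offer through a search engine, the most that can be done is to collect, from the large pool of crawled web pages, the items in which the keyword appears. For example, searching for news with the keyword "Beijing" returns every news item that contains "Beijing", yet many of those items did not take place in Beijing, so the user's actual intent of searching for news by region is not met.
As Internet technology develops and user demand grows, a technique for providing regionalized information is needed, but the prior art offers none.

[Summary of the Invention]
An object of the embodiments of the present invention is to provide a method and system for providing regionalized information.
To solve the above technical problem, the embodiments provide the following methods and systems.
A method for providing regionalized information, comprising:
extracting geographic information from document data;
looking up, in a preset geographic database, the geographic attribute corresponding to the extracted geographic information, and tagging the document data with the geographic attribute found;
obtaining the geographic attribute of a user;
providing the user with the document data whose tagged geographic attribute matches the user's geographic attribute.
A method for providing regionalized information, comprising:
extracting geographic information from document data;
obtaining the geographic attribute of a user;
providing the user with the document data whose geographic information matches the user's geographic attribute.
A system for providing regionalized information, comprising:
a document-data geographic information extraction module, for extracting geographic information from document data;
a geographic database, for storing place names, words that denote geographic information, and the membership relations among geographic information;
a tagging module, for looking up in the geographic database the geographic attribute corresponding to the extracted geographic information and tagging the document data with the geographic attribute found;
a user geographic attribute acquisition module, for obtaining the user's geographic attribute;
an output module, for providing the user with the document data whose tagged geographic attribute matches the user's geographic attribute.
A system for providing regionalized information, comprising:
a document-data geographic information extraction module, for extracting geographic information from document data;
a user geographic attribute acquisition module, for obtaining the user's geographic attribute;
an output module, for providing the user with the document data that matches the user's geographic attribute.
As the above technical solution shows, geographic information is extracted from document data, the corresponding geographic attribute is looked up in a preset geographic database and tagged on the document data, the user's geographic attribute is obtained, and the document data whose tag matches the user's geographic attribute is provided to the user, so that regionalized information suited to the user can be provided. In addition, the method embodiment regionalizes both users and document data, which helps users find the information they need faster and more precisely.

[Embodiments]
The embodiments of the present invention provide a method and system for providing regionalized information.
To help those skilled in the art better understand the solution, the embodiments are described in further detail below with reference to the drawings and specific implementations.
A method embodiment is described first. Figure 1 shows its flow. As shown in Figure 1, the method embodiment includes the following steps.

S101: Extract geographic information from document data.
Much of the content that exists on the Internet as web pages (news, sports, entertainment, blogs, forums, photo albums, videos and so on) contains geographic information, such as a province, city or district, and this information in web pages is generally document data. This step extracts the geographic information from the content of the document data.
Two specific implementations of the step are given below.
Mode 1:
A place-name lexicon can be preset, in which place names are stored. Examples are the names of provincial-level divisions (provinces, municipalities directly under the central government, autonomous regions, special administrative regions), prefecture-level divisions (prefecture-level cities, prefectures, autonomous prefectures, leagues), county-level divisions (municipal districts, county-level cities, counties, autonomous counties, banners, autonomous banners, special districts, forestry districts), township-level divisions (towns, townships, subdistricts, sumu) and village-level divisions (communities, residents' committees, villages).
The lexicon can also include any word that denotes geographic information, such as university names, point of interest (POI) names, company names, specialty names, community names and scenic-spot names, because these words can also stand for geographic information. For example, the university Tsinghua University can stand for the Wudaokou area of Haidian District, Beijing; the point of interest Maojia Restaurant (Lanbao branch) can stand for Lanbao International Center on Xidawang Road, Beijing; the company Taobao can stand for No. 391 Wen'er Road, Hangzhou, Zhejiang Province; the specialty West Lake Longjing can stand for Xihu District, Hangzhou, Zhejiang Province; the scenic spot Summer Palace can stand for Haidian District, Beijing; the community Sunshine 100 can stand for Xidawang Road, Chaoyang District, Beijing; and so on.
The mode may then include: extracting the geographic information in the document data according to the preset place-name lexicon.
Put simply, this step finds the place names in the document data that appear in the preset place-name lexicon. The lexicon can be used to find place names or other geographic information in the document data, the other geographic information being the university names, POI names, company names, specialty names, community names, scenic-spot names and so on described above. The extraction can be implemented in many ways, which are not detailed here.
Mode 2:
A place-name suffix lexicon can be preset, in which place-name suffixes are stored, for example province, city, county, township, district, road and street.
The mode may then include: searching the document data for place-name suffixes from the preset suffix lexicon, and taking the word that consistently appears immediately before the suffix as the geographic information of the document data.
For example, if the suffix "city" (市) from the suffix lexicon is found in the document data and the word consistently appearing before it is "Beijing", then "Beijing" can be taken as the geographic information of the document data.
Two ways of extracting geographic information from document data are given here as embodiments. Those skilled in the art will know that other ways exist, and the scope of the invention should cover these different implementations of the step.
In practice, several different pieces of geographic information may appear in the same document data, so extraction in either of the two modes above, or in some other way, may yield several different results. In general, however, the content of one piece of document data has a single central geographic information. For example, document data about the Sichuan earthquake yields the geographic information "Sichuan", but the news may also mention aid to Sichuan from other provinces and cities, so "Guangdong" and "Beijing" are extracted as well. Among these, Sichuan should be the central geographic information.
One way of determining the central geographic information among several extracted pieces is as follows:
Among the pieces of geographic information extracted from the same document data, the one that appears most often is taken as the central geographic information, that is, as the final geographic information of the document data.
In the example above, if Sichuan appears 6 times in the document data, Beijing 2 times and Guangdong once, then Sichuan, with the most appearances, is determined to be the central geographic information and therefore the final geographic information of the document data.
Taking the Sichuan earthquake news again, the document data may mostly mention the cities, counties and autonomous areas within Sichuan that were hit, while also mentioning aid from provinces and cities such as Beijing and Guangdong. In that case Sichuan, Beijing and Guangdong may each appear the same number of times, yet Sichuan should still be the central geographic information.
Another way of determining the central geographic information is therefore as follows:
For the extracted pieces of geographic information, the appearances of the geographic information to which they belong are counted according to administrative-division membership; among the extracted geographic information and the counted parent geographic information, the one with the most appearances is taken as the central geographic information, that is, as the final geographic information of the document data.
For example, if the document data contains Sichuan once, Wenchuan once, Mianzhu once, Beichuan once, Beijing once and Guangdong once, then, because Wenchuan, Mianzhu and Beichuan all belong to the administrative division of Sichuan, they count as 3 appearances of Sichuan; adding the 1 direct appearance gives Sichuan a total of 4, while Beijing and Guangdong each have 1. Sichuan has the most appearances and is taken as the central geographic information, that is, the final geographic information of the document data.
The administrative-division membership can be obtained from a preset geographic database. Besides all the names in the place-name lexicon described above, this database holds the membership relations among all geographic information. For example, it contains the provincial-level division Sichuan; under Sichuan are all its prefecture-level divisions, under each city its county-level divisions, under each county the divisions below it, and so on, and likewise for the other provincial-level divisions. The geographic database can also include country-level geographic information, with each country containing its own states, provinces and other divisions, which is not elaborated here.
With the preset geographic database, the appearances of parent geographic information can thus be counted for the extracted pieces according to administrative-division membership.
Note that the geographic database described here can serve as the preset place-name lexicon of Mode 1.

S102: Look up, in the preset geographic database, the geographic attribute corresponding to the extracted geographic information, and tag the document data with that geographic attribute.
The preset geographic database here can be the same as the geographic database in S101. It stores place names. Examples are, at the provincial level,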
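provinces, municipalities directly under the central government, autonomous regions and special administrative regions; at the prefecture level, prefecture-level cities, prefectures, autonomous prefectures and leagues; at the county level, municipal districts, county-level cities, counties, autonomous counties, banners, autonomous banners, special districts and forestry districts; at the township level, towns, townships, subdistricts and sumu; and at the village level, communities, residents' committees and villages. Membership relations of administrative division also exist between place names. Taking China as an example, it contains provincial-level divisions (provinces, municipalities directly under the central government, special administrative regions, autonomous regions); a provincial-level division contains prefecture-level divisions (prefecture-level cities, prefectures, autonomous prefectures, leagues); a prefecture-level division contains county-level divisions (municipal districts, county-level cities, counties, autonomous counties, banners, autonomous banners, special districts, forestry districts and others); a county-level division contains township-level divisions (towns, townships, subdistricts, sumu and others); and a township-level division contains village-level divisions (communities, residents' committees, villages). Figure 2 shows the structure of the preset geographic database; the place names and their membership relations can be organized as in Figure 2.
In particular, the preset geographic database can also include any word that denotes geographic information, such as university names, POI names, company names, specialty names, community names and scenic-spot names, because these words can also stand for geographic information. As before, the university Tsinghua University can stand for the Wudaokou area of Haidian District, Beijing; the point of interest Maojia Restaurant (Lanbao branch) can stand for Lanbao International Center on Xidawang Road, Beijing; the company Taobao can stand for No. 391 Wen'er Road, Hangzhou, Zhejiang Province; the specialty West Lake Longjing can stand for Xihu District, Hangzhou, Zhejiang Province; the scenic spot Summer Palace can stand for Haidian District, Beijing; the community Sunshine 100 can stand for Xidawang Road, Chaoyang District, Beijing; and so on. These words also have administrative-division membership relations with the geographic information in the preset geographic database.
After S101 extracts the geographic information from the document data, the corresponding geographic attribute can be looked up in the preset geographic database and tagged on the document data.
For example, if the extracted geographic information is "Dawang Road", the geographic attribute "Beijing-Chaoyang District-Dawang Road" can be found in the preset geographic database, and the document data can be tagged with it, for example with the complete "Beijing-Chaoyang District-Dawang Road".

S103: Obtain the user's geographic attribute.
A user has a geographic attribute, for example the location from which the user's terminal accesses the Internet. This location can be indicated by the IP address with which the terminal accesses the Internet.
For example, if the terminal's current IP address is 202.115.33.3, a lookup of the Internet Protocol (IP) address shows that it belongs to "Sichuan University Engineering Design Center", whose complete address is "Sichuan Province-Chengdu-Sichuan University Engineering Design"; this address can serve as the user's geographic attribute.
The user's geographic attribute can thus be obtained by looking up the IP address of the user's terminal.
The user's geographic attribute can also be an address the user has registered, such as a home, school or work address, obtained by looking up the registered address.
It can also be a location set by the user. For example, if the user has set Xiamen, S103 obtains this geographic attribute by looking up the location the user set.
It can also be derived from the user's latitude and longitude. For example, if the user has fixed the current latitude and longitude with a handheld GPS device, the user's current location can be obtained from that information.
It can also be derived from the user's browsing focus. For example, if within a certain period the user searches or queries the Internet for Jiuzhaigou, the user quite possibly intends to travel there in the near future, so the location searched or queried during that period gives the user's geographic attribute.
There are many ways to obtain the user's geographic attribute and only a few are listed above; those skilled in the art will know that the invention is not limited to them. Whether the attribute comes from the user's current IP address, a registered location, a location set by the user, latitude and longitude, browsing focus or some other source, all such ways fall within the scope of the embodiments.

S104: Provide the user with the document data whose tagged geographic attribute matches the user's geographic attribute.
In this step, the geographic attribute tagged on the document data is first matched against the obtained user geographic attribute; if they match, the corresponding document data is provided to the user.
Note that the document data in S101 and S102 can be very numerous, for example, as with a search engine, all the document data on the Internet crawled on a given day or over several days. The services of existing websites are fully capable of collecting all document data on the Internet.
As mentioned above, the geographic attributes tagged on document data can span different administrative levels; for example, some document data is tagged with the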
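complete geographic attribute "Beijing-Chaoyang District-Dawang Road", other document data with "Beijing-Chaoyang District-Jianguomen", and still other document data with "Beijing-Chaoyang District". If the user's location is Dawang Road, the document data tagged "Beijing-Chaoyang District-Dawang Road" can be provided to the user, and the document data tagged "Beijing-Chaoyang District-Jianguomen" is not.
Document data tagged "Beijing-Chaoyang District" can also be provided. In that case the document data under "Beijing-Chaoyang District" can include the data tagged "Beijing-Chaoyang District-Dawang Road", the data tagged "Beijing-Chaoyang District-Jianguomen", and any other data whose tag includes "Beijing-Chaoyang District".
Document data at these different administrative levels can be provided level by level, giving the user stepwise regional navigation while browsing, or offered for the user to choose from, for example letting the user select the document data of the Beijing level or of the Chaoyang District level.
Providing document data to the user, as those skilled in the art and ordinary users will understand, includes sending the web pages that contain the document data to the browser page of the user's terminal in the form of title links.
Note that there is no strict order between S101, S102 and S103; S103 can be executed first, followed by S101 and S102. It suffices that the results of S102 and S103 are available before S104.
As the method embodiment shows, geographic information is extracted from document data, the corresponding geographic attribute is looked up in a preset geographic database and tagged on the document data, the user's geographic attribute is obtained, and the document data whose tag matches the user's geographic attribute is provided to the user, so that regionalized information suited to the user can be provided. In addition, the method embodiment regionalizes the document data, which helps users find the information they need faster and more precisely.
The method embodiment has a wide range of applications, for example the following products and services:
1) News search
With the method embodiment, news can be classified by region automatically to produce aggregated local news, and relevant news can then be recommended proactively according to the regional attribute of the visiting user, truly "telling the stories around ordinary people". Because each news item carries complete regional information, the user's browsing can also be given level-by-level regional navigation.
2) Life service information
Life services are currently a hot spot of the Chinese Internet. Compared with other information, life service information places more weight on being regional, accurate and timely. With the method embodiment, life service information can be aggregated effectively by region. When a user visits, the user's regional attribute can be identified and, together with the regionalized data, local life service information can be pushed to the user proactively, which makes the information more convenient to use and improves the effect and efficiency of the service. For example, restaurant information, discount information, housing rental and sale information, hourly-worker information and similar life service information for a given area can be provided specifically to users related to that area, which helps them learn about life services there and improves the effect and efficiency of the service.
3) Community and social network services (Society Network Service, SNS)
Current community services include forums, blogs, photo albums and groups. They mostly aggregate information by topic, and users mostly retrieve data by keyword. After the data has been regionalized with the method embodiment, it can be aggregated by place and by regional level, and users can likewise be guided and given recommendations according to their regional attribute. Users can easily see what people nearby are paying attention to, circles of friends can be organized by region to form a social network, and online social contact and interaction are strengthened.
In summary, by regionalizing Internet information and identifying the user's regional attribute, the large volume of data on the Internet can be provided to users according to their geographic attribute, which improves the efficiency and effect with which users obtain data and information and has broad application prospects in Internet services.

Another method embodiment is described next. Figure 3 shows its flowchart. As shown in Figure 3, the method embodiment may include:
S301: Extract geographic information from document data.
This step is similar to S101 and can be implemented in two modes:
Mode 1: extract the geographic information in the document data according to a preset place-name lexicon, in which place names and words that denote geographic information are stored.
Mode 2: search the document data for place-name suffixes from a preset place-name suffix lexicon, and take the word that consistently appears immediately before the suffix as the geographic information of the document data.
For details of the two modes, see the corresponding modes in S101; they are not repeated here.
In practice, several different pieces of geographic information may appear in the same document data. After extraction in either of the two modes above, or in some other way, the document data may yield several different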
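pieces of geographic information. In general, however, the content of one piece of document data has a single central geographic information.
Two ways of determining the central geographic information among several extracted pieces are as follows:
Mode 1: among the pieces of geographic information extracted from the same document data, the one that appears most often is taken as the final geographic information of the document data.
Mode 2: for the extracted pieces of geographic information, the appearances of the geographic information to which they belong are counted according to administrative-division membership, using a preset geographic database that stores place names, words that denote geographic information, and the membership relations among geographic information; among the extracted geographic information and the counted parent geographic information, the one with the most appearances is taken as the final geographic information of the document data.
These two ways are similar to the corresponding ways in S101 and are not repeated here.
S302: Obtain the user's geographic attribute.
The user's geographic attribute is obtained by looking up the IP address of the user's terminal; or
by looking up the address the user has registered; or
by looking up the location set by the user; or
from the user's latitude and longitude; or
from the user's browsing focus.
This step is similar to S103.
S303: Provide the user with the document data whose geographic information matches the user's geographic attribute.
In this step, the geographic information of the document data is matched directly against the obtained user geographic attribute; if they match, the corresponding document data is provided to the user.
Document data at different administrative levels can be provided level by level, giving the user stepwise regional navigation while browsing, or offered for the user to choose from, for example letting the user select the document data of a given administrative level.
Specifically, the document data whose geographic information matches the user's geographic attribute can be provided to the user graded by administrative level, or offered for the user's selection.
Providing document data to the user, as those skilled in the art and ordinary users will understand, includes sending the web pages that contain the document data to the browser page of the user's terminal in the form of title links.
Note that there is no strict order between S301 and S302; S302 can be executed first, followed by S301. It suffices that the results of S301 and S302 are available before S303.

An embodiment of the system for providing regionalized information is described next. Figure 4 shows its block diagram. As shown in Figure 4, the system embodiment may include:
a document-data geographic information extraction module 41, for extracting geographic information from document data;
a geographic database 42, for storing place names, words that denote geographic information, and the membership relations among geographic information;
a tagging module 43, for looking up in the geographic database the geographic attribute corresponding to the extracted geographic information and tagging the document data with the geographic attribute found;
a user geographic attribute acquisition module 44, for obtaining the user's geographic attribute;
an output module 45, for providing the user with the document data whose tagged geographic attribute matches the user's geographic attribute.
Preferably, as shown in Figure 5, the system may further include, in addition to what is shown in Figure 4, a place-name lexicon 51 in which place names and words that denote geographic information are stored. The extraction module 41 can then extract the geographic information in the document data according to the place-name lexicon.
Note that the place-name lexicon 51 can be located inside the extraction module 41 or elsewhere in the system outside it.
As shown in Figure 6, the system may further include, in addition to what is shown in Figure 4, a place-name suffix lexicon 61. The extraction module 41 searches the document data for place-name suffixes from the preset suffix lexicon 61 and takes the word that consistently appears immediately before the suffix as the geographic information of the document data.
Note that the place-name suffix lexicon 61 can be located inside the extraction module 41 or elsewhere in the system outside it.
The system may further include, in addition to what is shown in Figure 4, Figure 5 or Figure 6, a central geographic information determination module 71; only the case based on Figure 4 is shown, in Figure 7. Module 71 determines, among the pieces of geographic information that the extraction module 41 extracts from the same document data, the one that appears most often as the final geographic information of the document data.
The system may also further include, in addition to what is shown in Figure 4, Figure 5 or Figure 6, a central geographic information determination module 81 and a preset geographic database 82; only the case based on Figure 4 is shown, in Figure 8.
For the pieces of geographic information that the extraction module 41 extracts from the same document data, the central geographic information determination module 81 counts, according to the preset geographic database 82 and administrative-division membership, the appearances of the geographic information to which they belong, and takes the one with the most appearances among the extracted geographic information and the counted parent geographic information as the final geographic information of the document data; the geographic database 82 stores place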
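names, words that denote geographic information, and the membership relations among geographic information.
Preferably, in the system, the user geographic attribute acquisition module 44 obtains the user's geographic attribute by looking up the IP address of the user's terminal; or
by looking up the address the user has registered; or
by looking up the location set by the user; or
from the user's latitude and longitude; or
from the user's browsing focus.

Another embodiment of the system for providing regionalized information is described next. Figure 9 shows its block diagram. As shown in Figure 9, the system embodiment may include:
a document-data geographic information extraction module 91, for extracting geographic information from document data;
a user geographic attribute acquisition module 92, for obtaining the user's geographic attribute;
an output module 93, for providing the user with the document data that matches the user's geographic attribute.
Preferably, as shown in Figure 10, the system may further include a place-name lexicon 101 in which place names and words that denote geographic information are stored. The extraction module 91 then extracts the geographic information in the document data according to the place-name lexicon.
Note that the place-name lexicon 101 can be located inside the extraction module 91 or elsewhere in the system outside it.
Preferably, as shown in Figure 11, the system may further include a place-name suffix lexicon 111. The extraction module 91 searches the document data for place-name suffixes from the preset suffix lexicon 111 and takes the word that consistently appears immediately before the suffix as the geographic information of the document data.
Note that the place-name suffix lexicon 111 can be located inside the extraction module 91 or elsewhere in the system outside it.
Preferably, the system may further include, in addition to what is shown in Figure 9, Figure 10 or Figure 11, a central geographic information determination module 121; only the case based on Figure 9 is shown, in Figure 12. Module 121 determines, among the pieces of geographic information that the extraction module 91 extracts from the same document data, the one that appears most often as the final geographic information of the document data.
Note that the central geographic information determination module 121 can be located inside the extraction module 91 or elsewhere in the system outside it.
Preferably, the system may further include, in addition to what is shown in Figure 9, Figure 10 or Figure 11, a central geographic information determination module 131 and a preset geographic database 132; only the case based on Figure 9 is shown, in Figure 13.
For the pieces of geographic information that the extraction module 91 extracts from the same document data, the central geographic information determination module 131 counts, according to the preset geographic database 132 and administrative-division membership, the appearances of the geographic information to which they belong, and takes the one with the most appearances among the extracted geographic information and the counted parent geographic information as the final geographic information of the document data; the geographic database 132 stores place names, words that denote geographic information, and the membership relations among geographic information.

201040752
VI. Description of the Invention:

[Technical Field]
The present invention relates to the field of Internet technology, and in particular to a method and system for providing regionalized information.

[Prior Art]
The Internet can provide information in many forms, such as news, sports and entertainment, together with the rich content offered by forums (BBS), blogs, photo albums, video sites and similar websites. At present, this information is provided by each website on its own initiative.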
Even where Internet information can be searched or classified by keyword, for example the keyword search results that some portal sites offer through a search engine, the most that can be done is to collect, from the large pool of crawled web pages, the items in which the keyword appears. For example, searching for news with the keyword "Beijing" returns every news item that contains "Beijing", yet many of those items did not take place in Beijing, so the user's actual intent of searching for news by region is not met.
As Internet technology develops and user demand grows, a technique for providing regionalized information is needed, but the prior art offers none.

[Summary of the Invention]
An object of the embodiments of the present invention is to provide a method and system for providing regionalized information. To solve the above technical problem, the embodiments provide the following methods and systems.
A method for providing regionalized information includes: extracting geographic information from document data; looking up, in a preset geographic database, the geographic attribute corresponding to the extracted geographic information, and tagging the document data with the geographic attribute found; obtaining the user's geographic attribute; and providing the user with the document data whose tagged geographic attribute matches the user's geographic attribute.
A second method for providing regionalized information includes: extracting geographic information from document data; obtaining the user's geographic attribute; and providing the user with the document data whose geographic information matches the user's geographic attribute.
A system for providing regionalized information includes: a document-data geographic information extraction module, for extracting geographic information from document data; a geographic database, for storing place names, words that denote geographic information, and the membership relations among geographic information; a tagging module, for looking up in the geographic database the geographic attribute corresponding to the extracted geographic information and tagging the document data with the geographic attribute found; a user geographic attribute acquisition module, for obtaining the user's geographic attribute; and an output module, for providing the user with the document data whose tagged geographic attribute matches the user's geographic attribute.
A second system for providing regionalized information includes: a document-data geographic information extraction module, for extracting geographic information from document data; a user geographic attribute acquisition module, for obtaining the user's geographic attribute; and an output module, for providing the user with the document data that matches the user's geographic attribute.
As the above technical solution shows, geographic information is extracted from document data, the corresponding geographic attribute is looked up in a preset geographic database and tagged on the document data, the user's geographic attribute is obtained, and the document data whose tag matches the user's geographic attribute is provided to the user, so that regionalized information suited to the user can be provided.
In addition, the method embodiment regionalizes both users and document data, which helps users find the information they need faster and more precisely.

[Embodiments]
The embodiments of the present invention provide a method and system for providing regionalized information. To help those skilled in the art better understand the solution, the embodiments are described in further detail below with reference to the drawings and specific implementations.
A method embodiment is described first. Figure 1 shows its flow. As shown in Figure 1, the method embodiment includes the following steps.
S101: Extract geographic information from document data.
Much of the content that exists on the Internet as web pages (news, sports, entertainment, blogs, forums, photo albums, videos and so on) contains geographic information, such as a province, city or district, and this information in web pages is generally document data. This step extracts the geographic information from the content of the document data. Two specific implementations of the step are given below.
Mode 1: A place-name lexicon can be preset, in which place names are stored.
Examples are the names of provincial-level divisions (provinces, municipalities directly under the central government, autonomous regions, special administrative regions), prefecture-level divisions (prefecture-level cities, prefectures, autonomous prefectures, leagues), county-level divisions (municipal districts, county-level cities, counties, autonomous counties, banners, autonomous banners, special districts, forestry districts), township-level divisions (towns, townships, subdistricts, sumu) and village-level divisions (communities, residents' committees, villages).
The lexicon can also include any word that denotes geographic information, such as university names, point of interest (POI) names, company names, specialty names, community names and scenic-spot names, because these words can also stand for geographic information. For example, the university Tsinghua University can stand for the Wudaokou area of Haidian District, Beijing; the point of interest Maojia Restaurant (Lanbao branch) can stand for Lanbao International Center on Xidawang Road, Beijing; the company Taobao can stand for No. 391 Wen'er Road, Hangzhou, Zhejiang Province; the specialty West Lake Longjing can stand for Xihu District, Hangzhou, Zhejiang Province; the scenic spot Summer Palace can stand for Haidian District, Beijing; the community Sunshine 100 can stand for Xidawang Road, Chaoyang District, Beijing; and so on.
The mode may then include: extracting the geographic information in the document data according to the preset place-name lexicon. Put simply, this step finds the place names in the document data that appear in the preset place-name lexicon.
The lexicon can be used to find place names or other geographic information in the document data, the other geographic information being the university names, POI names, company names, specialty names, community names, scenic-spot names and so on described above. The extraction can be implemented in many ways, which are not detailed here.
Mode 2: A place-name suffix lexicon can be preset, in which place-name suffixes are stored, for example province, city, county, township, district, road and street. The mode may then include: searching the document data for place-name suffixes from the preset suffix lexicon, and taking the word that consistently appears immediately before the suffix as the geographic information of the document data. For example, if the suffix "city" from the suffix lexicon is found in the document data and the word consistently appearing before it is "Beijing", then "Beijing" can be taken as the geographic information of the document data.
Two ways of extracting geographic information from document data are given here as embodiments. Those skilled in the art will know that other ways exist, and the scope of the invention should cover these different implementations of the step.
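The two extraction modes of S101 can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the lexicon contents, the function names `extract_by_lexicon` and `extract_by_suffix`, and the use of English text in which a capitalised word precedes the suffix are all assumptions made for the example.

```python
import re

# Hypothetical lexicons for illustration; a real system would load far larger ones.
PLACE_NAMES = ["Tsinghua University", "Summer Palace", "Beijing", "Sichuan"]
PLACE_SUFFIXES = ["Province", "City", "County", "District", "Road", "Street"]


def extract_by_lexicon(text, names=PLACE_NAMES):
    """Mode 1: return lexicon entries found in text, in order of appearance."""
    hits = []
    for name in names:
        for match in re.finditer(re.escape(name), text):
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


def extract_by_suffix(text, suffixes=PLACE_SUFFIXES):
    """Mode 2: return the capitalised word directly before a known suffix."""
    pattern = r"\b([A-Z][a-z]+) (?:" + "|".join(suffixes) + r")\b"
    return re.findall(pattern, text)


if __name__ == "__main__":
    print(extract_by_lexicon("Tsinghua University is in Beijing."))
    print(extract_by_suffix("Aid reached Wenchuan County from Beijing City."))
```

Mode 1 only finds names that are already in the lexicon, while Mode 2 can pick up names that are not, at the cost of depending on how the word before the suffix is delimited.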
In practice, several different pieces of geographic information may appear in the same document data, so extraction in either of the two modes above, or in some other way, may yield several different results. In general, however, the content of one piece of document data has a single central geographic information. For example, document data about the Sichuan earthquake yields the geographic information "Sichuan", but the news may also mention aid to Sichuan from other provinces and cities, so "Guangdong" and "Beijing" are extracted as well. Among these, Sichuan should be the central geographic information.
One way of determining the central geographic information among several extracted pieces is as follows: among the pieces of geographic information extracted from the same document data, the one that appears most often is taken as the central geographic information, that is, as the final geographic information of the document data. In the example above, if Sichuan appears 6 times in the document data, Beijing 2 times and Guangdong once, then Sichuan, with the most appearances, is determined to be the central geographic information and therefore the final geographic information of the document data.
Taking the Sichuan earthquake news again, the document data may mostly mention the cities, counties and autonomous areas within Sichuan that were hit, while also mentioning aid to the disaster area from provinces and cities such as Beijing and Guangdong.
In that case Sichuan, Beijing and Guangdong may each appear the same number of times, yet Sichuan should still be the central geographic information.
Another way of determining the central geographic information is therefore as follows: for the extracted pieces of geographic information, the appearances of the geographic information to which they belong are counted according to administrative-division membership; among the extracted geographic information and the counted parent geographic information, the one with the most appearances is taken as the central geographic information, that is, as the final geographic information of the document data.
For example, if the document data contains Sichuan once, Wenchuan once, Mianzhu once, Beichuan once, Beijing once and Guangdong once, then, because Wenchuan, Mianzhu and Beichuan all belong to the administrative division of Sichuan, they count as 3 appearances of Sichuan; adding the 1 direct appearance gives Sichuan a total of 4, while Beijing and Guangdong each have 1. Sichuan has the most appearances and is taken as the central geographic information, that is, the final geographic information of the document data.
The administrative-division membership can be obtained from a preset geographic database. Besides all the names in the place-name lexicon described above, this database holds the membership relations among all geographic information. For example, it contains the provincial-level division Sichuan; under Sichuan are all its prefecture-level divisions, under each city its county-level divisions, and so on, and likewise for the other provincial-level divisions.
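Both ways of choosing the central geographic information can be sketched in a few lines. The sketch below is an illustration under assumptions: the `PARENT` table is a hypothetical stand-in for the membership relations in the geographic database, it is assumed to form a tree with no cycles, and the function names are invented for the example.

```python
from collections import Counter

# Hypothetical child -> parent table standing in for the geographic database.
PARENT = {"Wenchuan": "Sichuan", "Mianzhu": "Sichuan", "Beichuan": "Sichuan"}


def central_by_frequency(mentions):
    """Return the most frequently extracted place, or None for no mentions."""
    if not mentions:
        return None
    return Counter(mentions).most_common(1)[0][0]


def central_with_rollup(mentions, parent=PARENT):
    """Credit each mention to its ancestors as well, then pick the top place."""
    counts = Counter(mentions)
    for name in mentions:
        ancestor = parent.get(name)
        while ancestor is not None:  # assumes the table has no cycles
            counts[ancestor] += 1
            ancestor = parent.get(ancestor)
    if not counts:
        return None, counts
    return counts.most_common(1)[0][0], counts


if __name__ == "__main__":
    simple = ["Sichuan"] * 6 + ["Beijing"] * 2 + ["Guangdong"]
    print(central_by_frequency(simple))
    mixed = ["Sichuan", "Wenchuan", "Mianzhu", "Beichuan", "Beijing", "Guangdong"]
    print(central_with_rollup(mixed))
```

With the second example from the text, the three county mentions are credited to Sichuan, which gives it 4 appearances against 1 each for Beijing and Guangdong.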
The geographic database can also include country-level geographic information, with each country containing its own states, provinces and other divisions, which is not elaborated here. With the preset geographic database, the appearances of parent geographic information can thus be counted for the extracted pieces according to administrative-division membership. Note that the geographic database described here can serve as the preset place-name lexicon of Mode 1.
S102: Look up, in the preset geographic database, the geographic attribute corresponding to the extracted geographic information, and tag the document data with that geographic attribute.
The preset geographic database here can be the same as the geographic database in S101. It stores place names. Examples are the names of provincial-level divisions (provinces, municipalities directly under the central government, autonomous regions, special administrative regions), prefecture-level divisions (prefecture-level cities, prefectures, autonomous prefectures, leagues), county-level divisions (municipal districts, county-level cities, counties, autonomous counties, banners, autonomous banners, special districts, forestry districts), township-level divisions (towns, townships, subdistricts, sumu) and village-level divisions (communities, residents' committees, villages). Membership relations of administrative division also exist between place names. Taking China as an example, it contains provincial-level divisions such as provinces, municipalities directly under the central government, special administrative regions and autonomous regions.
A provincial-level division contains prefecture-level divisions (prefecture-level cities, prefectures, autonomous prefectures, leagues); a prefecture-level division contains county-level divisions (municipal districts, county-level cities, counties, autonomous counties, banners, autonomous banners, special districts, forestry districts and others); a county-level division contains township-level divisions (towns, townships, subdistricts, sumu and others); and a township-level division contains village-level divisions (communities, residents' committees, villages). Figure 2 shows the structure of the preset geographic database; the place names and their membership relations can be organized as in Figure 2.
In particular, the preset geographic database can also include any word that denotes geographic information, such as university names, POI names, company names, specialty names, community names and scenic-spot names, because these words can also stand for geographic information. As before, the university Tsinghua University can stand for the Wudaokou area of Haidian District, Beijing; the point of interest Maojia Restaurant (Lanbao branch) can stand for Lanbao International Center on Xidawang Road, Beijing; the company Taobao can stand for No. 391 Wen'er Road, Hangzhou, Zhejiang Province; the specialty West Lake Longjing can stand for Xihu District, Hangzhou, Zhejiang Province; the scenic spot Summer Palace can stand for Haidian District, Beijing; the community Sunshine 100 can stand for Xidawang Road, Chaoyang District, Beijing; and so on. These words also have administrative-division membership relations with the geographic information in the preset geographic database.
After S101 extracts the geographic information from the document data, the corresponding geographic attribute can be looked up in the preset geographic database and tagged on the document data. For example, if the extracted geographic information is "Dawang Road", the geographic attribute "Beijing-Chaoyang District-Dawang Road" can be found in the preset geographic database, and the document data can be tagged with it, for example with the complete "Beijing-Chaoyang District-Dawang Road".
S103: Obtain the user's geographic attribute.
A user has a geographic attribute, for example the location from which the user's terminal accesses the Internet. This location can be indicated by the IP address with which the terminal accesses the Internet. For example, if the terminal's current IP address is 202.115.33.3, a lookup of the Internet Protocol (IP) address shows that it belongs to "Sichuan University Engineering Design Center", whose complete address is "Sichuan Province-Chengdu-Sichuan University Engineering Design"; this address can serve as the user's geographic attribute.
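Both the tagging in S102 and the IP lookup in S103 amount to table lookups, which the following sketch illustrates. Everything in it is assumed for the example: the `PARENT` excerpt, the `IP_REGIONS` table and its /24 range are invented, and a real deployment would rely on a full geographic database and an IP geolocation data set.

```python
import ipaddress

# Hypothetical excerpt of the geographic database: place -> parent place.
PARENT = {"Dawang Road": "Chaoyang District", "Chaoyang District": "Beijing"}

# Hypothetical IP-range table; the /24 range is invented for illustration.
IP_REGIONS = {
    ipaddress.ip_network("202.115.33.0/24"): ("Sichuan Province", "Chengdu"),
}


def geographic_attribute(place, parent=PARENT):
    """S102: build the full path for a place by following parent links."""
    path = [place]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return tuple(reversed(path))


def attribute_from_ip(ip, regions=IP_REGIONS):
    """S103: return the path of the first range containing ip, else None."""
    address = ipaddress.ip_address(ip)
    for network, path in regions.items():
        if address in network:
            return path
    return None


if __name__ == "__main__":
    print(geographic_attribute("Dawang Road"))
    print(attribute_from_ip("202.115.33.3"))
```

Storing the attribute as a tuple ordered from the highest level to the lowest keeps the complete path available for the level-by-level matching in S104.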
The user's geographic attribute can thus be obtained by looking up the IP address of the user's terminal.
The user's geographic attribute can also be an address the user has registered, such as a home, school or work address, obtained by looking up the registered address.
It can also be a location set by the user. For example, if the user has set Xiamen, S103 obtains this geographic attribute by looking up the location the user set.
It can also be derived from the user's latitude and longitude. For example, if the user has fixed the current latitude and longitude with a handheld GPS device, the user's current location can be obtained from that information.
It can also be derived from the user's browsing focus. For example, if within a certain period the user searches or queries the Internet for Jiuzhaigou, the user quite possibly intends to travel there in the near future, so the location searched or queried during that period gives the user's geographic attribute.
There are many ways to obtain the user's geographic attribute and only a few are listed above; those skilled in the art will know that the invention is not limited to them.
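As one illustration of the browsing-focus approach, the sketch below picks the place a user searched for most often within a recent window. The 30-day window, the minimum of two searches and the name `browsing_focus` are assumptions chosen for the example; the source does not specify how the period or the threshold is set.

```python
from collections import Counter
from datetime import datetime, timedelta


def browsing_focus(search_log, now, window=timedelta(days=30), min_hits=2):
    """Return the place searched most often within the window, or None.

    search_log is a list of (timestamp, place) pairs; window and min_hits
    are illustrative defaults, not values taken from the source.
    """
    recent = [
        place
        for when, place in search_log
        if timedelta(0) <= now - when <= window
    ]
    if not recent:
        return None
    place, hits = Counter(recent).most_common(1)[0]
    return place if hits >= min_hits else None


if __name__ == "__main__":
    now = datetime(2009, 5, 1)
    log = [
        (datetime(2009, 4, 20), "Jiuzhaigou"),
        (datetime(2009, 4, 25), "Jiuzhaigou"),
        (datetime(2008, 1, 1), "Xiamen"),
    ]
    print(browsing_focus(log, now))
```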
Whether the attribute comes from the user's current IP address, a registered location, a location set by the user, latitude and longitude, browsing focus or some other source, all such ways of obtaining the user's geographic attribute fall within the scope of the embodiments.
S104: Provide the user with the document data whose tagged geographic attribute matches the user's geographic attribute.
In this step, the geographic attribute tagged on the document data is first matched against the obtained user geographic attribute; if they match, the corresponding document data is provided to the user.
Note that the document data in S101 and S102 can be very numerous, for example, as with a search engine, all the document data on the Internet crawled on a given day or over several days. The services of existing websites are fully capable of collecting all document data on the Internet.
As mentioned above, the geographic attributes tagged on document data can span different administrative levels. For example, some document data is tagged with the complete geographic attribute "Beijing-Chaoyang District-Dawang Road", other document data with "Beijing-Chaoyang District-Jianguomen", and still other document data with "Beijing-Chaoyang District". If the user's location is Dawang Road, the document data tagged "Beijing-Chaoyang District-Dawang Road" can be provided to the user, and the document data tagged "Beijing-Chaoyang District-Jianguomen" is not. Document data tagged "Beijing-Chaoyang District" can also be provided to the user.
In this case, the document data marked “Beijing-Chaoyang District” may include the document data marked “Beijing-Chaoyang District-Dawang Road” and the document data marked “Beijing-Chaoyang District-Jianguomen”, and may of course also include other document data marked “Beijing-Chaoyang District”.

The document data at these different administrative division levels may be provided to the user hierarchically, that is, through level-by-level regional navigation as the user browses, or according to the user's selection, for example by providing the document data for an administrative division level that the user selects, such as the Beijing level or the Chaoyang District level.

As those skilled in the art and ordinary users will understand, providing document data to the user includes sending a web page containing the document data to the browser of the user terminal through a title link.

It should be noted that there is no strict order between S101 and S102 on the one hand and S103 on the other. S103 may be executed first and S101 and S102 afterwards; it suffices that the results of S102 and S103 are available before S104.

As can be seen from the foregoing method embodiment of the present invention, geographic information is extracted from document data; the corresponding geographic attribute is looked up in a preset geographic database according to the extracted geographic information and marked on the document data; the geographic attribute of the user is obtained; and the document data whose marked geographic attribute matches the user's geographic attribute is provided to the user. In this way, localized information suited to the user can be provided.
Moreover, in this method embodiment the document data is localized, which helps the user find the required information more quickly and accurately.

The method embodiment for providing localized information has a wide range of applications. For example, it can be applied to the following products and services:

1) News search

With the above method embodiment, news can be automatically classified by region to produce integrated local news, and relevant news content can be actively recommended to visiting users according to their regional attributes, truly “telling the stories of the people around them”. At the same time, the complete regional information can be used to offer users region-based news navigation.

2) Life service information

Life services are a current hotspot of the Chinese Internet. Compared with other information, life service information places greater emphasis on regionalization, accuracy, and timeliness. With the method embodiment of the present invention, life service information can be effectively integrated by region; when a user visits, the user's regional attribute can be recognized and combined with the regionalized data so that local life service information is actively pushed to the user. This makes the information easier for users to use and improves the efficiency and effectiveness of life service information. For example, life service information for a given area, such as dining information, discount information, housing rental information, and hourly work information, can be pushed in a targeted manner to the relevant users in that area, so that users can conveniently learn about the life services available there.
3) Community and Social Network Service (SNS)

Current community services include forums, blogs, photo albums, groups, and the like. At present, information is mostly integrated by topic, and users mostly find information by keyword. By using the above method embodiment of the present invention to localize the data, the data can be integrated by region and by regional level, and information can be effectively guided and recommended according to the user's regional attribute. Users can then easily see what nearby netizens are concerned about, and their circles of friends can be integrated by region to form a social network, enhancing users' online social interaction.

In summary, by regionalizing Internet information and identifying users' regional attributes, the large amount of data on the Internet can be effectively provided to users according to their geographic attributes. This improves the efficiency and effectiveness with which users obtain information and has broad application prospects in Internet services.

Another method embodiment of the present invention for providing localized information is described below. FIG. 3 is a flowchart of this embodiment. As shown in FIG. 3, the method embodiment may include:

S301: Extract the geographic information in the document data.

This step is similar to the foregoing S101 and may be implemented in either of two ways:

Method 1: Extract the geographic information in the document data according to a preset place-name lexicon, which stores place names and words that represent geographic information.

Method 2: Search the document data for place-name suffixes according to a preset place-name suffix lexicon, and take the word that appears directly before a place-name suffix as the geographic information of the document data.
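The two extraction methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the lexicon contents are hypothetical, the sample text is English and tokenized with a simple regular expression, and plain substring matching is used for Method 1. Chinese source text would first need word segmentation, which is not shown here.

```python
import re
from typing import Iterable, List

# Hypothetical place-name lexicon: place names and other words that
# represent geographic information (for example, a university name).
PLACE_NAME_LEXICON = ("Sichuan", "Beijing", "Guangdong", "Tsinghua University")

# Hypothetical place-name suffix lexicon (English stand-ins for suffixes
# such as province, city, county, district, road, street).
SUFFIX_LEXICON = frozenset(
    {"Province", "City", "County", "District", "Road", "Street"}
)


def extract_by_lexicon(text: str,
                       lexicon: Iterable[str] = PLACE_NAME_LEXICON) -> List[str]:
    """Method 1: return every lexicon entry found in the text, in text order.

    Repeated mentions are kept so that occurrences can be counted later.
    """
    hits = []
    for name in lexicon:
        for match in re.finditer(re.escape(name), text):
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


def extract_by_suffix(text: str,
                      suffixes: Iterable[str] = SUFFIX_LEXICON) -> List[str]:
    """Method 2: return the word directly before each place-name suffix."""
    suffix_set = set(suffixes)
    tokens = re.findall(r"[A-Za-z]+", text)
    return [tokens[i - 1] for i in range(1, len(tokens))
            if tokens[i] in suffix_set]
```

For example, `extract_by_suffix("Heavy rain fell on Beijing City today.")` yields `["Beijing"]` because "Beijing" directly precedes the suffix "City". Method 1 needs a maintained lexicon but recognizes names without a suffix; Method 2 needs no name list but only finds names that are written with a suffix.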
For details, refer to the two corresponding methods in S101; they are not repeated here.

In practice, several different pieces of geographic information may appear in the same document data. In that case, after geographic information is extracted by the above two methods or by other methods, several different pieces of geographic information may be extracted from the document data. In general, the content described in one piece of document data should have a single central piece of geographic information. Two ways of determining the central geographic information among the extracted pieces are as follows:

Method 1: Among the multiple pieces of geographic information extracted from the same document data, take the one that occurs most often as the final geographic information of the document data.

Method 2: For the multiple pieces of geographic information extracted, count, according to a preset geographic database and the administrative division affiliations it records, the occurrences attributable to each superordinate piece of geographic information; the geographic database stores place names, words that represent geographic information, and the affiliations between pieces of geographic information. Among the extracted geographic information and the superordinate geographic information so counted, take the one with the most occurrences as the final geographic information of the document data.

These two ways of determining the central geographic information are similar to the two corresponding ways in the foregoing S101 and are not described again here.

S302: Obtain the geographic attribute of the user.
The geographic attribute of the user may be obtained by querying the IP address of the user terminal; or by querying the address registered by the user; or by querying the geographic location specified by the user; or from the user's latitude and longitude information; or by collecting the user's Internet browsing focus.

This step is similar to the foregoing S103.

S303: Provide to the user the document data whose geographic information matches the user's geographic attribute.

In this step, the geographic information of the document data is matched directly against the obtained geographic attribute of the user; if they match, the corresponding document data is provided to the user.

The document data at different administrative division levels may be provided to the user hierarchically, that is, through level-by-level regional navigation as the user browses, or according to the user's selection, for example by providing the document data for an administrative division level that the user selects. In this way, the document data whose geographic information matches the user's geographic attribute can be provided to the user by administrative division level, or according to the user's selection.

As those skilled in the art and ordinary users will understand, providing document data to the user includes sending a web page containing the document data to the browser of the user terminal through a title link.

It should be noted that there is no strict order between S301 and S302. S302 may be executed first and S301 afterwards; it suffices that the results of S301 and S302 are available before S303.
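The matching step and the level-by-level provision can be sketched as follows, reusing the Beijing, Chaoyang District, and Dawang Road example from the first method embodiment. The representation of a geographic attribute as a tuple running from the highest to the lowest administrative level, the document identifiers, and the function names are all assumptions of this sketch, not part of the embodiment.

```python
from typing import Dict, List, Tuple

GeoPath = Tuple[str, ...]  # highest administrative level first

# Hypothetical documents with their geographic attributes.
DOCS: Dict[str, GeoPath] = {
    "doc1": ("Beijing", "Chaoyang District", "Dawang Road"),
    "doc2": ("Beijing", "Chaoyang District", "Jianguomen"),
    "doc3": ("Beijing", "Chaoyang District"),
    "doc4": ("Zhejiang", "Hangzhou"),
}


def match_for_user(docs: Dict[str, GeoPath], user_path: GeoPath,
                   include_higher_levels: bool = True) -> List[str]:
    """Documents whose attribute equals the user's attribute.

    With include_higher_levels=True, documents marked at a higher level
    on the user's own path (for example the district that contains the
    user's road) are returned as well.
    """
    matched = []
    for doc_id, path in docs.items():
        is_exact = path == user_path
        is_higher = include_higher_levels and user_path[:len(path)] == path
        if is_exact or is_higher:
            matched.append(doc_id)
    return matched


def docs_at_level(docs: Dict[str, GeoPath], level_path: GeoPath) -> List[str]:
    """Documents at or below an administrative level the user selected."""
    return [doc_id for doc_id, path in docs.items()
            if path[:len(level_path)] == level_path]
```

A user located at Dawang Road receives the Dawang Road document and, optionally, the district-level document, but not the Jianguomen document. Selecting the Chaoyang District level through `docs_at_level` returns all three Chaoyang documents, which corresponds to the hierarchical navigation described above.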
An embodiment of a system for providing localized information according to the present invention is described below. FIG. 4 is a block diagram of this system embodiment. As shown in FIG. 4, the system embodiment may include:

a document data geographic information extraction module 41, configured to extract the geographic information in document data;

a geographic database 42, configured to store place names, words that represent geographic information, and the affiliations between pieces of geographic information;

a marking module 43, configured to look up the corresponding geographic attribute in the geographic database according to the extracted geographic information, and to mark the document data with the geographic attribute found;

a user geographic attribute obtaining module 44, configured to obtain the geographic attribute of the user;

an output module 45, configured to provide to the user the document data whose marked geographic attribute matches the user's geographic attribute.

Preferably, on the basis of FIG. 4, the system may further include a place-name lexicon 51, which stores place names and words that represent geographic information; the document data geographic information extraction module 41 can then extract the geographic information in the document data according to the place-name lexicon 51. It should be noted that the place-name lexicon 51 may be located in the document data geographic information extraction module 41 or elsewhere in the system.

Preferably, on the basis of FIG. 4, the system may further include a place-name suffix lexicon 61; the document data geographic information extraction module 41 searches the document data for place-name suffixes according to the preset place-name suffix lexicon 61 and takes the word that appears directly before a place-name suffix as the geographic information of the document data. It should be noted that the place-name suffix lexicon 61 may be located in the document data geographic information extraction module 41 or elsewhere in the system.

On the basis of FIG. 4, FIG. 5, or FIG. 6, the system may further include a central geographic information determining module 71; the case in which only the central geographic information determining module 71 is added to FIG. 4 is shown in FIG. 7. The module 71 is configured to determine, among the multiple pieces of geographic information that the document data geographic information extraction module 41 extracts from the same document data, the one that occurs most often as the final geographic information of the document data.

On the basis of FIG. 4, FIG. 5, or FIG. 6, the system may further include a central geographic information determining module 81 and a preset geographic database 82; the case in which only the central geographic information determining module 81 and the preset geographic database 82 are added to FIG. 4 is shown in FIG. 8.

The central geographic information determining module 81 is configured to count, for the multiple pieces of geographic information that the document data geographic information extraction module 41 extracts from the same document data, and according to the preset geographic database 82 and the administrative division affiliations, the occurrences attributable to each superordinate piece of geographic information, and to take, among the extracted geographic information and the superordinate geographic information so counted, the one with the most occurrences as the final geographic information of the document data. The geographic database 82 stores place names, words that represent geographic information, and the affiliations between pieces of geographic information.

Preferably, in the system, the user geographic attribute obtaining module 44 may obtain the geographic attribute of the user by querying the IP address of the user terminal; or by querying the address registered by the user; or by querying the geographic location specified by the user; or from the user's latitude and longitude information; or by collecting the user's Internet browsing focus.

Another embodiment of the system for providing localized information according to the present invention is described below. FIG. 9 is a block diagram of this system embodiment. As shown in FIG. 9, the system embodiment may include:

a document data geographic information extraction module 91, configured to extract the geographic information in document data;

a user geographic attribute obtaining module 92, configured to obtain the geographic attribute of the user;

an output module 93, configured to provide to the user the document data that matches the user's geographic attribute.
Preferably, the system may further include a place-name lexicon 101, which stores place names and words that represent geographic information; the document data geographic information extraction module 91 extracts the geographic information in the document data according to the place-name lexicon 101. It should be noted that the place-name lexicon 101 may be located in the document data geographic information extraction module 91 or elsewhere in the system.

Preferably, as shown in FIG. 11, the system may further include a place-name suffix lexicon 111; the document data geographic information extraction module 91 searches the document data for place-name suffixes according to the preset place-name suffix lexicon 111 and takes the word that appears directly before a place-name suffix as the geographic information of the document data. It should be noted that the place-name suffix lexicon 111 may be located in the document data geographic information extraction module 91 or elsewhere in the system.

Preferably, on the basis of FIG. 9, FIG. 10, or FIG. 11, the system further includes a central geographic information determining module 121; the case in which only the central geographic information determining module 121 is added to FIG. 9 is shown in FIG. 12. The central geographic information determining module 121 is configured to determine, among the multiple pieces of geographic information that the document data geographic information extraction module 91 extracts from the same document data, the one that occurs most often as the final geographic information of the document data. It should be noted that the central geographic information determining module 121 may be located in the document data geographic information extraction module 91 or elsewhere in the system.

Preferably, on the basis of FIG. 9, FIG. 10, or FIG. 11, the system further includes a central geographic information determining module 131 and a preset geographic database 132; the case in which only the central geographic information determining module 131 and the preset geographic database 132 are added to FIG. 9 is shown in FIG. 13. The central geographic information determining module 131 is configured to count, for the multiple pieces of geographic information that the document data geographic information extraction module 91 extracts from the same document data, and according to the preset geographic database 132 and the administrative division affiliations, the occurrences attributable to each superordinate piece of geographic information, and to take, among the extracted geographic information and the superordinate geographic information so counted, the one with the most occurrences as the final geographic information of the document data. The geographic database 132 stores place names, words that represent geographic information, and the affiliations between pieces of geographic information.
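The two strategies for determining the central geographic information (plain frequency, and frequency with roll-up along administrative affiliations) can be sketched in Python as follows. The `PARENT` table is a hypothetical stand-in for the affiliations stored in the geographic database, and it stops at province level; rolling counts up to every ancestor and breaking ties by first appearance are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical affiliation table: place -> superordinate place.
# It is assumed to be a tree (no cycles) and to stop at province level.
PARENT: Dict[str, str] = {
    "Wenchuan": "Sichuan",
    "Mianzhu": "Sichuan",
    "Beichuan": "Sichuan",
}


def central_by_frequency(mentions: List[str]) -> Optional[str]:
    """Strategy 1: the piece of geographic information mentioned most often."""
    if not mentions:
        return None
    return Counter(mentions).most_common(1)[0][0]


def central_by_affiliation(mentions: List[str],
                           parent: Dict[str, str] = PARENT) -> Optional[str]:
    """Strategy 2: count each mention for itself and for every ancestor."""
    counts: Counter = Counter()
    for name in mentions:
        counts[name] += 1
        ancestor = parent.get(name)
        while ancestor is not None:
            counts[ancestor] += 1
            ancestor = parent.get(ancestor)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

With one mention each of Sichuan, Wenchuan, Mianzhu, Beichuan, Beijing, and Guangdong, the second strategy credits Sichuan with four occurrences and selects it, whereas plain frequency would see a six-way tie. If the affiliation table also contained a national level, that level would accumulate every count, so a practical system would probably cap the roll-up at a chosen level.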
Preferably, in the system, the user geographic attribute obtaining module 92 may obtain the geographic attribute of the user by querying the IP address of the user terminal; or by querying the address registered by the user; or by querying the geographic location specified by the user; or from the user's latitude and longitude information; or by collecting the user's Internet browsing focus.

Although the present invention has been described by way of embodiments, those of ordinary skill in the art will appreciate that the invention admits many variations and changes without departing from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the invention.

【Brief Description of the Drawings】

FIG. 1 is a flowchart of one embodiment of the method of the present invention;
FIG. 2 is a diagram of the organizational structure of the geographic database of the present invention;
FIG. 3 is a flowchart of another embodiment of the method of the present invention;
FIG. 4 is a block diagram of one embodiment of the system of the present invention;
FIG. 5 is a block diagram of one embodiment of the system of the present invention;
FIG. 6 is a block diagram of one embodiment of the system of the present invention;
FIG. 7 is a block diagram of one embodiment of the system of the present invention;
FIG. 8 is a block diagram of one embodiment of the system of the present invention;
FIG. 9 is a block diagram of one embodiment of the system of the present invention;
FIG. 10 is a block diagram of one embodiment of the system of the present invention;
FIG. 11 is a block diagram of one embodiment of the system of the present invention;
FIG. 12 is a block diagram of one embodiment of the system of the present invention;
FIG. 13 is a block diagram of one embodiment of the system of the present invention.

【Description of Main Component Symbols】

41: document data geographic information extraction module
42: geographic database
43: marking module
44: user geographic attribute obtaining module
45: output module
51: place-name lexicon
61: place-name suffix lexicon
71: central geographic information determining module
81: central geographic information determining module
82: geographic database
91: document data geographic information extraction module
92: user geographic attribute obtaining module
93: output module
101: place-name lexicon
111: place-name suffix lexicon
121: central geographic information determining module
131: central geographic information determining module
132: geographic database