TWI453608B - System and method for managing a large number of multiple data - Google Patents

System and method for managing a large number of multiple data

Info

Publication number
TWI453608B
Authority
TW
Taiwan
Prior art keywords
data
database
interest
point
module
Prior art date
Application number
TW101103471A
Other languages
Chinese (zh)
Other versions
TW201333722A (en
Original Assignee
Chunghwa Telecom Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chunghwa Telecom Co Ltd
Priority to TW101103471A
Publication of TW201333722A
Application granted
Publication of TWI453608B

Description

System and method for screening and managing large volumes of diverse data

The present invention relates to a system and method for screening and managing large volumes of diverse data. In particular, it crawls multiple data sources simultaneously and combines several software techniques (a point-of-interest search engine, bulk data classification, bulk data management, and bulk data comparison) with a bulk data extraction module that queries the database. It can satisfy customers' requests for large volumes of store data or geographic attractions and automatically reports the query results to the customer by e-mail or SMS. It is a query service system that substantially reduces manual labor and improves data accuracy.

Before the present invention, looking up popular online store data or geographic attractions still required manually searching popular websites to supply the store data or attractions a customer needed. Searching was slow, and once the store data was found, the store's address and telephone number had to be confirmed separately (including whether the store was still in business). No store data comparison and retrieval technique was in use, so accuracy was hard to raise: erroneous data could only be analyzed by hand, which left the accuracy problem unresolved. In addition, when a query program finished, the results could not be sent to the customer immediately; staff still had to organize the data before forwarding it.

The existing approach therefore leaves much room for improvement; it is not a sound design and needs to be improved.

A search of published Taiwanese patents found two patents similar to the present invention: "Local search service use and provision method and program product thereof, directory service platform and architecture" and "Data acquisition device, data acquisition system, and data acquisition method". The first mainly concerns a method of providing a local search service that is executed by a directory service platform. The platform stores data for a plurality of local search engines. On receiving a list download request from a terminal device, it selects at least one of the local search engine records to form a service list that is returned to the terminal device. The service list contains one option for each selected local search engine, so that selecting an option connects to the corresponding local search engine, thereby providing local search engine data automatically. The present invention, by contrast, does not focus on searching local service data. Instead, the user may enter a keyword to search the information published on major websites, and the proposed search method can read the information published on a large number of web pages in one pass, compare it against the existing database, and accurately identify data that the database does not contain; changed data can be sent to manual review. The inventive concept of that patent therefore differs greatly from that of the present invention.

The "Data acquisition device, data acquisition system, and data acquisition method" patent mainly acquires data through a communication unit working with GPS. The present invention acquires data only through the Point of Interest (POI) search engine module and computes latitude and longitude from the location, so no additional hardware is needed. Moreover, as the overall flow in Figure 20 of that patent shows, it judges whether a store's business hours are correct from whether the user stays at the point of interest for a sufficient length of time. The drawback is that the user discovers the store is closed only after arriving at the destination. The present invention instead compares the data stored in the database in advance against the data obtained from the web, item by item; any difference enters the manual review stage, which improves accuracy.

Besides using the bulk data comparison module and the Enterprise Application Integration (EAI) data comparison module to improve accuracy, the present invention combines the web-crawling POI search engine module with a bulk data classification module and a bulk data management module developed in-house. After web data is crawled, a chain of screening and management steps begins: entering the Uniform Resource Locators (URLs) of the websites to crawl, bulk data classification and management, data comparison, rapid manual review, data return and consolidation, customer application, bulk data extraction, billing, and delivery of the query results to the customer.

In view of the drawbacks of the approach described above, the inventors sought to improve on it. After years of dedicated research they decided to use current software technology to develop a diverse-data POI search and comparison system for telecommunication services, and finally completed the present "system and method for screening and managing large volumes of diverse data".

An object of the present invention is to provide a system and method for screening and managing large volumes of diverse data: a system that solves the problem that many diverse data sources produce duplicated data. It is a telecommunication POI service system that operates automatically, classifies and manages data automatically, analyzes the query applications submitted by customers, and can query a large volume of popular online store and geographic information in a short time. Using the bulk data extraction module, it retrieves store data from the database according to the customer's request and obtains accurate results quickly.

A second object of the present invention is to combine the bulk data comparison module with the EAI data comparison module and to provide a bulk data source manual review interface and a bulk data extraction manual review interface, which show how incoming data differs from the store data in the existing database. Once a reviewer saves the review result to the database, the old store data and geographic information in the database are updated automatically, which improves data accuracy.

The system and method that achieve the above objects crawl multiple data sources simultaneously and combine a web search engine, bulk data classification, bulk data comparison, and other software techniques with a bulk data extraction module that queries the database. They satisfy customers' requests for large volumes of store data or geographic attractions and automatically report the query results to the customer by e-mail or SMS, giving a query service system that substantially reduces manual labor and improves data accuracy. The system includes:

a. A POI search engine module, which runs scheduled jobs and uses the POI search engine system to fetch store data and file it into the index POI web database (Index POI WEB DB) or the index POI database (Index POI DB). The index POI web database holds store data awaiting review, and the index POI database holds store data awaiting classification.

b. A bulk data classification module, which accepts instructions from the POI search engine module. When new data arrives, it classifies the fetched store data or geographic information as basic store data (store name, telephone, address) or value-added store data (business hours, product prices, transportation, store profile, number of rooms, and so on) and plans the data in an integrated way. It can send reviewed and classified store data back to the POI host database, and it notifies the bulk data comparison module that new data has arrived so that module can start processing.

c. A bulk data comparison module, which accepts instructions from the bulk data classification module. It compares the store data crawled from the web and stored in the index POI web database and the index POI database, finds store data that the POI database lacks or that differs, and sends it to the bulk data source manual review interface for review.

d. A Unique Identification definition module, which accepts data passed from the bulk data source host database and writes the organized data to the database. It designates one specific tag field as the unique tag, in ASCII encoding, moves this new tag field to the POI host database for management, and notifies the bulk data management module.

e. A bulk data management module, which accepts instructions from the Unique Identification definition module. It reads and classifies data from the designated database and schedules timed jobs that transfer store data or geographic information to the POI search engine database in a standard format the system accepts.

f. An EAI data comparison module, which runs the data comparison generation job. It accepts instructions from the bulk data management module and generates the corresponding data from two sources, the POI search engine database and the POI host database. It sends the generated data to the bulk data extraction manual review interface for review and notifies the bulk data management module that processing is complete. This module is the core function of this patent.

g. A bulk data extraction module, which reads standardized store data from the POI host database, notifies the billing module, and sends an SMS telling the customer that processing of the application has begun.

h. A billing module, which calculates the fee, informs the customer, and sends the calculated result to the customer and the relevant staff.

The invention can handle multiple data sources at once and analyze store data or geographic information that arrives from web pages in various formats, so a large volume of data can be queried in a short time. The EAI data comparison module, developed in-house, compares the data against the existing store data in the bulk data source host database to obtain more accurate results. Together with the charging module, the system informs the customer that the submitted case is being processed and what fee the query will incur. Once the query result has been output in a standard format (for example XML or PDF), it is sent automatically to the e-mail address the customer provided. The system is designed on object-oriented principles. It uses the Chinese-language search engine FAST ESP to search the content of the web pages on each site, Microsoft VB.Net as the programming language, and MS SQL Server 2008 as the database together with the transformation and packaging features of MS SQL Server Integration Services. HTTP and TCP/IP serve as the communication medium between the system and the other systems it interfaces with.

Overall, the system combines a telecommunication service query system, the POI search engine system, the bulk data classification and management modules, the bulk data comparison module, the bulk data collection module, the charging module, and telecommunication value-added service functions into a single system. Its graphical and automated operation greatly eases the operation and maintenance management of telecommunication services.

To describe the structure and features of the present invention in detail, a preferred embodiment is given below with reference to the drawings:

Refer first to Figures 1 and 2, which show how an operator enters web page seeds in the system and method of the present invention. To add a seed to the bulk web crawling service, the operator must first enter a keyword in the keyword field of "Seed Collection" and save the URL using the designated function. The system then continuously collects the store data content of that URL.

The required fields of the basic store data are the store name, store address, and store contact telephone number. The main value-added fields (optional) are business hours, price range, MRT station (the MRT station near the store), product features (why people recommend it), traffic information, number of rooms (for lodging), area location (relative location), attraction type, service items, driving route, and similar data. The operator processes the basic and value-added store data, and once processed it is configured in POI search engine module 1.
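The field layout above can be sketched as a small record type. This is only an illustration: the class name `StoreRecord`, the field names, and the `missing_basic_fields` helper are hypothetical and do not come from the patent, which does not disclose a schema.

```python
from dataclasses import dataclass, field


@dataclass
class StoreRecord:
    # Required basic fields named in the text (identifiers are hypothetical).
    name: str
    address: str
    phone: str
    # Optional value-added fields, e.g. {"business_hours": "09:00-18:00"}.
    value_added: dict = field(default_factory=dict)

    def missing_basic_fields(self) -> list:
        """Return the names of required basic fields that are empty."""
        required = ("name", "address", "phone")
        return [f for f in required if not getattr(self, f).strip()]


record = StoreRecord(name="Cafe A", address="1 Main St", phone="")
print(record.missing_basic_fields())
```

Keeping the value-added data in a free-form mapping reflects that those fields are optional and open-ended in the description; a fixed set of columns would be an equally valid design.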

Refer to Figure 3, the system architecture diagram of the present invention. It includes the components related to the customer's query application (2, 21), the components associated with the service provider (3, 31, 32, 33), and the system components of the invention (1, 11-19). The system components of the invention include:

a. POI search engine module 1 runs scheduled jobs and uses the POI search engine system to crawl store data and file it into index database 31. The index POI web database holds store data awaiting review, and the index POI database holds store data awaiting classification. Its operation includes:
(a) generating an error message when the user makes an operating error or enters invalid parameters;
(b) raising an alarm when the network is disconnected or the database is down and unresponsive;
(c) searching a scope that covers the major websites and the websites in the seed list;
(d) providing a scheduled check function; if processing requires another functional module, it generates a processing message and issues an instruction to that module;
(e) importing database content (some fields of the basic business and store data) into the search engine, meaning that the basic POI data must reside in the search engine's index database 31, then scanning the websites in the Seed List for information about the stores in index database 31 and producing the related reports;
(f) displaying a specified message on screen, or updating a specific display, at the request of other functional modules;
(g) at the request of other functional modules, displaying their error messages on screen, printing them in reports, and recording them in the system event log database;
(h) letting the operator query, without limit, the system's event records, call records, and which Seed List entries are currently being processed, and generate reports at any time;
(i) letting the operator switch from fully automatic operation to a manual start, so that the system directly processes a given Uniform Resource Locator.

b. Bulk data classification module 11 receives new data from the designated database and notifies bulk data comparison module 12 to start the other modules that process the files. Its operation includes:
(a) when new data arrives, displaying an error message on screen if the specified host IP does not exist or the specified directory is invalid;
(b) displaying an error message on screen if receiving or updating data fails;
(c) when receiving or updating completes, checking whether the received data is complete and generating an error message if it is not;
(d) displaying messages received from other functional modules directly on screen;
(e) when reception from index database 31 completes, classifying the data by its characteristics and importing it into the relevant data tables of POI search engine database 32;
(f) receiving and updating the large volume of correct data reviewed through bulk data source manual review interface 13 and storing it directly in the corresponding table of POI search engine database 32;
(g) using SQL Server Integration Services for daily scheduled jobs that receive data from and output data to POI host database 33.

c. Bulk data comparison module 12 compares the store data crawled from the web and stored in index database 31. Its operation includes:
(a) running the database comparison on the store data in index database 31; it only needs to find records whose basic data is inconsistent between the index POI web database and the index POI database, and it marks them as awaiting review;
(b) making the differing records, and store data that index database 31 does not contain, viewable on the review screen and sending them to bulk data source manual review interface 13 for review;
(c) supporting multiple simultaneous users in bulk data source manual review interface 13 while preventing a record from being locked by competing users so that it cannot be reviewed; after a user enters the review interface, the login ID of the user who presses the save button and the save time are recorded and written to the POI temporary table;
(d) displaying differing data in red on the review screen and making every field editable; if the two sides differ, the edited POI web database information prevails and is written back to index database 31 (including the value-added content fields).
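The basic-data comparison in (a) and the highlighting in (d) can be illustrated with a short sketch. It assumes that records are plain dictionaries and that the basic fields are called `name`, `address`, and `phone`; these names and both functions are hypothetical, and the patent does not specify how values are normalized before comparison (only surrounding whitespace is ignored here).

```python
BASIC_FIELDS = ("name", "address", "phone")  # hypothetical field names


def diff_basic_fields(web_record: dict, poi_record: dict) -> list:
    """Return the basic fields whose values differ between the two records."""
    return [
        f for f in BASIC_FIELDS
        if web_record.get(f, "").strip() != poi_record.get(f, "").strip()
    ]


def needs_review(web_record: dict, poi_record: dict) -> bool:
    """A record is queued for manual review when any basic field differs."""
    return bool(diff_basic_fields(web_record, poi_record))


web = {"name": "Cafe A", "address": "2 Main St", "phone": "02-1234"}
poi = {"name": "Cafe A", "address": "1 Main St", "phone": "02-1234"}
print(diff_basic_fields(web, poi))  # fields a review screen would highlight
```

The list returned by `diff_basic_fields` corresponds to the fields that the review screen would show in red.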

d. Unique tag definition module 14 writes the defined data to POI host database 33. Its operation includes:
(a) accepting the data passed from bulk data source host database 3, writing the organized data to POI host database 33, and designating one specific tag field as the unique tag, in a format similar to ASCII encoding; this new tag field is stored in and managed by POI host database 33;
(b) logging the offending data if a data format error or any other exception occurs during writing.

e. Bulk data management module 15 manages the data retrieved into POI host database 33 and determines, from the characteristics of the incoming data, which category of data table in POI host database 33 it belongs to. Its operation includes:
(a) managing all POI data in POI host database 33 by the specific tag field;
(b) after receiving data from bulk data source host database 3, deciding from the data's characteristics which category of data table in POI host database 33 stores the record;
(c) receiving and updating the large volume of correct data reviewed through bulk data extraction manual review interface 17 and storing it directly in the corresponding table of POI host database 33;
(d) raising an alarm when the network is disconnected or the database is down and unresponsive;
(e) using SQL Server Integration Services for daily scheduled jobs that receive data from and output data to POI search engine database 32.

f. EAI data comparison module 16 runs the data comparison generation job. Its data sources are POI search engine database 32 and POI host database 33. Its operation includes:
(a) sending the generated data to bulk data extraction manual review interface 17 for review;
(b) checking, for the data returned from bulk data source manual review interface 13, whether the store telephone number exists; if it does not, the record is deleted, and if it does, processing continues with the next step;
(c) reading the POI database table and checking whether the number exists in it; if it does not, the record is marked as new data awaiting review; if it does, the record is compared with the table's content and, if they differ, is marked as differing data awaiting review;
(d) querying, by the given query conditions, the POI master file and account-consolidated file data and the pending or reviewed FAST UGC source data.
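Steps (b) and (c) amount to a three-way triage keyed on the telephone number, which the following sketch illustrates. Several points are assumptions rather than statements from the patent: "the telephone number exists" is read as "the record carries a non-empty number", the POI table is modeled as a dictionary keyed by number, and the `UNCHANGED` outcome for identical records is added for completeness. All identifiers are hypothetical.

```python
from enum import Enum


class ReviewStatus(Enum):
    DISCARDED = "no telephone number, record deleted"
    NEW_PENDING = "new data awaiting review"
    DIFF_PENDING = "differing data awaiting review"
    UNCHANGED = "identical to existing record"  # assumption, not in the text


def triage(record: dict, poi_table: dict) -> ReviewStatus:
    """Classify one reviewed record against the POI table (keyed by phone)."""
    phone = (record.get("phone") or "").strip()
    if not phone:
        # Step (b): a record without a telephone number is dropped.
        return ReviewStatus.DISCARDED
    existing = poi_table.get(phone)
    if existing is None:
        # Step (c): number not yet in the POI table.
        return ReviewStatus.NEW_PENDING
    if existing != record:
        # Step (c): number known but content differs.
        return ReviewStatus.DIFF_PENDING
    return ReviewStatus.UNCHANGED


poi_table = {"02-1234": {"phone": "02-1234", "name": "Cafe A"}}
print(triage({"phone": "02-9999", "name": "Cafe B"}, poi_table).name)
```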

g. Bulk data extraction module 18 reads standardized store data from POI host database 33 and notifies billing module 19. Its operation includes:
(a) extracting the requested data from POI host database 33 according to application document 21 submitted by the customer and outputting the query result in a standard format (for example XML, PDF, or XLS);
(b) automatically sending an SMS to notify customer 2 that processing of the application has begun;
(c) automatically notifying billing module 19 when processing is finished.

h. Billing module 19 calculates the fee and informs customer 2. Its operation includes:
(a) pricing by the number of records queried under the case number;
(b) pricing large-volume queries separately;
(c) automatically retrieving the billing account number of customer 2 and sending it to the accounting system;
(d) automatically sending a billing SMS to notify customer 2.
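Rules (a) and (b) can be illustrated with a minimal fee calculation. The patent gives no rates or thresholds, so `UNIT_PRICE`, `BULK_THRESHOLD`, and `BULK_UNIT_PRICE` below are placeholder values chosen only for the example, and treating "priced separately" as a different per-record rate is also an assumption.

```python
from decimal import Decimal

# Placeholder values; the patent does not state any prices or thresholds.
UNIT_PRICE = Decimal("1.00")
BULK_THRESHOLD = 1000
BULK_UNIT_PRICE = Decimal("0.50")


def query_fee(record_count: int) -> Decimal:
    """Price a query by the number of records returned for the case."""
    if record_count < 0:
        raise ValueError("record_count must not be negative")
    # Large-volume queries use a separate rate (assumed form of rule (b)).
    unit = BULK_UNIT_PRICE if record_count >= BULK_THRESHOLD else UNIT_PRICE
    return unit * record_count


print(query_fee(10))
print(query_fee(2000))
```

`Decimal` is used instead of floating point so that currency amounts are not subject to binary rounding.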

The steps by which the present invention carries out the system and method for screening and managing large volumes of diverse data are:

a.操作者從興趣點搜尋引擎模組1 GUI介面,輸入欲搜尋網站之關鍵字或全球資源定址器(Uniform Resource Locator) (格式如圖一所介紹),並且將資料按照規定格式操作完畢。a. The operator searches the engine module 1 GUI interface from the point of interest, and inputs the keyword or global resource locator of the website to be searched (Uniform Resource Locator) (The format is as shown in Figure 1), and the data is processed in the prescribed format.

b.操作完GUI介面之後,可至興趣點搜尋引擎模組1另一個GUI介面,可編輯、刪除各個全球資源定址器(Uniform Resource Locator)之內容(格式如圖二所介紹)。b. After the GUI interface is operated, another GUI interface of the search engine module 1 can be accessed to the point of interest, and the contents of each global resource locator (the format of the Uniform Resource Locator) can be edited and deleted (the format is as shown in FIG. 2).

c.係將執行定時排程作業,利用興趣點搜尋引擎模組1去抓取店家資料,歸類至索引資料庫31,並把索引興趣點網頁資料庫做為待審核店家資料,而索引興趣點資料庫做為待分類店家資料,並把資料送至大量資料分類模組11。c. The department will perform scheduled scheduling operations, use the point of interest search engine module 1 to capture the store data, classify it into the index database 31, and use the index point of interest web database as the pending store information, and the index interest The point database is used as the store information to be classified, and the data is sent to the mass data classification module 11.

d.大量資料分類模組11將會根據資料特性,主動分類屬於店家基本資料或者是店家加值資料,對資料作整合性規劃,並傳送及接收資料至興趣點主機資料庫33。d. The mass data classification module 11 will actively classify the basic data belonging to the store or the value-added data of the store according to the characteristics of the data, integrate the data, and transmit and receive the data to the host database 33 of the point of interest.

e. The mass data comparison module 12 identifies "basic data" that is inconsistent between the POI web page database and the POI database: (a) first, it compares the full telephone number field FullTel and checks, in the POI database table, whether the POI_ENABLE field for that telephone number is 0, meaning in use. A telephone number may have many history records; if any one of them is "in use", the checks below continue; if no record is in use, the remaining checks are skipped and the data is discarded, because the line has been disconnected and the web content should not be trusted; (b) next, it compares the installation address field device_address, down to the street number and floor; if the addresses are the same, it goes on to compare the store name; (c) it then compares the store name field POI_name, using the similarity of the names to judge whether they refer to the same store; (d) if the information retrieved from the web has the same telephone number, installation address, and store name and carries no value-added data, the record is discarded; if all three match and value-added data is present, the data is imported into the pending-review data area; (e) if the telephone number and installation address match but the store name differs, the data is imported directly into the pending-review data area, after the store name retrieved from the POI web page database has been filtered to remove unrelated content such as e-mail addresses or promotional information; (f) if the telephone number and store name match but the installation address differs, the POI database data prevails in principle; the data is imported into the pending-review data area only if value-added data is present, and is otherwise discarded; (g) if the telephone number does not yet exist in the POI database, the data is imported into the pending-review data area.

f. The pending-review data produced by the mass data comparison module 12 is reviewed through the mass data source manual review interface 13 and, once reviewed, is returned to the POI search engine database 32.

g. The unique tag definition module 14 adds a specific tag field to each record it receives and then passes the data to the mass data management module 15 for management.

h. The mass data management module 15 automatically classifies the relevant data according to its characteristics, organizes the data in an integrated manner, and sends data to and receives data from the POI search engine database 32.

i. The EAI data comparison module 16 checks whether the store telephone number is present in the data returned by the mass data source manual review interface 13, and sends the generated data directly to the mass data extraction manual review interface 17 for review.

j. The mass data extraction manual review interface 17 returns the reviewed data to the POI host database 33.

k. Customer 2 completes the application document 21 in the prescribed format, and processing of the application begins.

l. The completed application document 21 is sent to the mass data extraction module 18, which extracts the requested data from the POI host database 33 and passes it to the billing module 19.

m. When the billing module 19 finishes processing, it produces a report and a billing text message, and sends the report to customer 2 by e-mail.
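The comparison rules of step e above (the mass data comparison module 12) can be sketched as a single decision function. This is a hedged illustration under stated assumptions, not the patented implementation. The field names `FullTel`, `device_address`, and `POI_name` and the meaning of `POI_ENABLE == 0` come from the description; the record classes, the use of `difflib.SequenceMatcher` for name similarity, the 0.8 threshold, and the handling of the case where both address and name differ (sent to review here) are assumptions, since the description does not specify them. The name filtering mentioned in rule (e) is omitted.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List

NAME_THRESHOLD = 0.8  # hypothetical similarity cut-off, not given in the source


@dataclass
class PoiRecord:
    """One history row in the POI database (hypothetical structure)."""
    full_tel: str
    device_address: str
    poi_name: str
    poi_enable: int = 0  # 0 means "in use", per the description


@dataclass
class WebRecord:
    """One store record crawled from the web (hypothetical structure)."""
    full_tel: str
    device_address: str
    poi_name: str
    has_value_added: bool = False


def same_store_name(a: str, b: str) -> bool:
    """Rule (c): judge store names by similarity (assumed metric)."""
    return SequenceMatcher(None, a, b).ratio() >= NAME_THRESHOLD


def decide(web: WebRecord, poi_history: List[PoiRecord]) -> str:
    """Return 'review' or 'discard' for a crawled record."""
    matches = [r for r in poi_history if r.full_tel == web.full_tel]
    if not matches:
        return "review"  # rule (g): number not yet in the POI database
    active = [r for r in matches if r.poi_enable == 0]
    if not active:
        return "discard"  # rule (a): line disconnected, web content not trusted
    current = active[0]
    address_same = current.device_address == web.device_address  # rule (b)
    name_same = same_store_name(current.poi_name, web.poi_name)  # rule (c)
    if address_same and name_same:
        return "review" if web.has_value_added else "discard"  # rule (d)
    if address_same:
        return "review"  # rule (e): same address, different name
    if name_same:
        return "review" if web.has_value_added else "discard"  # rule (f)
    return "review"  # both differ: not covered by the source (assumption)
```

Records that come back as `review` correspond to the pending-review data that step f passes to the mass data source manual review interface 13.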

Please refer to FIG. 4, the data flow sequence diagram of the system and method of the present invention for screening and managing large volumes of diverse data. It shows the order of calls between the modules, arranged by call time from top to bottom and from left to right.

Across the top of the figure are the modules that make up the system, together with the customer. The arrows are of three kinds: solid-line arrows with a filled black triangular head, dashed-line arrows with a triangular head, and solid-line arrows with a black triangular head. A solid-line arrow with a filled black triangular head denotes calling a module or starting a module to execute a command; a dashed-line arrow with a triangular head denotes a response message returned to the original caller; a solid-line arrow with a black triangular head denotes passing data to the receiver.

The following modules appear across the top of the figure:

1. POI search engine module 1.

2. Mass data classification module 11.

3. Mass data comparison module 12.

4. Unique tag definition module 14.

5. Mass data management module 15.

6. EAI data comparison module 16.

7. Mass data extraction module 18.

8. Billing module 19.

In addition to these modules, there are the roles of the operator, the databases, and the customer. The sequence starts at the operator on the far left, who enters the Uniform Resource Locator (URL) to be queried. The POI search engine module 1 then calls the mass data classification module 11, which returns a message to the POI search engine module 1 asking it to download the data to a designated directory and hand the data over to the mass data classification module 11. On receiving the data, the mass data classification module 11 first verifies that the data format is correct and rejects any data with errors. If the data has no major errors, then after classification the data is placed in the designated directory for the mass data comparison module 12 to continue processing.

Once the data has passed through the mass data comparison module 12, the mass data source manual review interface 13 is started, and the searched, classified, and reviewed data is written to the POI search engine database 32 together with additional supplementary information. When all the data has been stored in the POI search engine database 32, the mass data classification module 11 sends a message to the mass data management module 15 to report that the review job is complete and the scheduled job can begin. The mass data source host database 3 calls the unique tag definition module 14, which returns a message to the mass data source host database 3 to generate the specific tag field, and the mass data source host database 3 sends the new data to the mass data management module 15. The mass data management module 15 receives the data from the POI search engine database 32 and the mass data source host database 3, classifies it, and sends it to the EAI data comparison module 16. Once the data has passed through the EAI data comparison module 16, the mass data extraction manual review interface 17 is started, and the integrated, classified, and reviewed data is written to the POI host database 33 together with additional supplementary information. The mass data management module 15 then sends a message to the mass data classification module 11 to report that the review job is complete and the scheduled job can begin.

When customer 2 submits the application document 21, the mass data extraction module 18 calls the mass data management module 15, which returns a message to the mass data extraction module 18. The mass data extraction module 18 sends a text message notifying customer 2 that processing of the application 21 has begun. When processing is complete, the result is sent directly to the billing module 19, which prices the query according to the number of records queried under the case number and automatically sends a billing text message to customer 2. After the message has been sent, a success message is returned to the billing module 19, indicating that the application document 21 can be closed.
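The unique tag definition module 14 mentioned in the sequence above adds a tag field to each record, and the claims state only that the tag is ASCII-encoded. The following is a minimal sketch of one way to do that, assuming records are dictionaries. The field name `tag`, the function name, and the choice of a random UUID as the tag value are all assumptions made for illustration; the source does not say how the tag is generated.

```python
import uuid


def add_unique_tag(record: dict, field: str = "tag") -> dict:
    """Return a copy of the record with an ASCII-only unique tag added.

    The tag is the 32-character hexadecimal form of a random UUID
    (an assumption; the source only requires an ASCII-encoded tag).
    """
    tagged = dict(record)  # copy, so the caller's record is left unchanged
    tagged[field] = uuid.uuid4().hex
    return tagged


record = {"FullTel": "0212345678", "POI_name": "Good Cafe"}
tagged = add_unique_tag(record)
print(tagged["tag"].isascii(), len(tagged["tag"]))
```

A random UUID avoids coordinating a counter across data sources; a system that needs sortable or human-readable tags would likely choose a different scheme.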

Compared with conventional techniques, the system and method provided by the present invention for screening and managing large volumes of diverse data have the following advantages:

1. A device number can be entered to look up basic data through the EAI data comparison module, and the same module can compute latitude and longitude from an address.

2. Combined with high-performance full-text search, the system can quickly and accurately identify, in a short time, data that the mass data source host database lacks; it is hundreds of times more efficient than manual lookup.

3. The mass data extraction module can automatically generate various types of reports from the query results and automatically send a text message to notify the customer that processing of the application has begun.

4. The billing module sends the result report to the customer and the relevant personnel, and sends a billing text message.

5. The POI search engine module provides a function that general query systems lack: relevant store data can be looked up from the keywords entered.

6. The present invention reduces the workload of the many staff otherwise needed for lookups and reduces errors caused by human factors.

The foregoing detailed description is a specific description of one feasible embodiment of the present invention. The embodiment is not intended to limit the patent scope of the present invention; any equivalent implementation or modification that does not depart from the technical spirit of the present invention shall fall within the patent scope of this application.

In summary, this application is not only innovative in its technical concept but also provides the several effects described above, which conventional methods cannot achieve. It fully satisfies the statutory requirements of novelty and inventive step for an invention patent. The application is therefore filed in accordance with the law, and the Office is respectfully requested to grant this invention patent application so as to encourage invention.

1‧‧‧POI search engine module

11‧‧‧Mass data classification module

12‧‧‧Mass data comparison module

13‧‧‧Mass data source manual review interface

14‧‧‧Unique tag definition module

15‧‧‧Mass data management module

16‧‧‧EAI data comparison module

17‧‧‧Mass data extraction manual review interface

18‧‧‧Mass data extraction module

19‧‧‧Billing module

2‧‧‧Customer

21‧‧‧Application document

3‧‧‧Mass data source host database

31‧‧‧Index database

32‧‧‧POI search engine database

33‧‧‧POI host database

FIG. 1 shows how an operator enters web page seeds in the system and method of the present invention for screening and managing large volumes of diverse data; FIG. 2 shows how an operator manages web page seeds in the system and method; FIG. 3 is the system architecture diagram of the system and method; and FIG. 4 is the data flow sequence diagram of the system and method.


Claims (11)

一種大量多元資料篩選管理系統,係結合資料庫全文檢索、資料檔案格式轉換、通訊、系統狀態偵測技術,以達成全自動化的查詢服務系統,其包括:a.興趣點搜尋引擎模組,其係用以執行定時排程作業,利用一興趣點搜尋引擎系統去抓取店家資料,歸類至索引興趣點網頁資料庫或索引興趣點資料庫,並把該索引興趣點網頁資料庫做為待審核店家資料,而把該索引興趣點資料庫做為待分類店家資料;b.大量資料分類模組,其係用以接受該興趣點搜尋引擎模組傳遞過來的指令,根據抓取到的店家資料內容或地理資訊,若有新資料傳入的話,將主動分類屬於店家基本資料或店家加值資料,對資料作整合性規劃,並具有把審核過和已分類的該店家資料,回傳至興趣點主機資料庫之能力,並通知該大量資料比對模組有新的資料進來,以啟動該大量資料比對模組來處理資料;c.大量資料比對模組,其係用以接受該大量資料分類模組傳遞過來的指令,將執行網路上爬取到的該店家資料存進該索引興趣點網頁資料庫和該索引興趣點資料庫進行比對作業,進而尋找到該索引興趣點資料庫未含有的店家資料或差異性的店家資料,並且將尋找到的該店家資料逕行發送給大量資料來源 人工審核介面做審核動作;d.唯一標籤定義模組,其係用以接受大量資料來源主機資料庫所傳遞過來的資料,並且將整理過的資料寫入至該興趣點主機資料庫,以指定的某特定標籤欄位為唯一標籤,格式為ASCII編碼方式,將此新特定標籤欄位移放到該興趣點主機資料庫進行管理,並通知大量資料管理模組;e.大量資料管理模組,其係用以接受唯一標籤定義模組傳遞過來的指令,係到指定的資料庫下讀取資料並分類,以及規劃定時排程將該店家資料或地理資訊傳送至一興趣點搜尋引擎資料庫,且為系統可接受的標準格式;f.企業應用系統整合資料比對模組,主要係執行資料比對產生作業,會接受大量管理模組傳遞的指令產生相對應的資料,其資料來源是該興趣點搜尋引擎資料庫和該興趣點主機資料庫,並且會將產生的資料逕行發送給大量資料抽取人工審核介面做審核動作,並且通知該大量資料管理模組資料處理作業已經完成;g.大量資料抽取模組,係將標準化過的店家資料從該興趣點主機資料庫讀取,並通知計費模組,及發簡訊通知客戶已經開始在處理此份申請案件;以及,h.計費模組,係執行費用計算以及告知客戶,並且將計算完的結果逕行發送給客戶以及相關處理人員; 其中,該大量多元資料篩選管理的系統係用以同時處理多方資料來源、分析網頁所傳來的的店家資料或地理資訊,在短時間內查詢大量的資料,利用該企業應用系統整合資料比對模組,與該大量管理模組中的原有店家資料比對,以得到精確的結果,並且結合該計費模組,告知客戶送來的案件已經在處理中,以及本次查詢將會收取之費用,待查詢結果輸出標準格式之後,自動按照客戶所留下的E-MAIL位址,自動回寄給客戶。 A large-scale multi-dimensional data screening management system, which combines database full-text search, data file format conversion, communication, and system state detection technology to achieve a fully automated query service system, which includes: a. a point of interest search engine module, It is used to perform scheduled scheduling operations, using a point of interest search engine system to capture store information, categorizing it into an index point of interest web database or an index point of interest database, and treating the index point of interest web database as a Review the store information, and use the index point of interest database as the store information to be classified; b. 
a large number of data classification module, which is used to accept the instructions transmitted by the point of interest search engine module, according to the captured store Data content or geographic information, if new information is introduced, it will actively classify the basic information of the store or the value-added information of the store, and make an integrated plan for the data, and have the information of the store that has been reviewed and classified, and transmitted back to Interested in the ability of the host database, and informed the large amount of data to have new data in the module to start the large amount of data comparison mode a group to process data; c. a large number of data matching modules, which are used to accept instructions transmitted by the mass data classification module, and store the store data crawled on the network into the index point of interest webpage database Comparing with the index point of interest database, searching for the store information or the differential store data not included in the index point of interest database, and sending the found data path to the large number of sources The manual review interface performs the audit action; d. The unique tag definition module is used to accept the data transmitted by the host source database of a large number of sources, and write the collated data to the host database of the point of interest to specify A specific label field is a unique label, and the format is ASCII encoding. The new specific label column is placed in the host database of the point of interest for management, and a large number of data management modules are notified; e. a large number of data management modules, It is used to accept the instructions passed by the unique tag definition module, read the data and classify it under the specified database, and plan the scheduled schedule to transfer the store data or geographic information to a point of interest search engine database. 
And the standard format acceptable to the system; f. Enterprise application system integration data comparison module, mainly to perform data comparison to generate operations, will accept a large number of management module to transmit instructions to generate corresponding data, the source of which is The point of interest searches the engine database and the point of interest host database, and sends the generated data to a large number of data extraction labor The core interface performs the auditing action, and notifies the data management module that the data processing operation has been completed; g. The mass data extraction module reads the standardized store data from the host database of the point of interest and notifies the charging module. The group, and send a newsletter to inform the customer that the application has been processed; and, h. the billing module, is to calculate the execution fee and inform the customer, and send the calculated result to the customer and related processing personnel; Among them, the system of multi-variable data screening management is used to simultaneously process multi-party data sources, analyze store information or geographic information transmitted from web pages, query a large amount of data in a short time, and use the enterprise application system to integrate data comparison. The module is compared with the original store information in the large number of management modules to obtain accurate results, and combined with the billing module, the customer is informed that the sent case is already being processed, and the query will be charged. The fee is automatically returned to the customer according to the E-MAIL address left by the customer after the standard format of the query result is output. 
一種大量多元資料篩選管理方法,其包括以下步驟:a.一興趣點搜尋引擎模組檢查是否有新的全球資源定址器,其係執行定時排程作業,利用一興趣點搜尋引擎系統去抓取店家資料,歸類至索引興趣點網頁資料庫或索引興趣點資料庫,並把該索引興趣點網頁資料庫做為待審核店家資料,而該索引興趣點資料庫則做為待分類店家資料;b.將新的或異動之該店家資料下載到指定的目錄下存放,並啟動一大量資料分類模組,根據抓取到的該店家資料,分類屬於店家基本資料或者是店家加值資料,對資料作整合性規劃,且將已審核過和已分類的店家資料,回傳至一興趣點主機資料庫;c.該大量資料分類模組將該店家資料的資料內容分類之檔案送至一大量資料比對模組審查,係執行網路上欲爬取之客戶要求特定業別店家資料和索引興趣 點資料庫進行比對作業,進而尋找到興趣點資料庫未含有的店家資料或差異性的店家資料,並且將尋找到的店家資料逕行發送給該大量資料來源人工審核介面做審核動作,且利用大量資料來源人工審核介面將資料回傳至一興趣點搜尋引擎資料庫,並接收和回傳資料至該興趣點主機資料庫;d.一唯一識別定義模組確認每一筆資料加上特定識別欄位,係指定的某特定識別欄位為唯一標籤,格式為ASCII編碼方式,將此新特定識別欄位存入該興趣點主機資料庫之中,並啟動一大量資料管理模組來處理接收的資料;e.該大量資料管理模組接收和回傳資料至該興趣點搜尋引擎資料庫,係到指定的資料庫下讀取資料並分類,以及規劃定時排程將店家資料或地理資訊傳送至該興趣點搜尋引擎資料庫,且為系統可接受的標準格式,並可把資料再送至一企業應用系統整合資料比對模組審查,利用一大量資料抽取人工審核介面將資料回傳至該興趣點主機資料庫;f.一大量資料抽取模組係將標準化過的店家資料依據客戶需求抽取該興趣點主機資料庫的資料,並告知客戶其所提出的查詢案件已經開始受理,且已經在處理中;g.該大量資料抽取模組結束之後,即通知計費模組;以及 h.該計費模組統計完費用之後,將處理結果以及費用相關訊息通知客戶以及相關人員,並且將報表寄送給客戶。 A multi-variable data screening management method includes the following steps: a. A point of interest search engine module checks whether there is a new global resource addresser, which performs a scheduled job, and uses a point of interest search engine system to crawl The store information is classified into an index point of interest webpage database or an index point of interest database, and the index point of interest webpage database is used as the store information to be audited, and the index point of interest database is used as the store information to be classified; b. Download the new or changed store information to a designated directory, and start a large data classification module. According to the captured store information, the classified basic information or the store value added data is The data is used for integrated planning, and the reviewed and classified store information is returned to a point of interest host database; c. 
The large data classification module sends the file of the store data classification to a large amount Data comparison module review, which is performed by customers who want to crawl on the Internet to request specific industry store information and index interest. Point database to compare operations, and then find the store information or differential store data not included in the database of interest points, and send the found store data path to the large data source manual review interface for audit action, and use A large number of sources manually review the data back to a point of interest search engine database, and receive and return data to the point of interest host database; d. A unique identification definition module to confirm each piece of data plus a specific identification field Bit, a specific identification field specified by the system is a unique label, the format is ASCII encoding, the new specific identification field is stored in the host database of the point of interest, and a large data management module is started to process the received data. Information; e. The mass data management module receives and returns data to the point of interest search engine database, reads and classifies the data under the designated database, and plans a scheduled schedule to transmit the store data or geographic information to The point of interest search engine database is a standard format acceptable to the system and can be sent to an enterprise application system integration. Material comparison module review, using a large amount of data extraction manual review interface to transfer data back to the point of interest host database; f. A large number of data extraction module is to standardize the store data according to customer needs to extract the point of interest host The data of the database, and inform the customer that the inquiry case has been accepted and is already being processed; g. 
after the large number of data extraction modules are finished, the charging module is notified; h. After the billing module has calculated the cost, the customer and related personnel will be notified of the processing result and the cost related information, and the report will be sent to the customer. 如申請專利範圍第1項所述之大量多元資料篩選管理系統,其中該興趣點搜尋引擎模組之特徵包括:a.在使用者操作錯誤或輸入參數無效時,即產生錯誤訊息;b.遇到網路斷線,或資料庫當機沒有回應時,產生警告訊息;c.檢索範圍含括各大網站、以及種子列表內所涵蓋的網站內容;d.提供定時排程檢查功能,若作業處理中需要其他功能模組配合進行,則產生處理訊息,並下達指令,送交相關功能模組;e.可將資料庫內容(工商店家基本資料之欄位)導入搜尋引擎,亦即興趣點基本資料必須常駐在搜尋引擎的索引資料庫內,並掃瞄網站列表內的網站是否存有索引資料庫內相關店家之相關資訊,產出相關報表;f.依據其他功能模組的要求,顯示指定訊息於螢幕上,或更新特定畫面顯示;g.依據收到其他功能模組的要求,將其錯誤訊息顯示於螢幕上、列印於報表上、並記錄於系統事件日誌資料庫;h.依據操作員的需求,查詢系統的事件記錄、呼叫記 錄、目前有哪些網站列表的資料正在處理中,並產生報表;以及i.依據操作員的需求,將全自動化改成人工作業啟動直接操作系統之全球資源定址器。 For example, the plurality of multivariate data screening management systems described in claim 1 wherein the feature of the point of interest search engine module comprises: a. generating an error message when the user operates incorrectly or the input parameter is invalid; b. A warning message is generated when the network is disconnected, or when the database fails to respond; c. The search scope includes the contents of the websites covered by the major websites and the seed list; d. provides a scheduled scheduling function, if the operation is performed In the process, other function modules are required to cooperate, and a processing message is generated, and an instruction is sent to the relevant function module; e. The content of the database (the field of the basic information of the shop) can be imported into the search engine, that is, the point of interest. The basic data must be resident in the search engine's index database, and scan the website list to see if there is relevant information about the relevant stores in the index database, and output relevant reports; f. 
According to the requirements of other functional modules, display the specified message on the screen or update a specific screen display; g. display error messages on the screen and print them as required by other functional modules, and record them in the system event log database; h. according to the operator's needs, query the system event log and call records, show which website list data is currently being processed, and generate reports; and i. according to the operator's needs, directly launch the operating system's uniform resource locator (URL) handling for fully automated operation.
The mass multivariate data screening management system as described in claim 1, wherein the mass data classification module is characterized in that: a. when new data arrives and the specified host address does not exist or the specified directory is invalid, an error message is displayed on the screen; b. when receiving or updating data fails or encounters an error, an error message is displayed on the screen; c. when receiving or updating data completes, the received data is checked for completeness, and an error message is generated if it is incomplete; d. when a message from another functional module is received, that message is displayed directly on the screen; e. when reception from the index database completes, the data is classified according to its characteristics and imported into the relevant data tables of the point-of-interest search engine database; f. the large volume of correct data reviewed through the mass data source manual review interface is received, updated, and stored directly in the data tables of the point-of-interest search engine database; and g. a daily scheduled job receives data and outputs it to the point-of-interest host database.
The mass multivariate data screening management system as described in claim 1, wherein the mass data comparison module is characterized in that: a. its main function is to compare the store data crawled from the web and stored in the indexed point-of-interest web page database against the indexed point-of-interest database; only records whose "basic data" differs between the point-of-interest web page database and the point-of-interest database need to be identified, and these records are then marked as pending review; b. the differing records, or store records not yet contained in the database, are viewed through the review screen and sent directly to the mass data source manual review interface as reference for review; c. the review interface supports simultaneous use by multiple users and prevents a record from being locked by competing users at the same time, which would make it impossible to review; after a user enters the review interface, the login identity of the user who presses the save button and the save time are recorded and written to the point-of-interest temporary table; and d. when the review screen displays data differences, the differing data is shown in a red font and an editing function is provided; if the two sources differ, the edited point-of-interest web page database information prevails and is written back to the database.
The mass multivariate data screening management system as described in claim 1, wherein the unique tag definition module is characterized in that: a. it accepts the data passed from the mass data source host database, writes the organized data to the point-of-interest host database, designates a specific tag field as the unique tag in ASCII-encoded format, and stores this new tag field in the point-of-interest host database for management; and b. if a data format error or any other abnormal error occurs during writing, the erroneous data is written to a log.
The mass multivariate data screening management system as described in claim 1, wherein the mass data management module is characterized in that: a. it manages all point-of-interest data in the database according to the specific tag field; b. after receiving data from the mass data source host database, it decides directly, based on the characteristics of the data, which category of table in the point-of-interest host database the record is stored in; c. the large volume of correct data reviewed through the mass data extraction manual review interface is received, updated, and stored directly in the data tables of the point-of-interest host database; d. when the network is disconnected, or the database is down and does not respond, an alarm message is generated; and e. a daily scheduled job receives data and outputs it to the database of the point-of-interest search engine.
The mass multivariate data screening management system as described in claim 1, wherein the enterprise application integration data comparison module is characterized in that: a. the generated data is sent directly to the mass data extraction manual review interface for review; b. the data returned by the mass data source manual review interface is checked through enterprise application integration to determine whether the store's telephone number exists; if it does not exist, the record is deleted; if it exists, processing continues with the next step; c. the point-of-interest database table is read and checked for the number; if the number is not in the table, the record is marked as new data pending review; if the number is in the table, the record is compared with the table contents and, if they differ, marked as differing data pending review; and d. the point-of-interest master file, account file data, and pending-review or reviewed data are queried according to the query conditions.
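The checks in items b and c of the enterprise application integration data comparison module can be illustrated with a minimal Python sketch. The record layout (a dict with a `phone` key), the in-memory `poi_table` keyed by telephone number, the `phone_exists` callback standing in for the enterprise application integration lookup, and the status names are all assumptions made for illustration; the claims do not specify them, and they do not state what happens to a record that matches the table exactly.

```python
from typing import Callable, Dict

# Hypothetical status labels; the claims only describe the outcomes.
DELETE = "delete"
NEW = "pending_review_new"
DIFF = "pending_review_diff"
UNCHANGED = "unchanged"  # assumed outcome for an exact match


def classify_record(
    record: dict,
    poi_table: Dict[str, dict],
    phone_exists: Callable[[str], bool],
) -> str:
    """Classify one store record returned from manual review.

    `phone_exists` is a stand-in for the enterprise application
    integration check; `poi_table` maps a telephone number to the
    stored point-of-interest record. Both are assumed interfaces.
    """
    phone = record.get("phone")
    # Item b: a record whose telephone number does not exist is deleted.
    if not phone or not phone_exists(phone):
        return DELETE
    # Item c: a number absent from the table is new data pending review.
    existing = poi_table.get(phone)
    if existing is None:
        return NEW
    # Item c (continued): differing content is marked as a difference.
    if existing != record:
        return DIFF
    return UNCHANGED
```

Passing the existence check in as a callback keeps the sketch independent of any particular back-end system, which the claims leave unspecified.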
The mass multivariate data screening management system as described in claim 1, wherein the mass data extraction module is characterized in that: a. according to the application document submitted by the customer, it extracts the required data from the point-of-interest host database and outputs the query results in a standard format; b. it automatically sends a text message notifying the customer that processing of the application has begun; and c. when processing is finished, it automatically notifies the billing module.
The mass multivariate data screening management system as described in claim 1, wherein the billing module is characterized in that: a. the price is calculated from the number of records queried under the case number; b. queries that request a large volume of data are priced separately; c. the customer's billing number is retrieved automatically and sent to the accounting system; and d. a billing text message is sent automatically to notify the customer.
The mass multivariate data screening management system as described in claim 1, wherein the modules use TCP/IP or sockets as the communication medium.
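The pricing rules in items a and b of the billing module can be sketched in Python as follows. The per-record rate, the bulk threshold, and the `Quote` structure are hypothetical and chosen only for illustration; the claims state only that the price depends on the number of records queried and that large requests are priced separately.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical figures; the patent gives no rates or thresholds.
RATE_PER_RECORD = 2      # price units charged per returned record
BULK_THRESHOLD = 10_000  # above this count, pricing is handled separately


@dataclass
class Quote:
    case_id: str
    record_count: int
    amount: Optional[int]         # None when a separate price is required
    needs_separate_pricing: bool


def price_case(case_id: str, record_count: int) -> Quote:
    """Price a query case by the number of records it returned."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    # Item b: large requests are not priced at the standard rate.
    if record_count > BULK_THRESHOLD:
        return Quote(case_id, record_count, None, True)
    # Item a: the standard price is proportional to the record count.
    return Quote(case_id, record_count, record_count * RATE_PER_RECORD, False)
```

Returning a flag instead of a computed amount for bulk cases reflects that the claims leave the separate price undefined.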
TW101103471A 2012-02-03 2012-02-03 System and method for managing a large number of multiple data TWI453608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW101103471A TWI453608B (en) 2012-02-03 2012-02-03 System and method for managing a large number of multiple data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW101103471A TWI453608B (en) 2012-02-03 2012-02-03 System and method for managing a large number of multiple data

Publications (2)

Publication Number Publication Date
TW201333722A TW201333722A (en) 2013-08-16
TWI453608B TWI453608B (en) 2014-09-21

Family

ID=49479521

Family Applications (1)

Application Number Title Priority Date Filing Date
TW101103471A TWI453608B (en) 2012-02-03 2012-02-03 System and method for managing a large number of multiple data

Country Status (1)

Country Link
TW (1) TWI453608B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106293676B (en) * 2015-06-08 2019-05-10 东元电机股份有限公司 Generate the method and system of whole detection program
US10235176B2 (en) 2015-12-17 2019-03-19 The Charles Stark Draper Laboratory, Inc. Techniques for metadata processing
WO2019152792A1 (en) 2018-02-02 2019-08-08 Dover Microsystems, Inc. Systems and methods for policy linking and/or loading for secure initialization
US11150910B2 (en) * 2018-02-02 2021-10-19 The Charles Stark Draper Laboratory, Inc. Systems and methods for policy execution processing
WO2019213061A1 (en) 2018-04-30 2019-11-07 Dover Microsystems, Inc. Systems and methods for checking safety properties
TW202022679A (en) 2018-11-06 2020-06-16 美商多佛微系統公司 Systems and methods for stalling host processor
US11841956B2 (en) 2018-12-18 2023-12-12 Dover Microsystems, Inc. Systems and methods for data lifecycle protection
TWI828433B (en) * 2022-11-21 2024-01-01 中華電信股份有限公司 Processing apparatus and processing method for data stream and computer program product excuting the processing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101504290A (en) * 2009-03-11 2009-08-12 凯立德欣技术(深圳)有限公司 Navigation system and its interest point peripheral searching method
TW201011574A (en) * 2008-09-10 2010-03-16 Mitac Int Corp Method of using and providing local searching service, and its program product, catalogue service platform and architecture
TW201017119A (en) * 2008-10-30 2010-05-01 Tomtom Int Bv Data acquisition apparatus, data acquisition system and method of acquiring data
TW201040752A (en) * 2009-05-13 2010-11-16 Alibaba Group Holding Ltd Method and system for providing localized information

Also Published As

Publication number Publication date
TW201333722A (en) 2013-08-16

Similar Documents

Publication Publication Date Title
TWI453608B (en) System and method for managing a large number of multiple data
US10963513B2 (en) Data system and method
US7890509B1 (en) Parcel data acquisition and processing
CN103473230B (en) Service area determines that method, logistics service provider recommend method and related device
CN106844372B (en) Logistics information query method and device
RU2695420C1 (en) Method of collecting logistic information and interstate transportation system
JP2013531289A (en) Use of model information group in search
CN102053984A (en) Systems and methods for information retrieval, information query and information issue
CN104809177A (en) Webpage commenting and recommending methods and systems based on client
CN112269816B (en) Government affair appointment correlation retrieval method
CN102231152B (en) Searching method for precisely inquiring based on IP (Internet Protocol) address of mobile terminal
CN110543477B (en) Label construction system and method
CN106095738A (en) Recommendation tables single slice
US9015166B2 (en) Methods and systems for annotation of digital information
US20170300531A1 (en) Tag based searching in data analytics
CN110928903B (en) Data extraction method and device, equipment and storage medium
CN112632405A (en) Recommendation method, device, equipment and storage medium
CN111191111A (en) Content recommendation method, device and storage medium
CN111414410A (en) Data processing method, device, equipment and storage medium
JP5764080B2 (en) Web search system and Web search method
JP6160503B2 (en) Information input system and program
CN102222067A (en) Searching method for accurately querying information according to IP (Internet Protocol) address of keyword
TW202230158A (en) Systems and methods for extracting attributes from product titles
CN109636303B (en) Storage method and system for semi-automatically extracting and structuring document information
CN114416968A (en) Document reviewing method and device

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees