TWI427494B - A patent document search system, processing method, and search method with cloud structure - Google Patents

A patent document search system, processing method, and search method with cloud structure

Info

Publication number
TWI427494B
Authority
TW
Taiwan
Prior art keywords
search
retrieval
attribute
platform
content
Prior art date
Application number
TW99118401A
Other languages
Chinese (zh)
Other versions
TW201145050A (en
Inventor
Chao Chin Chang
Kai Chieh Hsu
Original Assignee
Chao Chin Chang
Kai Chieh Hsu
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chao Chin Chang, Kai Chieh Hsu
Priority to TW99118401A
Publication of TW201145050A
Application granted
Publication of TWI427494B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

Patent document retrieval platform with cloud architecture, processing method, and retrieval method thereof

The present invention provides a cloud-architecture patent document retrieval platform, a processing method, and a retrieval method thereof. In particular, it concerns a retrieval platform for patent documents that uses a distributed storage-location architecture inside the platform to perform fast retrieval operations.

As enterprises place increasing importance on intangible assets such as intellectual property rights, more and more enterprises and individuals file patent applications for the technical results they develop, in order to obtain the protection of patent rights and prevent others from improperly exploiting their R&D results. As a result, the number of patent applications in each country keeps rising, and the databases that store patent documents keep growing.

However, many enterprises rely on professional retrieval platforms to study patent documents and to analyze and compile statistics on them in order to identify future R&D directions. As the number of patent documents grows daily, such analysis and statistics become increasingly difficult.

With current patent document retrieval platforms, analysis and statistics usually require periodically fetching patent documents from the official patent document databases of each country and then comparing the user's search conditions against all patent documents stored in the patent software. As the number of patent documents grows, this comparison approach makes retrieval slower and slower.

Furthermore, current patent document retrieval platforms can only perform cross-statistical analysis on information such as the number of patent documents, inventors, applicants, and citation relationships. They cannot dig into the content of the specification and the claims to extract the more important and interrelated information. In other words, under the prior art, if an enterprise or ordinary user wants to know the technical similarity or relevance between a given technical document and the patent documents in a database, whether for detailed study in litigation or to set a clear R&D direction, the user must read the patent documents in the database one by one and judge the relevance manually using professional knowledge and experience. As technology develops, however, more and more patent data is created every day and its technical content grows more complex, so judging patent relevance by human effort becomes increasingly difficult and increasingly uneconomical. If litigation research or the choice of R&D direction depends entirely on human judgment, important decisions are subject to highly variable factors without the support of a scientific method, and current patent document retrieval platforms therefore suffer from being superficial and overly variable.

The conventional patent retrieval platforms described above therefore still have many problems and shortcomings in actual operation, and these are what the inventors and practitioners in this industry are eager to improve.

According to the embodiments disclosed herein, a cloud-architecture patent document retrieval platform, a processing method, and a retrieval method thereof can be provided. By using the different attributes of patent document content as the basis of a distributed architecture, together with analysis techniques such as data comparison and mining, a growing number of patent documents can be retrieved and analyzed quickly.

In one embodiment, the disclosure relates to a processing method for a cloud-architecture patent document retrieval platform. The method comprises: fetching a plurality of patent documents; parsing the attribute contents of those patent documents to produce a plurality of same attribute contents and different attribute contents; transmitting the same and different attribute contents to a plurality of storage locations; and storing the same attribute contents in the same storage location while storing different attribute contents in different storage locations. A search request can then be received and compared with the attribute contents in each storage location to determine whether they are completely or partially identical, after which a search result is generated and returned to the search interface.

In another embodiment, the disclosure relates to a retrieval method for a cloud-architecture patent document retrieval platform. The method comprises: providing a search interface for receiving search requests; inputting a search request into the search interface; determining whether the search request includes a plurality of search conditions; if it does, coupling the search conditions to generate search parameters and then executing a comparison procedure; if it has a single search condition, executing the comparison procedure directly; extracting the attribute contents found to be completely or partially identical; and generating a search result and returning it to the search interface.

In yet another embodiment, the disclosure relates to a retrieval method for a cloud-architecture patent document retrieval platform. The method comprises: providing a search interface for receiving search requests; inputting a search request into the search interface; determining whether the search request includes a plurality of search conditions; if it does, coupling the search conditions to generate search parameters and then executing a comparison procedure; if it has a single search condition, executing the comparison procedure directly; extracting the attribute contents found to be completely or partially identical to generate a search result; mining the correlations between the attribute contents that completely or partially match the at least one search condition and the attribute contents of the storage locations; and generating a data mining result and returning it to the search interface.

In yet another embodiment, the disclosure relates to a cloud-architecture patent document retrieval platform in which: a search interface is provided for receiving search requests; a search request is input into the search interface, the search request being at least one technical document or patent document; the at least one technical document or patent document is parsed to produce a plurality of same and different attribute contents; it is determined whether the search request includes a plurality of search conditions; if it does, the search conditions are coupled to generate search parameters and a comparison procedure is then executed; if it has a single search condition, the comparison procedure is executed directly; the attribute contents found to be completely or partially identical are extracted to generate a search result; and the search result is returned to the search interface.

In yet another embodiment, the disclosure relates to a cloud-architecture patent document retrieval platform in which: a search interface is provided for receiving search requests; multiple combinations of at least one search condition are coupled to generate word search parameters; the word search parameters are compared with the attribute contents of the storage locations to determine whether they are completely or partially identical; the attribute contents found to be completely or partially identical are extracted to generate a search result; and the search result is returned to the search interface.

To achieve the above objects and effects, the technical means and construction adopted by the present invention are described in detail below with reference to the drawings of the preferred embodiments, so that their features and functions can be fully understood.

Referring to Figures 1 and 2, which are a processing flowchart and a schematic diagram of the platform architecture of a preferred embodiment of the present invention, this preferred embodiment discloses a retrieval platform 1 connected to a remote system 2 and to a search interface 3. On the one hand, the retrieval platform 1 can fetch the patent documents stored in the remote system 2 and store each of them inside the retrieval platform 1. On the other hand, it can receive a search request transmitted by the search interface 3, search the stored patent documents according to that request, and return the content of the patent documents that satisfy the request to the search interface 3. Each patent document fetched from the remote system 2 includes a plurality of attribute contents: the abstract, the technical field of the invention, the prior art, the summary of the invention, the detailed description, and the claims. The attribute contents may further include the paragraphs, sentences, and words of at least one of these. The remote system 2 includes, but is not limited to, the patent document databases of the United States Patent and Trademark Office (USPTO), the Japan Patent Office (JPO), the Korean Patent Office (KIPRIS), and the European Patent Office (EPO). The search interface 3 provides a search page on which a user can enter at least one search condition.

The processing flow among the retrieval platform 1, the remote system 2, and the search interface 3 includes the following steps:

(100) Fetch a plurality of patent documents.

(101) Parse the attribute contents of the patent documents, producing a plurality of same attribute contents and different attribute contents.

(102) Transmit the same and different attribute contents to a plurality of storage locations.

(103) Store the same attribute contents in the same storage location and different attribute contents in different storage locations.

(104) Receive a search request.

(105) Compare the search request with the attribute contents in each storage location to determine whether they are completely or partially identical.

(106) Generate a search result and return it to the search interface 3.

The processing flow is described in more detail with reference to Figures 3 and 4, which are a block diagram of the retrieval platform of a preferred embodiment and a schematic diagram of the attribute contents of a patent document. In this preferred example, the retrieval platform 1 may include a data fetching unit 11 that periodically fetches the patent documents of the remote system 2. The data fetching unit 11 is connected to a data parsing unit 12, which analyzes and decomposes the attribute contents of each patent document and produces a plurality of same and different attribute contents. That is, the data parsing unit 12 decomposes each patent document so that its abstract A, field of the invention B, description of the related art C, brief summary of the invention D, detailed description of the invention E, and claims ("what is claimed is") F are separated out of the document. If the attribute contents are defined to also include the paragraphs, sentences, and words of at least one of A through F, for example paragraphs F1 and F2, sentences F11 and F12, and words F111 and F112 of the claims F, these are likewise separated out by the data parsing unit 12.

The data parsing unit 12 forms the same attribute contents into one group; for example, the abstracts A of all patent documents form one group. Different attribute contents form different groups; for example, the abstracts A and the claims F of the patent documents form different groups. The attribute contents of each group are transmitted in sequence to the storage locations 13, which store them on receipt so that the same attribute contents are stored in the same storage location 13 (for example, all abstracts A in one storage location 13) and different attribute contents are stored in different storage locations 13 (for example, abstracts A and claims F in separate storage locations 13). The storage locations 13 may be defined as different server devices or as different storage spaces of the same server device, such as different hard disks in the same server device, but are not limited thereto. Furthermore, the storage locations 13 may be interconnected in a parallel architecture or a hierarchical architecture.
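The grouping of attribute contents into storage locations can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the attribute names, the `store_by_attribute` function, and the in-memory dictionaries standing in for the storage locations 13 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical attribute labels, following the letters A to F used in the text.
ATTRIBUTES = ("abstract", "field", "prior_art", "summary", "description", "claims")


def store_by_attribute(documents):
    """Group attribute content so each attribute has its own storage location.

    `documents` maps a document id to a dict of attribute name -> text.
    The result maps attribute name -> {document id: text}; each inner dict
    stands in for one storage location.
    """
    locations = defaultdict(dict)
    for doc_id, attrs in documents.items():
        for name in ATTRIBUTES:
            if name in attrs:
                locations[name][doc_id] = attrs[name]
    return dict(locations)


docs = {
    "doc-1": {"abstract": "A lamp with fins.", "claims": "A lamp comprising fins."},
    "doc-2": {"abstract": "A search platform.", "claims": "A platform comprising storage."},
}
locations = store_by_attribute(docs)
print(sorted(locations))            # attribute names that received content
print(sorted(locations["claims"]))  # document ids stored under "claims"
```

In a deployment along the lines of the text, each inner dictionary would be replaced by a separate server or disk; the dictionaries are used here only to keep the sketch self-contained.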

In this preferred embodiment, with further reference to Figure 4 and to Figure 5, which is a schematic diagram of the hierarchical architecture of the storage locations, the hierarchical architecture is described as follows. The first-level storage locations hold the abstract 13A, the technical field of the invention 13B, the prior art 13C, the summary of the invention 13D, the detailed description 13E, and the claims 13F. The second-level storage locations hold the paragraphs of each of these six attributes. The third-level storage locations hold the sentences within each of those paragraphs, and the fourth-level storage locations hold the words within each of those sentences.

Take the claims F as an example, which further include paragraphs F1 and F2, sentences F11 and F12, and words F111 and F112. The first-level storage location 13F stores the claims F, and the second-level storage locations 13F1 and 13F2 store the paragraphs F1 and F2 of the claims. Paragraphs can be delimited by full stops. Because regulations require each claim to be written as a single sentence, each paragraph of the claims corresponds to one claim, so each claim is stored in a different second-level storage location 13F1, 13F2. The third-level storage locations 13F11 and 13F12 store the sentences F11 and F12 within each claim; sentences can be delimited by semicolons or commas, and each sentence is stored in a different third-level storage location. The fourth-level storage locations 13F111 and 13F112 store the words F111 and F112 within each sentence; words can be delimited by character capacity, and the words of each sentence are stored in different fourth-level storage locations. The above is only one preferred embodiment, and the invention is not limited to it.
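The four-level decomposition of the claims can be sketched as below. The delimiters follow the text (full stops for paragraphs, semicolons or commas for sentences, a fixed character count for words), but the function name `split_hierarchy`, the `word_size` parameter, the nested-dictionary layout, and the sample claim text are assumptions made for illustration.

```python
import re


def split_hierarchy(claims_text, word_size=2):
    """Split claim text into paragraphs, sentences, and fixed-size words.

    Paragraphs end at a full stop, sentences at a semicolon or comma, and
    words are non-overlapping chunks of `word_size` characters.
    """
    tree = []
    paragraphs = (p.strip() for p in re.split(r"[。.]", claims_text))
    for paragraph in filter(None, paragraphs):
        sentences = []
        parts = (s.strip() for s in re.split(r"[;；,，]", paragraph))
        for sentence in filter(None, parts):
            words = [sentence[i:i + word_size]
                     for i in range(0, len(sentence), word_size)]
            sentences.append({"sentence": sentence, "words": words})
        tree.append({"paragraph": paragraph, "sentences": sentences})
    return tree


# Hypothetical two-claim text; each claim is one paragraph ending in a full stop.
tree = split_hierarchy("一種燈具，包括散熱片。如請求項1之燈具，更包括風扇。")
for level2 in tree:
    print(level2["paragraph"], [s["words"] for s in level2["sentences"]])
```

Each nesting level of the returned structure corresponds to one level of storage locations; splitting on every full stop would mishandle claim numbers such as "1.", so a production parser would need a more careful paragraph rule.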

Moreover, the parsing performed by the data parsing unit 12 includes, but is not limited to, n-grams. That is, after each patent document is received, its attribute contents are parsed out using n-grams. This parsing procedure cuts each segment of the patent document into a preset storage unit. For example, when the storage unit is set to words and the claims F of a patent document contain the sentence 「一種燈具之散熱結構」 ("a heat dissipation structure of a lamp"), the parsed words are the overlapping two-character units 「一種」, 「種燈」, 「燈具」, 「具之」, 「之散」, 「散熱」, 「熱結」, and 「結構」. These words are then stored in the designated fourth-level word storage locations 13F111 and 13F112.
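The n-gram example in the text corresponds to overlapping character bigrams (n = 2). A minimal sketch, assuming a hypothetical helper named `char_ngrams`:

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


bigrams = char_ngrams("一種燈具之散熱結構")
print(bigrams)
# → ['一種', '種燈', '燈具', '具之', '之散', '散熱', '熱結', '結構']
```

The nine-character sentence yields eight bigrams, matching the list given in the text. A string shorter than `n` yields an empty list.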

Furthermore, the retrieval platform 1 of the preferred embodiment shown in Figure 3 also includes at least one logic operation unit 14, which is connected to the storage locations and to the search interface 3. The logic operation unit 14 may be a microprocessor or a microcontroller (MCU), but is not limited thereto. It receives the search request transmitted by the search interface 3 and, according to that request, reads the attribute contents in the storage locations simultaneously, so that the comparison procedure is carried out against all storage locations at the same time.

In summary, with the hierarchical distributed architecture disclosed herein, when the search interface 3 transmits a search request, every storage location carries out the analysis and comparison procedure for that request at the same time. This speeds up retrieval even as the number of patent documents grows daily.
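The idea of comparing one search request against all storage locations at the same time can be sketched with a thread pool. This is only an illustration of the fan-out: the function names, the sample data, and the use of substring matching as a stand-in for "completely or partially identical" are assumptions, and a real system would distribute the work across separate servers rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def match_location(location, query):
    """Return the ids in one storage location whose content contains the query."""
    return [doc_id for doc_id, text in location.items() if query in text]


def parallel_search(locations, query):
    """Compare the query against every storage location concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(locations))) as pool:
        futures = {name: pool.submit(match_location, loc, query)
                   for name, loc in locations.items()}
        return {name: future.result() for name, future in futures.items()}


sample_locations = {
    "abstract": {"doc-1": "heat dissipation structure of a lamp",
                 "doc-2": "cloud search platform"},
    "claims": {"doc-1": "a lamp comprising a heat sink",
               "doc-2": "a platform comprising storage locations"},
}
hits = parallel_search(sample_locations, "lamp")
print(hits)
```

Because each storage location is searched independently, adding documents to one location does not lengthen the search of the others, which is the property the text relies on.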

In another preferred embodiment, the patent documents of the remote system 2 may be downloaded in advance to a readable recording medium. Once the retrieval platform 1 fetches the patent documents from this medium, the subsequent parsing procedure is carried out. For example, a user may purchase patent documents from another party and store them in the retrieval platform 1. The data fetching unit 11 then fetches these patent documents and passes them to the at least one logic operation unit 14, which transmits each patent document to the data parsing unit 12 for the parsing procedure described above. The source of the patent documents to be stored or parsed by the retrieval platform 1 is therefore not limited to the remote system 2 or to a readable recording medium holding patent documents; other forms are possible, such as fetching from a pre-stored location on another party's server.

Referring to Figure 6, which is a search processing flowchart of a preferred embodiment of the present invention, the detailed processing flow between the retrieval platform 1 and the search interface 3 in this preferred example includes the following steps:

(200) Provide a search interface 3 for receiving search requests.

(201) Input a search request into the search interface 3, the search request including at least one search condition.

(202) Determine whether the search request includes a plurality of search conditions. If so, go to step (203); if not, go to step (205).

(203) Couple the search conditions and generate search parameters.

(204) Compare the search parameters with the attribute contents of the storage locations to determine whether they are completely or partially identical. If so, go to step (206); if not, go to step (207).

(205) Compare the search condition with the attribute contents of the storage locations to determine whether they are completely or partially identical. If so, go to step (206); if not, go to step (207).

(206) Extract the attribute contents found to be completely or partially identical.

(207) Generate a search result.

(208) Return the search result to the search interface 3.
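The branching in steps (202) to (208) can be sketched as a single function. The sketch assumes that "coupling" several search conditions means requiring all of them to match and that a partial match is a substring match; the text does not define either, so both are illustrative choices, as are the function name `run_search` and the sample data.

```python
def run_search(conditions, locations):
    """Follow steps (202) to (207): couple conditions, compare, build a result."""
    if not conditions:
        raise ValueError("a search request needs at least one search condition")
    # Steps (202)/(203): several conditions are coupled into one search parameter.
    # With a single condition the list simply has one entry (step 205).
    terms = list(conditions)
    matches = []
    # Steps (204)/(205): compare against the content of each storage location.
    for name, location in locations.items():
        for doc_id, text in location.items():
            if all(term in text for term in terms):
                # Step (206): extract the matching attribute content.
                matches.append((name, doc_id, text))
    # Step (207): a result is generated whether or not anything matched.
    return {"conditions": terms, "matches": matches}


claim_store = {"claims": {"doc-1": "a lamp comprising a heat sink",
                          "doc-2": "a platform comprising storage"}}
result = run_search(["lamp", "heat"], claim_store)
print(result["matches"])
```

Step (208), returning the result to the search interface, would be whatever transport the interface uses; it is left out because the text does not specify one.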

The search program for the above flow may be stored on a readable recording medium. Once the search program is loaded onto a device and the search interface 3 is connected to the retrieval platform 1 over a wireless or wired network, the user can enter a search request through the search interface 3. The request is transmitted to the at least one logic operation unit 14 of the retrieval platform 1, and each logic operation unit 14 determines whether the request contains more than one search condition. If so, the conditions are coupled to generate search parameters for the subsequent comparison procedure. If there is only one condition, no coupling takes place and the comparison procedure runs directly.

A search condition may be at least one of the following: a preset keyword group, a search term entered by the user together with its corresponding attribute content, and a weight condition. A preset keyword group contains a plurality of keywords, usually the key terms of a technical field as selected by experts in that field. These are initial preset data, and the user can add them to the search conditions by selection to improve retrieval accuracy in that field. Alternatively, the user can set search conditions by entering search terms directly. Either kind of term, preset keyword or free text, can be coupled with an attribute content. The search conditions can further be coupled with a weight condition, which helps define the ranking by retrieval importance of keyword groups and user-entered terms, or the retrieval priority of user-entered terms or preset keywords and their corresponding attribute contents. For example, if the weight condition of the keywords is (2) and that of the free-text terms is (1), the results produced by comparing the keywords are ranked ahead of those of the free-text terms, or the keyword search conditions are processed first. How the weight condition interacts with keyword groups and free-text terms, and how priorities are defined, can be designed by the user and is not limited to this example.
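The weight example, where keyword hits with weight 2 are listed ahead of free-text hits with weight 1, can be sketched as a stable sort. The tuple layout `(term, weight, doc_id)`, the function name `rank_by_weight`, and the sample hits are hypothetical.

```python
def rank_by_weight(hits):
    """Order hits so that higher-weight search conditions come first.

    `hits` is a list of (term, weight, doc_id) tuples. Python's sort is
    stable, so hits with equal weight keep their original relative order.
    """
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


sample_hits = [
    ("lamp", 1, "doc-3"),       # free-text term, weight 1
    ("heat sink", 2, "doc-1"),  # preset keyword, weight 2
    ("fin", 2, "doc-2"),        # preset keyword, weight 2
]
ranked = rank_by_weight(sample_hits)
print(ranked)
```

This covers only the first reading in the text (ordering of results). The second reading, processing higher-weight conditions first, would instead sort the conditions before running the comparison.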

During the comparison procedure, either the search parameters generated from the search conditions or a single search condition is compared with the attribute contents of the storage locations. The at least one logic operation unit 14 simultaneously reads and searches the storage locations for attribute contents that are completely or partially identical to the search parameters or search condition and, if any exist, extracts them in sequence. Whether or not such attribute contents exist, the search result produced by the logic operation unit 14 is returned to the search interface 3 so that the user can see it.

The aforementioned comparison procedure refers in particular to a data mining procedure. This data mining procedure may use, without limitation, at least one of an association rule algorithm, part-of-speech analysis, or an alignment algorithm such as the Smith-Waterman algorithm. In other words, the data mining algorithms and analysis methods may also be used in combination with one another.

Referring also to Figure 7, which is a flowchart of the search process according to another preferred embodiment of the present invention. As the figure shows, this embodiment differs from the preferred embodiment in that an additional comparison procedure is added to the original architecture. That is, in this other preferred embodiment the processing flow of the search platform 1 and the search interface 3 includes two or more comparison procedures. The processing flow includes the following steps:

(300) Provide a search interface 3 for receiving search requests.

(301) Input a search request to the search interface 3, wherein the search request includes at least one search condition.

(302) Determine whether the search request includes a plurality of search conditions; if yes, go to step (303); if no, go to step (305).

(303) Couple the plurality of search conditions and generate a search parameter.

(304) Perform the first comparison procedure, comparing whether the search parameter is identical or partially identical to the attribute contents of the plurality of storage locations; if yes, go to step (306); if no, go to step (307).

(305) Perform the first comparison procedure, comparing whether the search condition is identical or partially identical to the attribute contents of the plurality of storage locations; if yes, go to step (306); if no, go to step (307).

(306) Retrieve the attribute content found to be identical or partially identical, and generate a search result.

(307) Perform the second comparison procedure, mining the correlations between the plurality of attribute contents that are identical or partially identical to the at least one search condition and the attribute contents of the plurality of storage locations.

(308) Generate a data mining result.

(309) Return the data mining result to the search interface 3.

Likewise, the search process flow of this other preferred embodiment is the same as that of the preferred embodiment shown in Figure 6. The difference is that, after the at least one logic operation unit 14 completes the comparison procedure and the search result shows that attribute content identical or partially identical to the at least one search condition or search parameter has been retrieved, a second data comparison procedure is performed. Specifically, the first or the second comparison procedure may use, without limitation, at least one of an association rule algorithm, part-of-speech analysis, or an alignment algorithm such as the Smith-Waterman algorithm; these data mining algorithms and analysis methods may also be used in combination.

The association rule algorithm can be used to analyze how often content identical or partially identical to the at least one search condition or search parameter occurs, as a proportion, among the attribute contents of the plurality of storage locations. For example, suppose the search conditions include the arbitrarily entered term "gastric ulcer" and the attribute content "Claims", with the Claims attribute corresponding to the entered term. The at least one logic operation unit 14 couples the two search conditions and, after comparison, generates a search result consisting of a plurality of identical or partially identical attribute contents found in the storage locations that hold the claims. The storage locations compared naturally include the first-level storage location for the claims as a whole, the second-level storage locations for the individual claims, the third-level storage locations for the sentences of each claim, and the fourth-level storage locations for the words of each sentence. The at least one logic operation unit 14 then mines the storage locations further, based on the search result, for attribute content related to that result. The data mining algorithm includes, but is not limited to, the Apriori algorithm. In this example, the search of the claims yields results identical or partially identical to the term "gastric ulcer", and the Apriori algorithm is then applied to these results to find that the probability of "proton pump inhibitor" appearing together with "gastric ulcer" in the storage locations exceeds a preset value, which the user may set. It can thus be concluded that gastric ulcers and proton pump inhibitors are highly correlated.
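The co-occurrence check in the gastric ulcer example can be illustrated with the confidence measure that association rule mining relies on. This is a simplified sketch, not a full Apriori implementation: the function name, the representation of each claim as a set of terms, the sample data, and the 0.5 threshold are all assumptions made for illustration.

```python
def cooccurrence_confidence(units, antecedent, consequent):
    """Fraction of units containing `antecedent` that also contain `consequent`.

    Each unit is a set of terms, e.g. the terms found in one claim.
    Returns 0.0 when the antecedent never occurs.
    """
    with_antecedent = [u for u in units if antecedent in u]
    if not with_antecedent:
        return 0.0
    both = sum(1 for u in with_antecedent if consequent in u)
    return both / len(with_antecedent)


# Hypothetical term sets extracted from four claims.
claims = [
    {"gastric ulcer", "proton pump inhibitor", "tablet"},
    {"gastric ulcer", "proton pump inhibitor"},
    {"gastric ulcer", "antacid"},
    {"connector", "signal pin"},
]

THRESHOLD = 0.5  # user-defined preset value (assumed here)
confidence = cooccurrence_confidence(claims, "gastric ulcer", "proton pump inhibitor")
highly_correlated = confidence > THRESHOLD
print(confidence, highly_correlated)
```

In this sample, two of the three claims that mention "gastric ulcer" also mention "proton pump inhibitor", so the confidence exceeds the assumed threshold and the two terms would be reported as highly correlated.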

The data mining procedure may also use, without limitation, part-of-speech analysis, which analyzes the correlation of grammatical positions between the plurality of attribute contents that are identical or partially identical to the at least one search condition and the attribute contents of the plurality of storage locations. For example, suppose the data mining procedure includes both part-of-speech analysis and an association rule algorithm, and the search conditions again include the arbitrarily entered term "gastric ulcer" and the attribute content "Claims". If the association rule algorithm concludes that gastric ulcers and proton pump inhibitors are highly correlated, and part-of-speech analysis determines that the two terms stand in a subject-object relationship, it can further be inferred that a proton pump inhibitor is a drug treatment for gastric ulcers.

The Smith-Waterman algorithm compares the search request against the attribute contents of the plurality of storage locations. In other words, each stored storage location can be compared separately, and the comparison result for each storage location is defined by a penalty-based scoring method. The scoring may, without limitation, distinguish negative scores from positive scores: a negative score means the comparison found no similarity, while a positive score indicates similarity. Scores are accumulated according to the degree of similarity, so a higher score represents a higher degree of similarity. To simplify the calculation, negative scores may be replaced with zero, or all scores may be defined as positive and distinguished only by magnitude.
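The scoring idea described above corresponds to the standard Smith-Waterman local alignment recurrence, in which cell scores are never allowed to fall below zero. The sketch below applies it to word tokens rather than characters. The match, mismatch, and gap values are illustrative assumptions; the patent does not specify them.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between token sequences a and b.

    Scores accumulate along matching runs; any negative running score is
    replaced by zero, so unrelated sequences score 0.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                         # floor negative scores at zero
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,     # gap in b
                score[i][j - 1] + gap,     # gap in a
            )
            best = max(best, score[i][j])
    return best


query = "signal and power".split()
stored = "the signal and power pins".split()
print(smith_waterman(query, stored))
```

Three consecutive matching words at two points each give a score of 6 here, while sequences with no words in common score 0, which is the "higher score means more similar" behavior the paragraph describes.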

As a further example, the comparison methods above are useful for finding patent documents that do not explicitly define their technical content. They can match content that is identical or similar to the search request, and they can also help uncover technical content that is relevant but not explicitly stated. For example, suppose the search request includes Universal Serial Bus (USB) as a search condition, and the search platform 1 stores a patent document whose disclosure describes "an improved connector structure" without explicitly identifying it as USB technology. The association rule algorithm and part-of-speech analysis can still surface this document as a search result. If the document discloses "signal pins and power pins", "4pin", or "5pin", the association rule algorithm can establish relevance from the probability of these terms occurring. Alternatively, part-of-speech analysis can be combined to check whether, where "transmit" is used as a verb, its object contains related content such as "signal and power". In this way, patent documents that do not explicitly disclose a USB connector structure can be uncovered. This example merely illustrates the invention, which is not limited to documents of this type.

Referring next to Figure 8, which is a flowchart of the search process according to yet another preferred embodiment of the present invention. As the figure shows, the search request entered in this procedure includes at least one document, where a document is defined as containing attribute contents such as paragraphs, sentences, and words. For example, the document may be a patent document, a technical document, or any equivalent. This embodiment differs from the preferred embodiment in that an additional document parsing procedure is added to the original architecture, forming another data search processing flow. In other words, the search conditions are parsed before the comparison procedure is performed. The search process flow includes the following steps:

(400) Provide a search interface 3 for receiving search requests.

(401) Input a search request to the search interface 3, wherein the search request is at least one document.

(402) Parse the at least one technical document or patent document, producing a plurality of identical and distinct attribute contents.

(403) Determine whether the search request includes a plurality of search conditions; if yes, go to step (404); if no, go to step (406).

(404) Couple the plurality of search conditions and generate a search parameter.

(405) Compare whether the search parameter is identical or partially identical to the attribute contents of the plurality of storage locations; if yes, go to step (407); if no, go to step (408).

(406) Compare whether the search condition is identical or partially identical to the attribute contents of the plurality of storage locations; if yes, go to step (407); if no, go to step (408).

(407) Retrieve the attribute content found to be identical or partially identical, and generate a search result.

(408) Generate a search result.

(409) Return the search result to the search interface 3.

In detail, when the search interface 3 receives a search request, it transmits the request to the search platform 1, and the at least one logic operation unit 14 of the search platform 1 determines whether the search condition is in document form. If it is, the at least one logic operation unit 14 passes the document to the data parsing unit 12 for parsing, for example using the n-gram method described earlier; other parsing procedures may of course be substituted. Once the data parsing unit 12 has completed parsing and produced a plurality of attribute contents, the at least one logic operation unit 14 receives these attribute contents and performs the comparison procedure. The comparison procedure is described below using the Smith-Waterman algorithm, but is not limited to it.
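As a rough illustration of n-gram parsing, the sketch below splits a document into word n-grams. It assumes whitespace-separated tokens, which suits English text only; Chinese text would need a word segmenter or character-level n-grams. The function name `word_ngrams` is hypothetical and is not taken from the patent.

```python
def word_ngrams(text, n):
    """Return all runs of n consecutive whitespace-separated tokens in text.

    Texts with fewer than n tokens yield an empty list.
    """
    tokens = text.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = word_ngrams("an improved connector structure", 2)
print(bigrams)
```

Each resulting tuple could then serve as one attribute content to be compared against the storage locations.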

More specifically, once the document has been parsed into a plurality of attribute contents, the algorithm performs an edge-based comparison between each of these attribute contents and the contents held in the plurality of storage locations. It checks whether the attribute content of the search request and the attribute content of a storage location share the same beginning and end. When the comparison finds an identical word, that word is defined as the head; the search then continues forward in the same order until another identical word is found, which is defined as the tail. The matching head and tail delimit a segment, which may be a sentence or a paragraph. If the content of this segment, taken in the same order, is identical or similar (where "similar" can be defined through the scoring method), the comparison result for the search request is extracted.

This algorithm therefore depends on the order in which strings or words appear in order to judge whether content is similar or identical to what the technical document or patent document in the search request expresses. Furthermore, because the plurality of attribute contents are stored in advance in different storage locations, subsequent searches can be performed and different search conditions can serve as the basis for comparison. After comparison, identical or similar search results are presented and are also labeled according to where they are stored. That is, if a comparison result lies at a paragraph location, the paragraph of the document in which the match occurs is indicated. If a comparison result lies in the claims, for example, it is marked as being in the claims, down to which claim it is in or which position within that claim.

As the search method above shows, once a matching word is found and defined as the head, comparison resumes from that head and the other parts are ignored. In addition, if one attribute content, for example the "Claims", is found to contain more similar or identical comparison results, further comparison can be directed at that attribute content. This saves considerable comparison time and improves comparison efficiency.

Furthermore, in yet another embodiment, a probability value may additionally be evaluated. That is, suppose that after a search condition is entered and the comparison procedure is performed, the probability of the search condition appearing among the storage locations 13 of the search platform 1 is found to be significant for a particular patent document or a particular attribute content. Further comparison is then performed on that patent document or attribute content, or alternatively the result is displayed directly.

Naturally, the more patent documents the search platform 1 stores, the more search time this probability-based judgment saves. For example, subsequent comparison can be restricted to the attribute contents found to be significant, or to the attribute contents of a particular patent document, which speeds up the search.

Thus, a patent engineer who regularly drafts patent specifications can enter a technical document provided by an inventor to find related patent documents, quickly obtain the prior art, and complete the specification faster, improving drafting efficiency. A research and development engineer can likewise use this document-to-document search, starting from documents others have published, to quickly find patents that have already been filed, and thereby learn the research directions of other companies or individuals and identify fields that have not yet been developed.

For example, when a patent engineer enters a technical document into the search interface 3, the search interface 3 transmits the document to the search platform 1. The at least one logic operation unit 14 of the search platform 1 uses the data parsing unit 12 to analyze and deconstruct the content of the technical document, parsing out a plurality of attribute contents such as at least one of paragraphs, sentences, and words. The at least one logic operation unit 14 then runs the comparison procedure on the parsed attribute contents, determining whether any storage location 13 holds data identical or similar to them, and thereby produces a search result. During comparison, each attribute content of the search request is compared separately against the contents of the storage locations. Where no similar or identical content is found, the item is individually marked or scored; where similar or identical content is found, it is marked or scored and the matching content is also extracted for transmission to and display on the search interface 3. Because the comparison procedure uses a segment-based comparison method combined with the distributed architecture of the present invention, data is retrieved quickly and a search result is generated.

In addition, the Smith-Waterman algorithm and the association rule algorithm can be used to define the keywords in a search request. That is, search results identical or similar to the search request are stored in the storage locations and defined as keywords, which makes later searches more convenient and faster for the user. Subsequent search requests can be configured based on previous search results, enabling continuous search and analysis.

The comparison method described above can take the form of a document-to-document comparison. That is, the search request may include a technical document or patent document, which is entered and then compared on the basis of its content.

In summary, because the search platform 1 has a distributed architecture with a plurality of storage locations, the at least one logic operation unit 14 can run the data mining procedure on all storage locations simultaneously, shortening the data mining computation time.

Referring also to Figure 9, which is a flowchart of the search process according to yet another preferred embodiment of the present invention. As the figure shows, this embodiment differs from the preferred embodiment in that an additional anagram analysis procedure is added to the original architecture. That is, the search process flow of the search platform 1 and the search interface 3 further includes an anagram analysis procedure. The search process flow includes the following steps:

(500) Provide a search interface 3 for receiving search requests.

(501) Input a search request to the search interface 3, wherein the search request includes at least one search condition.

(502) Couple multiple combinations of the at least one search condition and generate word search parameters.

(503) Compare whether the word search parameters are identical or partially identical to the attribute contents of the plurality of storage locations; if yes, go to step (504); if no, go to step (505).

(504) Retrieve the attribute content found to be identical or partially identical.

(505) Generate a search result.

(506) Return the search result to the search interface 3 as a response.

Continuing from the above, when the search platform 1 receives the search request transmitted by the search interface 3, the at least one logic operation unit 14 of the search platform 1 performs the anagram analysis procedure, which may include, without limitation, anagram analysis. Each logic operation unit 14 generates word search parameters in multiple combinations from the at least one search condition of the search request. When the search condition is an English word, its letters can be rearranged into different English words; when the search condition is an English phrase, its words can be rearranged into different phrases. For example, suppose the search conditions include the arbitrarily entered term "Tired nerves" and the attribute content "Embodiments". After anagram analysis, the search conditions may also include the term "Tense driver" with the same corresponding attribute content "Embodiments". In this way related meanings can be found, so that the search conditions broadly cover associated terms. Once the word search parameters are generated, "Tired nerves" and "Tense driver" are compared against the storage locations for the "Embodiments" attribute content, and the search result is returned to the search interface 3. This gives the search greater variety and breadth.
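The "Tired nerves" / "Tense driver" pair above is a true letter-level anagram, which a short sketch can confirm. The helper names below are hypothetical. Note that this code only checks whether two phrases are anagrams and lists word-order rearrangements; producing meaningful anagrams such as "Tense driver" from scratch would additionally require a dictionary, which is outside this sketch.

```python
from itertools import permutations


def letters_key(text):
    """Sorted lowercase letters of text, ignoring spaces and punctuation."""
    return "".join(sorted(ch for ch in text.lower() if ch.isalpha()))


def is_anagram(a, b):
    """True when a and b use exactly the same letters."""
    return letters_key(a) == letters_key(b)


def word_orderings(phrase):
    """All distinct rearrangements of the words in a phrase, sorted."""
    words = phrase.split()
    return sorted({" ".join(p) for p in permutations(words)})


print(is_anagram("Tired nerves", "Tense driver"))
print(word_orderings("Tired nerves"))
```

The word-order variant grows factorially with the number of words, so a real system would presumably cap the phrase length or prune candidates.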

Once terms have been linked by relevance in this way, the associated techniques can also be stored in the device through learning, which speeds up subsequent searches.

Overall, the search results in the preferred embodiments above are displayed on the screen of the search interface 3 as a diagram or as a list, and the form of the diagram or list may depend on the search conditions the search interface 3 received. Specifically, the search conditions received by the search interface 3 go through the comparison procedure described above, and the resulting diagram or list is then ordered. The ordering may place higher similarity first and lower similarity last, or may follow preset condition values received by the search interface 3, such as filing date, publication date, or certain weight settings. For example, results with more similar content in the claims may be ranked first, or results in which the preset keywords appear more frequently may be ranked first. These settings can also be coupled with one another. The ordering can of course be configured differently according to usage habits and is not limited to the description above.
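Coupling several ordering settings can be sketched as a multi-key sort. The record fields (`id`, `similarity`, `filing_date`), the sample values, and the tie-breaking rule of "more recent filing date first" are assumptions for illustration only.

```python
from datetime import date

# Hypothetical search results with a similarity score and a filing date.
results = [
    {"id": "A", "similarity": 0.72, "filing_date": date(2009, 5, 1)},
    {"id": "B", "similarity": 0.91, "filing_date": date(2008, 3, 12)},
    {"id": "C", "similarity": 0.91, "filing_date": date(2010, 6, 8)},
]


def rank(items):
    """Sort by similarity (high first), then by filing date (recent first)."""
    return sorted(
        items,
        key=lambda r: (-r["similarity"], -r["filing_date"].toordinal()),
    )


ranked_ids = [r["id"] for r in rank(results)]
print(ranked_ids)
```

Other criteria mentioned in the text, such as keyword frequency or publication date, could be added as further elements of the sort key.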

For a further preferred embodiment of the diagram or list form of the search results, refer to Figure 10, a schematic diagram of search results according to a preferred embodiment of the present invention. As the figure shows, the search result returned to the search interface 3 may be presented as a tree diagram of associations. Specifically, when the search conditions entered are a document containing "associated test value" and the attribute "Claims", the comparison procedure is performed with the two search conditions corresponding to each other. The search result then includes the patent documents whose claims contain "associated test value", and also retrieves the associated words, or the subject, verb, and object, along with them. Each associated word can be linked to "associated test value" in a tree structure. In this example, a patent document whose claims contain "associated test value" is retrieved. When this document is expanded, its claims are listed on the first level of the tree, and the related subject or object words, "device" and "test value", are listed on the second level, each shown together with the number of the claim it belongs to.
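The two-level tree described above can be sketched with nested dictionaries. The structure, the function names, and the sample claim data are illustrative assumptions; the patent does not prescribe a data format for the tree.

```python
def build_tree(query, claims):
    """Build a two-level tree: claims first, associated words second.

    `claims` maps a claim number to the associated words found in that claim.
    Each word keeps the number of the claim it belongs to.
    """
    return {
        "query": query,
        "children": [
            {
                "claim": number,
                "children": [{"word": w, "claim": number} for w in words],
            }
            for number, words in sorted(claims.items())
        ],
    }


def render(tree):
    """Render the tree as indented text lines."""
    lines = [tree["query"]]
    for claim in tree["children"]:
        lines.append(f"  claim {claim['claim']}")
        for leaf in claim["children"]:
            lines.append(f"    {leaf['word']} (claim {leaf['claim']})")
    return "\n".join(lines)


tree = build_tree(
    "associated test value",
    {2: ["test value"], 1: ["device", "test value"]},
)
print(render(tree))
```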

As another example, refer to Figure 11, a schematic diagram of search results according to another preferred embodiment of the present invention. As the figure shows, the search result returned to the search interface 3 may also be presented as a tree diagram in which portions of the paragraphs, sentences, and words of the abstract, technical field, prior art, summary of the invention, embodiments, or claims are extracted and arranged as a tree according to a predetermined priority order. As before, the search conditions entered are a document containing "associated test value" and the attribute "Claims", with the two search conditions corresponding to each other. Once the search result is generated, a patent document whose claims contain "associated test value" is retrieved, and expanding its claims forms a tree. On the first level, each claim is identified by default as a device or method claim and annotated as open-ended, semi-open, or closed in its drafting style, which marks it as an independent claim. A claim may further be annotated as containing "further comprising", which marks it as a dependent claim. The second level lists the associated words defined as subject or object in each independent or dependent claim, such as "weighted value" and "failure".

Furthermore, the search platform 1 can be updated as the number or content of the patent documents stored in the remote system 2 changes. The data retrieval unit 11 determines whether the remote system 2 has at least one patent document that needs updating. If so, the data retrieval unit 11 retrieves each such patent document from the remote system in sequence. The data parsing unit 12 then receives the updated patent documents, analyzes and deconstructs the plurality of attribute contents within them, and transmits the resulting identical and distinct attribute contents to the plurality of storage locations 13.

The processing or search program for any of the foregoing processing or search flows may be stored on a readable recording medium; once a device capable of reading the program loads it, the device can carry out these steps. The device may be a computer, a device capable of multi-threaded execution, or a device that executes a single thread. The readable recording medium may be a network server, from which the search program is installed on the device by online download, or an optical disc, but is not limited to these.

However, the foregoing is merely an embodiment of this disclosure and shall not limit the scope in which the invention may be practiced. All equivalent changes and modifications made in accordance with the claims of the present invention remain within the scope covered by this patent.

1 ... Search platform

11 ... Data acquisition unit

12 ... Data analysis unit

13 ... Storage location

14 ... Logic operation unit

2 ... Remote system

3 ... Search interface

A ... Abstract

B ... Technical field of the invention

C ... Prior art

D ... Summary of the invention

E ... Embodiments

F ... Claims

F1 ... Claims paragraph

F2 ... Claims paragraph

F11 ... Claims sentence

F12 ... Claims sentence

F111 ... Claims word

F112 ... Claims word

13A ... Abstract storage location

13B ... Technical field storage location

13C ... Prior art storage location

13D ... Summary of the invention storage location

13E ... Embodiments storage location

13F ... Claims storage location

13A1 ... Abstract paragraph storage location

13B1 ... Technical field paragraph storage location

13C1 ... Prior art paragraph storage location

13D1 ... Summary of the invention paragraph storage location

13E1 ... Embodiments paragraph storage location

13F1 ... Claims paragraph storage location

13A11 ... Abstract sentence storage location

13B11 ... Technical field sentence storage location

13C11 ... Prior art sentence storage location

13D11 ... Summary of the invention sentence storage location

13E11 ... Embodiments sentence storage location

13F11 ... Claims sentence storage location

13A111 ... Abstract word storage location

13B111 ... Technical field word storage location

13C111 ... Prior art word storage location

13D111 ... Summary of the invention word storage location

13E111 ... Embodiments word storage location

13F111 ... Claims word storage location

Figure 1 is a processing flowchart of a preferred embodiment of the present invention.

Figure 2 is a schematic diagram of the platform architecture of a preferred embodiment of the present invention.

Figure 3 is a block diagram of the search platform of a preferred embodiment of the present invention.

Figure 4 is a schematic diagram of the attribute contents of a patent specification according to the present invention.

Figure 5 is a schematic diagram of the hierarchical architecture of the plural storage locations of a preferred embodiment of the present invention.

Figure 6 is a search processing flowchart of a preferred embodiment of the present invention.

Figure 7 is a search processing flowchart of another preferred embodiment of the present invention.

Figure 8 is a search processing flowchart of yet another preferred embodiment of the present invention.

Figure 9 is a search processing flowchart of yet another preferred embodiment of the present invention.

Figure 10 is a schematic diagram of a search result of a preferred embodiment of the present invention.

Figure 11 is a schematic diagram of a search result of another preferred embodiment of the present invention.

Claims (18)

1. A processing method for a patent document search platform with a cloud architecture, in which a search platform performs a search procedure on a plurality of patent documents, each patent document including a plurality of attribute contents, the method comprising the steps of: retrieving a plurality of patent documents; parsing the plurality of attribute contents of the patent documents to produce a plurality of identical attribute contents and different attribute contents; transmitting the identical and different attribute contents to a plurality of storage locations; storing identical attribute contents in the same storage location and different attribute contents in different storage locations; receiving a search request, wherein, upon receipt of the search request, word-recombination analysis may be used to generate word search parameters in multiple combinations from the at least one search condition, and the plurality of word search parameters in those combinations are compared against the attribute contents of the plurality of storage locations; determining whether the search request is fully or partially identical to the attribute contents in each storage location; and generating a search result and returning the search result to a search interface.

2. The processing method for a patent document search platform with a cloud architecture according to claim 1, wherein the attribute contents in the plurality of storage locations can undergo an update procedure, in which a data acquisition unit determines whether a remote system holds at least one patent document that needs updating and, when an update is needed, retrieves each such patent document from the remote system in sequence; after retrieval, a data analysis unit receives the updated patent documents, analyzes and deconstructs the plurality of attribute contents in them, and transmits the resulting identical and different attribute contents to the plurality of storage locations.

3. The processing method for a patent document search platform with a cloud architecture according to claim 1, wherein the search result returned to the search interface may be presented as a tree structure, the tree structure being formed by extracting partial content from the respective paragraphs, sentences, and words of the abstract, the technical field, the prior art, the summary of the invention, the embodiments, or the claims and arranging that content in a tree according to a predetermined priority order.

4. The processing method for a patent document search platform with a cloud architecture according to claim 1, wherein, when the search request is input to the search interface, the search request includes at least one search condition, and the at least one search condition includes at least one of a keyword group, a weight-setting condition, search content entered by a user, and a corresponding attribute content.

5. A search method for a patent document search platform with a cloud architecture, the search method being a search program stored on a readable recording medium, wherein, after a device capable of reading the search program loads the search program, the following steps are executed: providing a search interface connected to a search platform, wherein the search interface can receive a search request, the search platform has a plurality of storage locations, and the storage locations are preset so that identical attribute contents are stored in the same storage location and different attribute contents are stored in different storage locations; inputting the search request to the search interface, the search request including at least one search condition; transmitting the search request to the search platform; comparing the search request against the attribute contents of the plurality of storage locations in the search platform; retrieving the attribute contents that are fully or partially identical to the search request; and generating a search result and returning the search result to the search interface.

6. The patent document search platform with a cloud architecture according to claim 5, wherein, after the attribute contents that are fully or partially identical to the search request are retrieved, a data mining procedure may be performed, the data mining procedure including word-class/part-of-speech analysis.

7. The patent document search platform with a cloud architecture according to claim 5, wherein, after the attribute contents that are fully or partially identical to the search request are retrieved, a data mining procedure may be performed, the data mining procedure including the Smith-Waterman algorithm.

8. A search method for a patent document search platform with a cloud architecture, the search method being a search program stored on a readable recording medium, wherein, after a device capable of reading the search program loads the search program, the following steps are executed: providing a search interface connected to a search platform, wherein the search interface can receive a search request, the search platform has a plurality of storage locations, and the storage locations are preset so that identical attribute contents are stored in the same storage location and different attribute contents are stored in different storage locations; inputting the search request to the search interface, the search request including at least one search condition; transmitting the search request to the search platform; comparing the search request against the attribute contents in the plurality of storage locations; retrieving the attribute contents that are fully or partially identical to the search request; performing a data mining procedure; and generating a search result and returning the search result to the search interface.

9. The search method for a patent document search platform with a cloud architecture according to claim 8, wherein the data mining procedure performed includes the Smith-Waterman algorithm.

10. The search method for a patent document search platform with a cloud architecture according to claim 8, wherein the data mining procedure performed includes word-class/part-of-speech analysis.

11. The search method for a patent document search platform with a cloud architecture according to claim 5 or 8, wherein the plurality of attribute contents are pre-stored as follows: a data acquisition unit retrieves a plurality of patent documents from a remote system and then transmits the patent documents to a data analysis unit; upon receiving the patent documents, the data analysis unit performs an analysis and deconstruction procedure on the plurality of attribute contents of the patent documents, the procedure transmitting the resulting identical and different attribute contents of the patent documents to a plurality of storage locations; and the plurality of storage locations receive the identical and different attribute contents produced by the data analysis unit, storing identical attribute contents in the same storage location and different attribute contents in different storage locations.

12. The search method for a patent document search platform with a cloud architecture according to claim 5 or 8, wherein the attribute contents in the plurality of storage locations can undergo an update procedure, in which a data acquisition unit determines whether a remote system holds at least one patent document that needs updating and, when an update is needed, retrieves each such patent document from the remote system in sequence; after retrieval, a data analysis unit receives the updated patent documents, analyzes and deconstructs the plurality of attribute contents in them, and transmits the resulting identical and different attribute contents to the plurality of storage locations.

13. The search method for a patent document search platform with a cloud architecture according to claim 5 or 8, wherein the search result returned to the search interface may be presented as a tree structure, the tree structure being formed by extracting partial content from the respective paragraphs, sentences, and words of the abstract, the technical field, the prior art, the summary of the invention, the embodiments, and the claims and arranging that content in a tree according to a predetermined priority order.

14. The search method for a patent document search platform with a cloud architecture according to claim 5 or 8, wherein, when the at least one search condition is received by a logic operation unit, word-recombination analysis may be used to generate word search parameters in multiple combinations from the at least one search condition, and the plurality of word search parameters in those combinations are compared against the attribute contents of the plurality of storage locations.

15. The search method for a patent document search platform with a cloud architecture according to claim 5 or 8, wherein, if the search request includes a plurality of search conditions, a coupling procedure is performed to generate a search parameter, after which the search parameter is compared against the attribute contents in the plurality of storage locations.

16. A patent document search platform with a cloud architecture, the platform being able to receive at least one search request and to perform a search operation on a plurality of patent documents according to the search request, each patent document including a plurality of attribute contents, the platform comprising: a data acquisition unit that can retrieve a plurality of patent documents; a data analysis unit, connected to the data acquisition unit, that can receive the patent documents, analyze and deconstruct the plurality of attribute contents in the patent documents, and transmit the resulting identical and different attribute contents to a plurality of storage locations; the plurality of storage locations, connected to the data analysis unit, which can receive the identical and different attribute contents produced by the data analysis unit, such that identical attribute contents are stored in the same storage location and different attribute contents are stored in different storage locations; and at least one logic operation unit, correspondingly linked to the plurality of storage locations, that can receive the search request and operate on it together with the attribute contents of each storage location to generate a search result.

17. The patent document search platform with a cloud architecture according to claim 16, wherein the data acquisition unit can further actively detect whether a remote system holds at least one patent document that needs updating, and retrieve each such patent document in sequence.

18. The patent document search platform with a cloud architecture according to claim 16, wherein the plurality of storage locations may be defined as different server devices or as different storage spaces of a single server device.
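
For readers unfamiliar with the Smith-Waterman algorithm named in the claims as one possible data mining procedure, the following is a minimal Python sketch of its scoring step. It is offered only as an illustration: the function name and the scoring parameters (`match=2`, `mismatch=-1`, `gap=-1`, linear gap penalty) are assumptions chosen here, and the claims do not state how the algorithm is configured or applied to the retrieved attribute contents.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    a and b may be strings (compared character by character) or lists of
    words (compared word by word). Scoring values are illustrative defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score of a local alignment ending at
    # a[i-1] and b[j-1]; the first row and column stay at zero.
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                        # start a new local alignment
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            best = max(best, score[i][j])
    return best
```

Because the score is local, a short query phrase that appears intact inside a longer claim sentence scores as highly as it would against itself, which is one reason such an algorithm suits partial-match ranking.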
TW99118401A 2010-06-07 2010-06-07 A patent document search system, processing method, and search method with cloud structure TWI427494B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW99118401A TWI427494B (en) 2010-06-07 2010-06-07 A patent document search system, processing method, and search method with cloud structure

Publications (2)

Publication Number Publication Date
TW201145050A TW201145050A (en) 2011-12-16
TWI427494B true TWI427494B (en) 2014-02-21

Family

ID=46765807

Family Applications (1)

Application Number Title Priority Date Filing Date
TW99118401A TWI427494B (en) 2010-06-07 2010-06-07 A patent document search system, processing method, and search method with cloud structure

Country Status (1)

Country Link
TW (1) TWI427494B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5687312B2 (en) 2013-06-21 2015-03-18 株式会社Ubic Digital information analysis system, digital information analysis method, and digital information analysis program

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1291340C (en) * 2002-11-15 2006-12-20 鸿富锦精密工业(深圳)有限公司 Patent quotation information real time updating and demonstrating system and method
TW200834436A (en) * 2007-02-01 2008-08-16 Univ Nat Taiwan Science Tech Intellectual property service systems
CN101276351A (en) * 2007-03-30 2008-10-01 上海汉光知识产权数据科技有限公司 Patent documentation retrieval method
TW200939045A (en) * 2008-03-06 2009-09-16 Fu-Ren Lin A method for comparing documents

Similar Documents

Publication Publication Date Title
US20210049198A1 (en) Methods and Systems for Identifying a Level of Similarity Between a Filtering Criterion and a Data Item within a Set of Streamed Documents
US10394851B2 (en) Methods and systems for mapping data items to sparse distributed representations
AU2017345199B2 (en) Methods and systems for identifying a level of similarity between a plurality of data representations
JP6095621B2 (en) Mechanism, method, computer program, and apparatus for identifying and displaying relationships between answer candidates
Alzahrani et al. Understanding plagiarism linguistic patterns, textual features, and detection methods
RU2501078C2 (en) Ranking search results using edit distance and document information
JP5699789B2 (en) Information processing apparatus, information processing method, program, and information processing system
US9483460B2 (en) Automated formation of specialized dictionaries
US9785671B2 (en) Template-driven structured query generation
KR101060594B1 (en) Keyword Extraction and Association Network Configuration for Document Data
KR20200094627A (en) Method, apparatus, device and medium for determining text relevance
US20180004838A1 (en) System and method for language sensitive contextual searching
JP2017509049A (en) Coherent question answers in search results
JP2021507350A (en) Reinforcement evidence retrieval of complex answers
US20130232147A1 (en) Generating a taxonomy from unstructured information
JP2006073012A (en) System and method of managing information by answering question defined beforehand of number decided beforehand
US10198497B2 (en) Search term clustering
US20120317125A1 (en) Method and apparatus for identifier retrieval
TWI427494B (en) A patent document search system, processing method, and search method with cloud structure
Bouarara et al. BHA2: bio-inspired algorithm and automatic summarisation for detecting different types of plagiarism
KR100659370B1 (en) Method for constructing a document database and method for searching information by matching thesaurus
Santos et al. Mimicking web search engines for expert search
JP2009146013A (en) Content retrieval method, its device, and program
JP2008269106A (en) Schema extraction method, information processor, computer program, and recording medium
Gottron External plagiarism detection based on standard IR technology and fast recognition of common subsequences

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees