TWI502382B - A patent search method applied with principal component analysis - Google Patents
- Publication number
- TWI502382B (application TW102127235A)
- Authority
- TW
- Taiwan
- Prior art keywords
- search
- matrix
- principal component
- citation
- patents
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Description
The present invention relates to a patent search method and, in particular, to a patent search method that applies principal component analysis.
Patents contain a wealth of information, and their importance to enterprises has long been recognized worldwide. According to an analysis reported by the World Intellectual Property Organization (WIPO), among the many sources that document technological development, such as professional journals, magazines, and encyclopedias, patent specifications are the only ones that fully disclose the core of a technology. About 90% to 95% of the world's research and development results can therefore be found in patent specifications, and 80% of those results are not recorded in other journals or magazines, so drawing on patent information is of great help to corporate R&D and to forecasting technology trends. A further WIPO survey indicates that making good use of patent information can shorten R&D time by 60% and save 40% of R&D expenses.
With the technology industry booming and competitive pressure intense, "innovation" has become the root of global competitiveness, and major industries at home and abroad invest large amounts of time and human resources in R&D and innovation. Before undertaking R&D, however, enterprises often use patent searches to understand the patent landscape of the technical field in which the R&D target lies. This helps them avoid infringing others' patents with their own R&D results, and the search results can also serve as prior-art references for the enterprise's future patent applications.
During a patent search, however, the search conditions usually have to be adjusted repeatedly, by combining synonyms and broader or narrower terms in different ways, or by narrowing the results with conditions that exclude specific words. In addition, many technical terms in common use today differ from those used in the past, and if a drafter deliberately describes technical features with uncommon, rarely used wording, this evasive practice makes searching considerably harder. Yet if the search scope is broadened to overcome these difficulties, the precision of the search results declines.
The present invention therefore aims to address the above problems effectively, broadening the scope of a patent search as far as possible without sacrificing search precision, so as to improve the reliability of patent searches.
The present invention seeks to provide a patent search method applying principal component analysis, so that the breadth of a patent search is increased without affecting its accuracy.
The present invention provides a patent search method applying principal component analysis, comprising the following steps: (S1) searching a patent database according to a search condition to obtain N retrieved patents; (S2) performing a patent citation search on the N retrieved patents according to a predetermined citation network rule to obtain M cited patents; (S3) performing a patent similarity calculation on the M cited patents and the N retrieved patents to generate a patent similarity matrix; (S4) performing a principal component analysis on the patent similarity matrix to select Q similar patents, and merging the Q similar patents with the N retrieved patents into (Q+N) search results; and (S5) substituting the (Q+N) search results for the N retrieved patents of step (S2) and repeating steps (S2) to (S4) until Q equals zero, where N and M are natural numbers greater than one and Q is an integer greater than or equal to zero.
The patent similarity calculation of step (S3) comprises the following sub-steps: (S31) building an M×N citation association matrix; (S32) generating an M×1 relationship strength matrix from the M×N citation association matrix, the M×1 relationship strength matrix containing a relationship strength value for each cited patent; (S33) normalizing the M×1 relationship strength matrix to produce a normalized relationship strength value for each cited patent; (S34) selecting, from the M cited patents, the P cited patents whose normalized relationship strength value is greater than one, where P is a natural number greater than one; and (S35) determining whether P is greater than N and, if so, building a P×N patent similarity matrix, or otherwise building an M×N patent similarity matrix. The patent similarity matrix is built from the relationship between the predetermined patent classification code of each cited patent and those of the N retrieved patents. The predetermined patent classification code may be an International Patent Classification code or a United States Patent Classification code, and it may be a primary classification code, a secondary classification code, or a combination of the two.
The principal component analysis of step (S4) comprises the following sub-steps: (S41) computing a covariance matrix from the patent similarity matrix; (S42) performing an eigendecomposition of the covariance matrix to produce an eigenvalue and an eigenvector; (S43) computing a principal component coefficient and the percentage of variance explained by the principal component; and (S44) selecting the Q similar patents according to the principal component coefficient and the percentage of variance explained.
In addition, the predetermined citation network rule used by the present invention may include a cited-by network (forward citation) and a citing network (backward citation), and may cover a one-layer, two-layer, or three-layer patent citation network.
Compared with the prior art, the present invention provides a patent search method applying principal component analysis that combines principal component analysis with the patent citation network for patent searching, thereby effectively broadening the scope of a patent search without sacrificing its precision and improving the reliability of the search.
S1~S5‧‧‧Steps
S31~S35‧‧‧Steps
S351~S352‧‧‧Steps
S41~S44‧‧‧Steps
10‧‧‧Citation association matrix
12‧‧‧Relationship strength matrix
14‧‧‧Normalized relationship strength matrix
FIG. 1 is a flow chart of the patent search method of the present invention.
FIG. 2 is a flow chart of the patent similarity calculation of the present invention.
FIG. 3 is a schematic diagram of the citation association matrix and the normalized relationship strength matrix of the present invention.
FIG. 4 is a flow chart of the principal component analysis of the present invention.
FIG. 5 is a schematic diagram of the first principal component coefficients of a specific embodiment of the present invention.
The patent search method applying principal component analysis proposed by the present invention is described in detail below. Please refer to FIG. 1, which is a flow chart of the patent search method of the present invention. The method comprises the following steps: (S1) searching a patent database according to a search condition to obtain N retrieved patents, where the patent database may be the patent database of the United States Patent and Trademark Office or a patent database of another country, and the search condition may be a set of keywords, a set of patent numbers, or the like; (S2) performing a patent citation search on the N retrieved patents according to a predetermined citation network rule to obtain M cited patents, where the predetermined citation network rule may search the forward citations and backward citations of each retrieved patent, and may cover a one-layer, two-layer, or three-layer patent citation network; (S3) performing a patent similarity calculation on the M cited patents and the N retrieved patents to generate a patent similarity matrix; (S4) performing a principal component analysis on the patent similarity matrix to select Q similar patents, and merging the Q similar patents with the N retrieved patents into (Q+N) search results; and (S5) substituting the (Q+N) search results for the N retrieved patents of step (S2) and repeating steps (S2) to (S4) until Q equals zero, where N and M are natural numbers greater than one and Q is an integer greater than or equal to zero.
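The control flow of steps (S1) to (S5) can be sketched as a simple fixed-point loop. This is only an illustrative sketch: the function name `iterative_patent_search`, the two callbacks `find_citations` and `select_similar`, and the `max_rounds` safety cap are hypothetical names introduced here and are not part of the described method.

```python
from typing import Callable, Set


def iterative_patent_search(
    seed: Set[str],
    find_citations: Callable[[Set[str]], Set[str]],
    select_similar: Callable[[Set[str], Set[str]], Set[str]],
    max_rounds: int = 20,
) -> Set[str]:
    """Repeat steps (S2)-(S4) until no new similar patent is selected (Q == 0).

    seed           -- the N retrieved patents from step (S1)
    find_citations -- hypothetical callback for step (S2)
    select_similar -- hypothetical callback for steps (S3) and (S4)
    max_rounds     -- safety cap added for this sketch only
    """
    results = set(seed)
    for _ in range(max_rounds):
        # (S2) cited patents that are not already in the result set
        cited = find_citations(results) - results
        # (S3)+(S4) the Q similar patents chosen among the cited patents
        similar = select_similar(cited, results) - results
        if not similar:  # Q == 0, so the loop of step (S5) ends
            break
        # (S4)/(S5) merge into (Q+N) results and feed them back into (S2)
        results |= similar
    return results
```

Passing the two steps in as callbacks keeps the loop independent of how citations are looked up and how similarity is scored.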
Please refer to FIG. 2 and FIG. 4. FIG. 2 is a flow chart of the patent similarity calculation of the present invention, and FIG. 4 is a schematic diagram of the citation association matrix and the normalized relationship strength matrix of the present invention. The patent similarity calculation of step (S3) comprises the following sub-steps: (S31) building an M×N citation association matrix 10 (as shown in FIG. 4), in which A1 to An denote the N retrieved patents and B1 to Bm denote the M cited patents. If retrieved patent A1 and cited patent B3 have a citation relationship (whether forward or backward), the corresponding citation value (Cij) in the citation association matrix 10 is 1; if retrieved patent A1 and cited patent B2 have no citation relationship, the corresponding citation value (Cij) is 0. The citation association matrix 10 is therefore a matrix of 0s and 1s. (S32) generating an M×1 relationship strength matrix 12 from the M×N citation association matrix 10, the relationship strength matrix 12 containing a relationship strength value for each cited patent. The relationship strength (RS) of any cited patent among B1 to Bm is defined as the sum of all of its citation values, $RS_j = \sum_{i=1}^{n} C_{ij}$ ($i = 1 \sim n$; $j = 1 \sim m$), as shown in FIG. 4. In other words, the more citation relationships a cited patent has with the N retrieved patents, the larger the sum of its citation values (Cij), and the higher its relationship strength value (RSj) with respect to the N retrieved patents. (S33) normalizing the M×1 relationship strength matrix 12 to produce a normalized relationship strength value for each cited patent and form an M×1 normalized relationship strength matrix 14, where the normalized relationship strength value (RSnol_j) is defined as the relationship strength (RSj) divided by the average relationship strength (RSave). (S34) selecting, from the M cited patents, the P cited patents whose normalized relationship strength value is greater than one, where P is a natural number greater than one. (S35) determining whether P is greater than N: (S351) if so, building a P×N patent similarity matrix; (S352) if not, building an M×N patent similarity matrix.

In addition, the present invention uses the International Patent Classification (IPC) as the basis for calculating similarity. The IPC has five hierarchical levels: section, class, subclass, group, and subgroup. Using these levels as the scoring basis, the patent similarity is defined as a value between 0 and 1, and the levels are compared in order, one level at a time. The patent similarity matrix is built from the relationship between the predetermined patent classification code of each cited patent and those of the N retrieved patents. The predetermined patent classification code may be an International Patent Classification code or a United States Patent Classification code, and it may be a primary classification code, a secondary classification code, or a combination of the two.
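Sub-steps (S32) to (S35) above can be illustrated with a small NumPy sketch. The function names and the toy matrix are hypothetical; the sketch assumes the rows of the citation association matrix are the cited patents B1..Bm and the columns are the retrieved patents A1..An, and that every cited patent has at least one citation relationship, so the average is non-zero.

```python
import numpy as np


def relationship_strength_filter(C):
    """C: M x N citation association matrix of 0/1 citation values."""
    C = np.asarray(C)
    rs = C.sum(axis=1)                      # (S32) RS_j: sum of row j
    rs_ave = rs.mean()                      # average relationship strength
    rs_norm = rs / rs_ave                   # (S33) normalized RS
    keep = np.flatnonzero(rs_norm > 1.0)    # (S34) rows of the P kept patents
    return rs, rs_norm, keep


def rows_for_similarity_matrix(keep, m, n):
    """(S35) use the P filtered rows if P > N, otherwise all M rows."""
    return keep if len(keep) > n else np.arange(m)


# Toy example: 4 cited patents (rows) against 3 retrieved patents (columns).
C_toy = np.array([[1, 0, 0],
                  [1, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]])
rs, rs_norm, keep = relationship_strength_filter(C_toy)
rows = rows_for_similarity_matrix(keep, *C_toy.shape)
```

In the toy matrix the relationship strengths are 1, 2, 3, and 1, the average is 1.75, and only the second and third cited patents have a normalized value above one. Because P = 2 is not greater than N = 3, all four rows would be used for the similarity matrix.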
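Please refer to FIG. 3, which is a flow chart of the principal component analysis of the present invention. The principal component analysis of step (S4) comprises the following sub-steps: (S41) computing a covariance matrix from the patent similarity matrix. For two random variables $x_{1i}$ and $x_{2i}$ with $i = 1 \dots N$, the covariance between the two variables is defined in the usual way, and extending covariance to a higher-dimensional space yields the covariance matrix of the vectors $x_i = [x_{1i} \dots x_{Mi}]^T$;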
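For reference, the covariance and the covariance matrix mentioned in sub-step (S41) are commonly written as below. This is a sketch of the standard sample definitions, not a formula taken from the text: the $1/(N-1)$ normalization and the symbols $\bar{x}_k$, $\bar{x}$, and $\Sigma$ are assumptions, since the text does not state which normalization it uses.

```latex
\operatorname{cov}(x_1, x_2)
  = \frac{1}{N-1} \sum_{i=1}^{N} (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2),
\qquad
\bar{x}_k = \frac{1}{N} \sum_{i=1}^{N} x_{ki}

\Sigma
  = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^{T},
\qquad
x_i = [x_{1i} \dots x_{Mi}]^{T},
\quad
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i
```

The entry in row $a$ and column $b$ of $\Sigma$ is then $\operatorname{cov}(x_a, x_b)$, so the two-variable definition is the special case $M = 2$.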
A specific embodiment of the present invention is described below. The proposed patent search method applying principal component analysis is applied here to a search whose target is "Through-Silicon Via" (TSV), using a database of granted United States patents.
In step (S1), an advanced search was performed on 2013/4/16 in the database of granted United States patents with the search condition ABST/(("trough-silicon via$"or TSV or TSVs)and("integrated circuit$")), which returned N retrieved patents, where N equals 44.
In step (S2), a patent citation search is performed on the 44 retrieved patents according to a predetermined citation network rule. The rule uses forward citations and backward citations with a one-layer citation network, and the citation search on the 44 retrieved patents yields M cited patents, where M equals 406.
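A citation search with forward and backward citations over a configurable number of layers might look like the sketch below. The function `citation_search` and the `backward` dictionary (mapping each patent to the patents it cites) are hypothetical; a real search would query a patent database rather than an in-memory dictionary.

```python
from typing import Dict, Iterable, Set


def citation_search(
    seed: Iterable[str],
    backward: Dict[str, Set[str]],
    layers: int = 1,
) -> Set[str]:
    """Collect patents linked to the seed by forward or backward citation.

    backward[p] is the set of patents that p cites (backward citations).
    Forward citations (patents citing p) are derived by inverting that map.
    """
    forward: Dict[str, Set[str]] = {}
    for citing, cited_set in backward.items():
        for cited in cited_set:
            forward.setdefault(cited, set()).add(citing)

    seed_set = set(seed)
    found: Set[str] = set()
    frontier = set(seed_set)
    for _ in range(layers):
        nxt: Set[str] = set()
        for p in frontier:
            nxt |= backward.get(p, set()) | forward.get(p, set())
        nxt -= seed_set | found     # keep only patents not seen before
        found |= nxt
        frontier = nxt              # the next layer expands from the new patents
    return found
```

With `layers=1` this corresponds to the one-layer rule used in this embodiment; larger values expand outward from the patents found in the previous layer.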
In step (S3), a patent similarity calculation is performed on the 406 cited patents and the 44 retrieved patents to generate a patent similarity matrix. The patent similarity calculation comprises the following sub-steps: (S31) building a 406×44 citation association matrix; (S32) generating a 406×1 relationship strength matrix from the 406×44 citation association matrix, the 406×1 relationship strength matrix containing a relationship strength value for each cited patent; that is, the relationship strength (RSj) of each of the 406 cited patents is calculated from the citation values (Cij) in the citation association matrix, and averaging the relationship strengths (RSj) of the 406 cited patents gives an average relationship strength (RSave) of 1.8705; (S33) normalizing the 406×1 relationship strength matrix to produce a normalized relationship strength value for each cited patent; (S34) selecting, from the 406 cited patents, the P cited patents whose normalized relationship strength value is greater than one, where P equals 115; (S35) determining whether P is greater than N; (S351) since it is, a P×N patent similarity matrix, that is, a 115×44 patent similarity matrix, is built. In practice, this embodiment uses the International Patent Classification (IPC) as the similarity scoring basis, defines the patent similarity as a value between 0 and 1, and compares the levels in order, one level at a time, with each level adding 0.2 to the similarity value.
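The level-by-level IPC scoring described above, with 0.2 per level over five levels, can be sketched as follows. This is a minimal illustration under stated assumptions: the regular expression, the function names, and the rule of stopping at the first mismatching level are choices made for this sketch, and it compares a single IPC code per patent, whereas the text does not say how patents with several codes are handled.

```python
import re

# Section (A-H), class (2 digits), subclass (letter), main group, subgroup.
_IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_levels(code: str):
    """Split an IPC symbol such as 'H01L 21/768' into its five levels."""
    match = _IPC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized IPC code: {code!r}")
    return match.groups()


def ipc_similarity(a: str, b: str) -> float:
    """Count matching levels from the section downward; each adds 0.2."""
    matched = 0
    for level_a, level_b in zip(ipc_levels(a), ipc_levels(b)):
        if level_a != level_b:
            break                   # a lower level cannot match once a higher one differs
        matched += 1
    return matched / 5              # 0.0, 0.2, ..., 1.0
```

For example, two codes that share section, class, and subclass but differ in main group would score three matching levels, and identical codes would score all five.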
Please refer to FIG. 5, which is a schematic diagram of the first principal component coefficients of this embodiment. In step (S4), principal component analysis is performed on the 115×44 patent similarity matrix. The principal component analysis comprises the following sub-steps: (S41) computing a covariance matrix from the 115×44 patent similarity matrix; (S42) performing an eigendecomposition of the covariance matrix to produce an eigenvalue and an eigenvector; (S43) computing a principal component coefficient and the percentage of variance explained by the principal component. In this embodiment the first principal component is $Y_1 = 4.149506X_1 + 3.432632X_2 + \dots + 3.720075X_{115}$, where the coefficient of each X term is the weight of the corresponding cited patent; the larger the weight, the more similar that cited patent is to the 44 retrieved patents. The coefficients of the first principal component are therefore plotted as a line chart, as shown in FIG. 5. (S44) selecting Q similar patents according to the principal component coefficient and the percentage of variance explained. Nine cited patents correspond to the first coefficient peak of 4.590093, one to the second coefficient peak of 3.807923, three to the third coefficient peak of 3.798306, and eight to the fourth coefficient peak of 3.641812. Analysis showed that the eight patents at the fourth coefficient peak and one of the patents at the third coefficient peak are not related to the search target of this embodiment, so this embodiment selects the cited patents corresponding to the first and second coefficient peaks, 12 similar patents in total, that is, Q equals 12. The 12 similar patents are then merged with the 44 retrieved patents into 56 search results. (S5) substituting the 56 search results for the 44 retrieved patents of step (S2) and repeating steps (S2) to (S4) until the number of similar patents equals zero.
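Sub-steps (S41) to (S43) can be sketched with NumPy as below. The sketch assumes that each row of the similarity matrix (one cited patent) is treated as a variable and each column (one retrieved patent) as an observation, which matches a first principal component with one coefficient per cited patent. The function name and toy data are hypothetical, and the coefficients here are unit-length eigenvector entries, so their scale may differ from the values reported in this embodiment, whose scaling is not stated.

```python
import numpy as np


def first_principal_component(S):
    """S: P x N similarity matrix (rows: cited patents, columns: retrieved patents)."""
    S = np.asarray(S, dtype=float)
    cov = np.cov(S)                           # (S41) P x P covariance, rows as variables
    eigvals, eigvecs = np.linalg.eigh(cov)    # (S42) eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    coeffs = eigvecs[:, 0]                    # first principal component coefficients
    if coeffs.sum() < 0:                      # eigenvector sign is arbitrary; fix it
        coeffs = -coeffs
    explained = 100.0 * eigvals / eigvals.sum()   # (S43) explained variance in percent
    return coeffs, explained


# Toy example: 3 cited patents scored against 4 retrieved patents.
S_toy = np.array([[0.2, 0.4, 0.6, 0.8],
                  [0.2, 0.4, 0.6, 0.8],
                  [0.6, 0.6, 0.6, 0.6]])
coeffs, explained = first_principal_component(S_toy)
ranking = np.argsort(-coeffs)   # cited patents ordered by descending coefficient
```

The ranking only orders candidates by coefficient. The final choice in sub-step (S44) of this embodiment also relies on reviewing the patents at each coefficient peak against the search target, which this sketch does not automate.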
Compared with the prior art, the present invention provides a patent search method applying principal component analysis that combines principal component analysis with the patent citation network for patent searching, thereby effectively broadening the scope of a patent search without sacrificing its precision and improving the reliability of the search.
The foregoing detailed description of the preferred embodiments is intended to describe the features and spirit of the present invention more clearly, not to limit the scope of the invention to the preferred embodiments disclosed above. On the contrary, the intent is to cover various changes and equivalent arrangements within the scope of the claims of the present invention. The scope of the claims should therefore be given the broadest interpretation in light of the above description, so that it encompasses all possible changes and equivalent arrangements.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW102127235A TWI502382B (en) | 2013-07-30 | 2013-07-30 | A patent search method applied with principal component analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201504827A (en) | 2015-02-01 |
TWI502382B (en) | 2015-10-01 |
Family
ID=53018900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW102127235A TWI502382B (en) | A patent search method applied with principal component analysis | 2013-07-30 | 2013-07-30 |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI502382B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW200617766A (en) * | 2004-11-29 | 2006-06-01 | Univ Nat Yunlin Sci & Tech | Patent classification system and method |
TW200945070A (en) * | 2008-04-22 | 2009-11-01 | jian-xun Chen | Classification method of patent technology correlation |
TW201123064A (en) * | 2009-12-30 | 2011-07-01 | Univ Nat Taiwan Science Tech | Method for patent valuation and computer-readable storage medium |
US20120317040A1 (en) * | 2011-06-08 | 2012-12-13 | Entrepreneurial Innovation, LLC. | Patent Value Prediction |
- 2013-07-30: TW application TW102127235A, patent TWI502382B (en), status: not active, IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
TW201504827A (en) | 2015-02-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MM4A | Annulment or lapse of patent due to non-payment of fees |