TW200529063A

TW200529063A - Automatic query clustering

Info

Publication number: TW200529063A
Application number: TW094102362A
Authority: TW
Inventors: Andrzej Turski; li-li Cheng; Matthew Maclaurin; Richard F Rashid
Original assignee: Microsoft Corp
Priority date: 2004-01-26
Filing date: 2005-01-26
Publication date: 2005-09-01
Also published as: JP2005235196A; HK1080164A1; US20080021896A1; KR101029403B1; EP1557774A3; RU2368948C2; KR20050077036A; JP4101239B2; BRPI0500784A; ZA200500736B; MXPA05001072A; RU2005101735A; US20050165825A1; AU2005200286A1; CN1648903A; US7257571B2; MY145961A; EP1557774A2; CA2494410A1

Abstract

The present invention relates to a system and methodology for automatic clusterization and display of data items in a local or remote database. Such clusterization can be based on properties associated with the data items, such as type, location, people, date, time, user-defined properties, and so forth, wherein an initial property may be employed to form a first level of clusterization and a subsequent property may be automatically determined to form an optimized clusterization from which to find and retrieve desired information. A computerized interface for organizing and retrieving data is provided. The interface includes a property analyzer to determine an item distribution for at least two cluster properties and an organizer that forms new clusters based in part on the item distribution.

Description

[Technical Field] The present invention relates generally to computer systems, and more particularly to a system and method that automatically arrange information items into smaller subsets of items by analyzing the distribution of items across clusters of several different properties.

[Prior Art]

An important feature of a database-based operating system is the ability to quickly find desired items by executing a query that can include several item properties. This should be compared with earlier systems, which, for example, require the user to know where a file is located in a hierarchical file system in order to find the desired information. While the query approach is powerful, the success of newer systems generally depends on building a user interface in which an average user can query easily and intuitively. In their raw form, database queries (e.g., expressed in SQL) are hard to work with even for professional programmers and are often intimidating for end users.
One approach to the query problem is to expose a user interface (UI) that provides direct access to certain predefined queries. For example, a predefined query may find all picture files on a disk (e.g., a picture library) or all unread mail. In addition, the system may suggest grouping the search results in a particular way; pictures, for example, can be placed automatically into groups according to the date they were taken. Predefined queries of this kind are very practical in many common scenarios, but they do not unlock the full power of the database and are not sufficiently general. In the picture example, all pictures may have been taken on the same day (or the camera may not have had its date set), in which case grouping by date is of no help. The situation is worse for other, extended properties (format-defined, administrator-defined, or user-defined): because operating system developers do not know these properties, it is nearly impossible to compose predefined queries for them.

Another approach is to let users query the database with text that resembles a natural language. From the database point of view such queries are sufficiently general, and users find them easy to understand. If the format of a natural language query is left completely unconstrained, however, it is difficult to develop a parser that correctly understands the user's intent in every case. If grammar restrictions are imposed, it becomes harder for the user to compose a syntactically correct query, which can sometimes involve a large number of expressions. In either case, the idea that query text must be typed in does not appeal to most users: children, non-English speakers, and users of keyboardless devices all have trouble entering text. There is therefore a need for a simple, easy-to-use point-and-click query interface for finding and retrieving information.
[Summary] The following presents a simplified summary of the invention in order to provide a basic understanding of some of its aspects. This summary is not an extensive overview of the invention. It is not intended to identify key or critical elements of the invention or to delineate its scope. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description presented later.

The present invention relates to automatically retrieving and displaying desired information as easily managed subsets of information clusters. In a system user interface, browsing a large number of items, for example by displaying them as a list, becomes a problem when the desired information must be found in and retrieved from that list. The invention provides an improved point-and-click interface that assists in browsing a large number of items by grouping the items according to their properties. Items clustered by these properties can be presented in a folder-like manner (or another display type), so that automatic clusterization can subdivide or arrange query results by a different, subsequent property into easily managed subsets of clusters. These subsets can then be used to retrieve the desired information or to perform other clustering operations (e.g., subsequent clusterization). The best property for clustering can be determined by analyzing the distribution of items across the clusters of each property.

One aspect of the invention provides automatic selection of the clusterization property. The problem of determining this property can be stated as follows: given an initial set of items and a set of item properties that can be used for grouping, which property provides the best clustering result for that set of items?
The best clustering result is taken to be a uniform grouping that divides the results into a suitable number of clusters. To find and retrieve information efficiently, it is undesirable to end up with a few clusters that each contain a large number of items, or with a large number of clusters that each contain only a few items. This problem is addressed by assigning a clusterization score to each item property and selecting the property with the highest score. The clusterization score can be computed as the product of the number of items in each cluster. If the number of items is N, the function computing the clusterization score as the product of cluster sizes has its maximum when the items are divided into √N clusters of √N items each. For other distributions, the score is used to measure and compare how far the distribution is from the ideal one. Another example of a score function can be based on, for example, a binomial distribution. For this type of distribution the score value has a statistical interpretation: it gives the number of ways in which N items can be divided into clusters of the given sizes. The clusterization that offers the most value to the user is the one that reduces the largest number of alternative distributions. To compare the different properties available for subsequent clusterization, clusterization scores can be computed for all properties, and this computation requires only a single pass over the list of items.

To the accomplishment of the foregoing and related ends, certain illustrative aspects of the invention are described herein in connection with the following description and the annexed drawings.
These aspects are indicative of various ways in which the invention may be practiced, all of which are intended to be within the scope of the invention. Other advantages and novel features of the invention will become apparent from the following detailed description when considered in conjunction with the drawings.

[Embodiments] The present invention relates to a system and methodology for automatically clusterizing and displaying data items in a local or remote database system. The clusterization can be based on properties associated with the data items, such as type, location, people, date, time, and user-defined properties. An initial property may be employed to form a first level of clusterization, and a subsequent property may be determined automatically to form an optimized clusterization from which desired information can be found and retrieved. In one aspect, a computerized interface for organizing and retrieving data is provided. The interface includes a property analyzer that determines an item distribution for at least two cluster properties, and an organizer that forms new clusters based in part on the item distribution.
As used in this application, the terms "component," "analyzer," "cluster," "system," and the like are intended to refer to a computer-related entity: hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server itself can be a component. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers. These components can also execute from various computer-readable media having various data structures stored thereon. The components may communicate via local and/or remote processes, such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet with other systems via the signal).

Referring initially to FIG. 1, a query clustering system 100 is illustrated in accordance with an aspect of the invention. The system 100 includes a data store 110 that stores a plurality of data items 120 to be displayed at a user interface (not shown). The items 120 can include documents, files, folders, images, audio files, source code, and so forth, which can be presented at the user interface in various viewable states described in more detail below. The items 120 are also associated with various properties, such as item type (e.g., image, document, spreadsheet, binary file), creation date, people associated with the item, location, category, user-defined properties, and so forth.
An aggregator collects the items 120 and their associated properties and presents them to a property analyzer 140, which analyzes the items and properties. For example, the analysis can include automatically determining scores for several possible clustering scenarios or possible groupings of the items. Based on the analysis by the analyzer 140, a cluster organizer 150 presents an optimized grouping of new clusters 160 to the user. The optimized grouping of the clusters 160 facilitates finding and retrieving desired information from the data store 110, which can include local storage media, remote storage media, or a combination of local and remote storage.

In one example of automatic clustering, a default top-level clusterization groups items by item type. User studies found that a first level grouped by item type is useful and easy for users to understand, but also that second-level clusterization by other properties is less obvious and harder to find. One aspect of the invention is therefore to provide automatic selection of the clusterization property. The problem can be stated as follows: given an initial set of items and a set of item properties that can be used for grouping, which property provides the optimal clusterization result?
The optimal clusterization result is taken to be a uniform grouping that divides the results into a suitable number of clusters. This goal can be achieved by assigning a clusterization score to each item property and selecting the property with the highest score. The clusterization score can be computed by multiplying together the number of items in each cluster, as shown in the following equation:

$$\mathrm{score} = n\_items_{\mathrm{cluster}\,1} \times n\_items_{\mathrm{cluster}\,2} \times \cdots$$

If the number of items is N, the function computing the clusterization score as the product of cluster sizes has its maximum when the items are divided into √N clusters of √N items each. For other distributions, the score is used to measure and compare how far the distribution is from an ideal or optimal one. In tests, this score function was found to produce reasonable results. It should be noted, however, that the score function used above is only an example; other functions can be used that give different relative weights to distributions that depart from the ideal one. Another example of a score function can be based on a binomial distribution, as follows:

$$\mathrm{score} = \frac{(N\_total)!}{(n\_items_{1})! \times (n\_items_{2})! \times \cdots}$$

In this example the score value has a statistical interpretation: it gives the number of ways in which N items can be divided into clusters of the given sizes. The clusterization that offers the most value to the user is the one that reduces the largest number of alternative distributions. To compare the different properties available for subsequent clusterization, clusterization scores must be computed for all properties. This requires only a single pass over the list of items, as detailed in the process described with respect to FIG. 2.

FIG. 2 is a flow diagram illustrating an automatic clusterization process in accordance with an aspect of the invention. For simplicity of explanation, the methodology is shown and described as a series of acts.
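As a worked illustration of the two example score functions given above, consider N = 12 items grouped by three hypothetical properties that yield cluster sizes (4, 4, 4), (10, 1, 1), and (12). The sizes are assumptions chosen for the arithmetic, not data from the source:

```latex
\begin{align*}
\text{sizes } (4,4,4):  & \quad 4 \cdot 4 \cdot 4 = 64,  & \frac{12!}{4!\,4!\,4!}  & = 34650 \\
\text{sizes } (10,1,1): & \quad 10 \cdot 1 \cdot 1 = 10, & \frac{12!}{10!\,1!\,1!} & = 132 \\
\text{sizes } (12):     & \quad 12,                      & \frac{12!}{12!}         & = 1
\end{align*}
```

Among these three groupings, the evenly balanced one scores highest under both functions.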
It is to be understood and appreciated that the invention is not limited by the order of acts, as some acts may, in accordance with the invention, occur in different orders and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all illustrated acts may be required to implement a methodology in accordance with the invention.

Assuming that N items and M properties are to be compared, the process 200 can be implemented as follows. At 210, M hash tables are initialized. At 220, the process iterates over the N items. At 230, for each item, it iterates over the M properties. At 240, a hash value is computed for each property value. The hash function is chosen so that it returns the same hash value for any two property values that belong in the same cluster; for example, when clustering by a date property, the hash function can depend only on the date portion and ignore the time portion. At 250, the hash tables are used to track the number of clusters and the number of items in each cluster. At 260, the clusterization score of each property is computed from the information in its associated hash table. At 270, the properties are sorted in a list by the clusterization score each would produce. At 280, if the number of items exceeds a threshold (e.g., more than 10 items), the results can be clustered automatically by the top property in the list, and other clusterizations can be suggested as alternatives.
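The single-pass process of FIG. 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the dictionary-based item records, the `bucket_fns` mapping (which plays the role of the per-property hash function), and the sample mail data are all hypothetical. The threshold of 10 follows the example in the text.

```python
from collections import Counter
from datetime import datetime
from math import factorial, prod


def product_score(sizes):
    """First example score: the product of the cluster sizes."""
    return prod(sizes)


def multinomial_score(sizes):
    """Second example score: N! / (n1! * n2! * ...)."""
    sizes = list(sizes)
    score = factorial(sum(sizes))
    for n in sizes:
        score //= factorial(n)  # the division is exact at every step
    return score


def rank_properties(items, bucket_fns, score=product_score):
    """One pass over the items, with one counting table per property."""
    tables = {name: Counter() for name in bucket_fns}            # act 210
    for item in items:                                            # act 220
        for name, bucket in bucket_fns.items():                   # act 230
            tables[name][bucket(item)] += 1                       # acts 240, 250
    scores = {name: score(t.values()) for name, t in tables.items()}  # act 260
    ranking = sorted(scores, key=scores.get, reverse=True)        # act 270
    return ranking, scores


def auto_cluster(items, bucket_fns, threshold=10):
    """Act 280: pick the top property only when there are enough items."""
    ranking, _ = rank_properties(items, bucket_fns)
    if len(items) > threshold:
        return ranking[0], ranking[1:]
    return None, ranking


# Hypothetical sample: 12 mail items from 3 senders, all received the same day.
items = [
    {"sender": s, "received": datetime(2004, 1, 26, 9 + i, 0)}
    for s in ("alice", "bob", "carol")
    for i in range(4)
]
bucket_fns = {
    "sender": lambda item: item["sender"],
    # Bucket by date only, ignoring the time portion.
    "received": lambda item: item["received"].date(),
}
chosen, alternatives = auto_cluster(items, bucket_fns)
```

With this sample, grouping by sender yields three clusters of four items, while grouping by received date yields a single cluster of twelve, so the sender property ranks first. Passing `score=multinomial_score` to `rank_properties` swaps in the second example score function.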
For example, in test cases with mail messages, when all items of the mail message type were selected, the process described above automatically clustered the results by sender. When the selected item type was Word documents, however, clusters were created by last-modified date, and when the item type was C# source code, items were grouped by containing folder (which corresponds to grouping by project). A common feature of the above approach is that it determines the rule best suited to grouping a given set of items, including user-defined and extension properties.

FIGS. 3 through 10 illustrate several example user interfaces for one or more automatic clustering systems and processes. It is noted that these interfaces can include a display having one or more display objects, including such aspects as configurable icons, buttons, sliders, input boxes, selection options, menus, tabs, and so forth, with configurable dimensions, shapes, colors, text, data, and sounds to facilitate operation of the system 100. The interfaces can also include a plurality of other inputs or controls for adjusting and configuring one or more aspects of the invention, as described in more detail below. This can include receiving user commands from a mouse, keyboard, speech input, web site, remote web service, and/or other device such as a camera or video input to affect or modify operations of the interface or other aspects of the system 100.

Various aspects of the invention are described below with reference to the example interfaces depicted in FIGS. 3 through 10. When designing a folder or other type of structure, a software designer is free to hide unimportant or rarely used items from the top-level view by placing them in a hidden folder.
Similarly, when building a mechanism for viewing properties, various mechanisms can be provided to hide unimportant or impractical properties, even when their clusterization score is high. Promotion and demotion of properties can occur at different levels. At the application level, the application designer can indicate which properties are primary properties exposed in the user interface and which are secondary or auxiliary properties; this is usually defined separately for each item type. The automatic query clusterization described in the preceding section generally applies to the primary properties. In addition, each item type should define a property mapping for properties common to all items. For example, a common "Date" property may map to "Date Taken" for pictures and to "Last Modified Date" for documents. Likewise, a "People" property may be "Author" for documents, "Sender" for mail, and so on.

In general, only the user can decide which properties are best for viewing his or her own data, and an explicit UI can be provided for promoting or demoting any particular property. The invention can also learn implicitly from user actions (e.g., via learning rules). Each property can have an associated weight, which is increased when the user switches from a different property clusterization to that property and decreased in the opposite case. The final score of each property, which decides the property to cluster by, is the product of the property weight and the clusterization score computed according to the formulas above.

As noted above, users generally prefer a hierarchical arrangement of item-type clusters to a flat list. The hierarchy imposes a certain order on the types and simplifies the search for the desired "item type" value. The same should apply to any property that has many different values; specific examples of arranging property values into a hierarchical view are given below.
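The property mapping and the implicit weight learning described above can be sketched as follows. This is a hedged illustration only: the `PROPERTY_MAP` table, the `resolve` helper, the `PropertyWeights` class, the initial weight of 1.0, and the additive step size are all assumptions, since the text does not specify how weights are initialized or by how much they change.

```python
# Hypothetical per-type mapping of common properties, using only the
# pairings named in the text.
PROPERTY_MAP = {
    "picture": {"Date": "Date Taken"},
    "document": {"Date": "Last Modified Date", "People": "Author"},
    "mail": {"People": "Sender"},
}


def resolve(item_type, common_property):
    """Return the type-specific property for a common one, or None."""
    return PROPERTY_MAP.get(item_type, {}).get(common_property)


class PropertyWeights:
    """Tracks one weight per property (initial value and step are assumed)."""

    def __init__(self, step=0.1):
        self.step = step
        self.weights = {}

    def weight(self, prop):
        return self.weights.get(prop, 1.0)

    def user_switched(self, from_prop, to_prop):
        # The property the user switched to gains weight; the one the
        # user left loses weight (never dropping below zero).
        self.weights[to_prop] = self.weight(to_prop) + self.step
        self.weights[from_prop] = max(0.0, self.weight(from_prop) - self.step)

    def final_score(self, prop, clusterization_score):
        # Final score = property weight x clusterization score.
        return self.weight(prop) * clusterization_score
```

A caller would multiply each property's clusterization score by its learned weight before sorting, so that a property the user repeatedly switches to can overtake one with a slightly higher raw score.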
In the case of regular files, the "item type" is defined by the file extension. When the file type is defined by the current viewer application, a user-friendly name can be used as the file type. Different extensions that yield the same type name are usually grouped together (e.g., files with two different extensions may both be called "C/C++ Header File"). In addition, all files of related types can be grouped to introduce one or more levels of hierarchy. In a prototype, the metagroups "Documents," "Pictures," "Audio and Video," "Programs," and "Other Files" were considered and handled; a metagroup for people can likewise be handled as class objects.

For example, a list with "item type = person" can be divided into smaller chunks according to the type of communication channel available for contacting a given person. This yields groups of people who can be reached by postal mail, telephone, instant message, or e-mail. These groups can be subdivided further if necessary. In a corporate environment, for instance, mail addresses can be divided into internal (taken from the company address book) and external (usually taken from the user's personal contact list). Some people may have several means of communication and therefore appear in several clusters. Unlike conventional folders, property clusters do not restrict an item to a single location.

Folders represent user-created groups of items. It is expected that, over time, property-based clustering of items will reduce the need for and importance of folders, but folders are still supported. Folders are generally arranged hierarchically, and folder clusters should mirror this hierarchy. One drawback of the folder hierarchy is that it contains many directories, such as the Program Files or Windows directories, that are of no interest to the user.
When existing folders are used to arrange items into clusters, an obvious improvement is to display only the part of the folder hierarchy that actually contains some of the items in the view. FIG. 3 shows an example interface 300 containing program files (on the C drive). In Windows Explorer, for example, the view contains the complete folder structure. In a prototype, clustering files by "category" includes only the folders relevant to the set of items actually selected (a subset of the complete folder tree). FIG. 4 shows an interface 400 that illustrates clustering by folder.

Another aspect of the folder hierarchy is that it combines the concepts of physical location (one disk or another, or an external shared resource) and logical location (placement within a folder). Because a logical group may span several physical locations, the physical location and the folder property can be separated. Folders with the same name can then be presented together regardless of their physical location; grouping by location can be offered as well. FIG. 5 shows an example interface 500 for a folder (VSS) that exists on two disks (the C drive and the D drive). When VSS is viewed, the interface combines, at 510, the folder contents from both physical locations. This feature rests on the assumption that two or more folders with the same name exist for a purpose. If that is not the case, the files can be separated by a location property, as in the user interface 610 depicted in FIG. 6.

FIG. 7 shows an interface 700 that illustrates clustering by a date property. Clusterization by date and time naturally produces a year/month/day/hour/minute hierarchy, but there is also the concept of relative time, that is, time relative to "now." Both concepts are believed to be important. Date clusters include several predefined queries (dynamic groups).
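The two date concepts, the absolute year/month/day/hour/minute hierarchy and groups relative to "now," can be sketched as cluster-key functions. This is only an illustration: the function names are hypothetical, the "Today" and "Yesterday" labels are assumed examples of relative groups, and the "Other" catch-all label is an assumption, since the text does not enumerate the predefined groups.

```python
from datetime import datetime

LEVELS = ("year", "month", "day", "hour", "minute")


def hierarchy_key(ts, depth):
    """Cluster key made of the first `depth` levels of the date hierarchy."""
    return tuple(getattr(ts, level) for level in LEVELS[:depth])


def relative_group(ts, now):
    """Dynamic group relative to `now` ("Other" is an assumed catch-all)."""
    days = (now.date() - ts.date()).days
    if days == 0:
        return "Today"
    if days == 1:
        return "Yesterday"
    return "Other"


taken = datetime(2004, 1, 26, 14, 30)
day_key = hierarchy_key(taken, 3)      # year, month, day
group = relative_group(taken, datetime(2004, 1, 27, 8, 0))
```

Either function can serve as the per-property bucketing step when counting items per cluster; the relative groups must be recomputed as "now" advances, which is why they behave as dynamic queries rather than fixed folders.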
The predefined queries include items from "Today," "Yesterday," and so on.

Many items can be clustered by the person associated with them, and several attributes can establish this association: for example, "sender" or "recipient" for e-mail or attachments, "author" for documents, and "photographer" for pictures. Clustering by person is problematic because any representation of a hierarchy of people carries interpersonal connotations. People can be grouped, for example, as "internal" or "external," but some of these groups are still too large to manage effectively. For example, the contact list in an e-mail address book may hold about 5,000,000 contact names. The list can be sorted alphabetically (or grouped by first letter, as in a dictionary), but any list beyond a certain length is still hard to handle. One problem is that the names of people important to the user are obscured by the names of people who are less familiar. It can be assumed here that the most important contacts are the people with whom the user has communicated most often or most recently by e-mail, or who are authors or co-authors of documents, and so on. By applying some form of weighted analysis, a list of all people ranked by their relative importance to the user can be constructed.

However, presenting a long list of names ordered by calculated importance is not expected to be a suitable solution. The calculated order may be only approximate and may not reflect personal importance, so finding a name in the middle or at the bottom of the list remains difficult. Importance information should be used to choose which names are displayed first, at the top level, while sorting names alphabetically makes a particular name easier to find and reduces any suggestion of the relative importance of the people involved.

Figure 8 shows an example interface 800 with a semi-expanded (semi-collapsed) list for viewing people of interest.
This can include a hierarchical system of lists of people that is presented to the user as a flat alphabetical list. When the list is first displayed, it contains only the first few (up to about 20) most important names, in alphabetical order. This gives one-click access to the items associated with the most important people of interest. At the same time, the first-level names act like the bookmarks of a dictionary and can be expanded, one at a time, to reveal the second-level names, then the third-level names, and so on. This is somewhat like expanding a hierarchy, except that all expanded names are displayed at the same level as the top-level names. If the display hierarchy followed the importance ranking, it could create the negative impression that some people are better than others; the flat presentation is proposed to remove that connotation. The list can be expanded repeatedly until the lowest-level names of the importance list appear in the view. Because the expansion can be applied to selected regions of the list, however, the total number of names in view can be limited, typically to a few tens of names at a time. At any given time, the names in view are sorted alphabetically and presented as a single list, so the name of interest is easy to find.

Obvious examples of long lists are lists of words (dictionaries) and lists of topics (encyclopedias). The idea of using an existing entry as an index to the content is very common, because it is exactly how printed dictionaries are organized. In standard dictionaries and encyclopedias, however, the index words are those placed at the top and bottom of each page, which may be called a "constant space" selection. The index words have no special meaning; they merely happen to fall at the top or bottom of a page.
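The list behaviour described above (select by importance, display alphabetically, expand one bookmark at a time) can be sketched as follows. This is a minimal sketch under assumptions: the importance scores are taken as given, and the rule that a hidden name is filed under the visible name preceding it alphabetically is one possible reading of the dictionary-bookmark analogy. The function names are hypothetical.

```python
def semi_expanded(importance, top_n):
    """Pick the top_n most important names, then show them alphabetically."""
    top = sorted(importance, key=lambda n: (-importance[n], n))[:top_n]
    return sorted(top)


def expand(importance, visible, bookmark, extra):
    """Reveal up to `extra` hidden names filed under `bookmark`.

    A hidden name belongs to the visible name that precedes it
    alphabetically; names sorting before the first visible name are
    filed under the first one (an assumed convention).
    """
    shown = sorted(visible)
    idx = shown.index(bookmark)
    upper = shown[idx + 1] if idx + 1 < len(shown) else None
    hidden = [
        n for n in importance
        if n not in shown
        and (idx == 0 or n > bookmark)
        and (upper is None or n < upper)
    ]
    # Choose which hidden names to reveal by importance ...
    hidden.sort(key=lambda n: (-importance[n], n))
    # ... but present the result as one flat alphabetical list.
    return sorted(shown + hidden[:extra])
```

The selection order (importance) and the display order (alphabetical) are deliberately kept separate, so the view never shows who ranks above whom.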
In the present invention, the names used for the index are those at the top of the "importance" list. In dictionary terms, the index words are the most commonly used words. In this way, the most commonly used data items can be reached with a single mouse click. It also means that each index entry may have a different number of second-level items. When the number of second-level items is large, a sub-index can be created in turn, and so on.

Figure 9 depicts semi-expanded groups 900, and Figure 10 shows the expanded state 1000 of one of those groups. Figure 10 also depicts the collapsed state of a group at 1010. Another issue that arises when items are divided into groups is how to present the clusters in the view. Common practice is to show a group either as a single item (collapsed) or with all of its items (expanded view). The list on the left side and the items on the right side of the standard Windows file explorer can be regarded as an expanded view of the currently viewed folder and a collapsed view of all other folders. Subfolders of the current folder are usually displayed collapsed, even though a subfolder thumbnail may contain the icons of a few items. Sometimes more than one expanded group can be viewed at once, or "stacked," when the items are shown divided into smaller sets. In file viewers that allow grouping and can show several groups at once, the groups are usually collapsible, and the content of each group can be shown or hidden individually. The groups still exist in only two states, however: collapsed or expanded. The expanded state allows the individual items of a group to be manipulated. With larger groups, expanding one group pushes the others out of view, which makes a multi-group view impractical.
In the present invention, a third state is used that shows the first few items of the group: the "compact" or "semi-expanded" state of the groups at 900. Repeatedly clicking a single button cycles through the expanded state 1000, the collapsed state 1010, and the semi-expanded state 900. The interface 900 is a file viewer that displays two groups in the semi-expanded state and a third group, which is small and therefore fits in the expanded state, at 910.
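The three-state cycle described above can be sketched as a small state machine. This is an illustrative sketch only: the names `GroupState`, `next_state`, and `visible_items`, and the default preview size of three items, are assumptions rather than anything specified in the text.

```python
from enum import Enum


class GroupState(Enum):
    EXPANDED = "expanded"            # all items shown (state 1000)
    COLLAPSED = "collapsed"          # group shown as a single item (state 1010)
    SEMI_EXPANDED = "semi-expanded"  # first few items shown (state 900)


# Order in which a single button cycles through the states.
_CYCLE = [GroupState.EXPANDED, GroupState.COLLAPSED, GroupState.SEMI_EXPANDED]


def next_state(state):
    """Return the state reached by one more click on the toggle button."""
    return _CYCLE[(_CYCLE.index(state) + 1) % len(_CYCLE)]


def visible_items(items, state, preview=3):
    """Return the items a group shows in the given state."""
    if state is GroupState.EXPANDED:
        return list(items)
    if state is GroupState.SEMI_EXPANDED:
        return list(items[:preview])
    return []  # collapsed: only the group header is shown
```

A group small enough to fit within `preview` items looks the same in the semi-expanded and expanded states, which matches the third group at 910.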

One advantage of the compact state is that a group takes up less screen space than an expanded one, yet gives the user more information about the group than the collapsed state does. More groups can therefore be in view at once while details of each group's content are still shown, so the user can assess the groups in a large set of items more quickly, which in turn allows more effective evaluation and control of a large set of grouped items. A second advantage is that the semi-expanded state still gives direct one-click access to the few items in view. If the visible items are chosen by their "importance" to the user (most recently or most frequently accessed), they are the items the user most often looks for. For example, to print a picture that was recently sent to someone, the user can scroll to the "Pictures" group, and the file should be right at the top of the list (that is, among the most recently accessed). Compare this with current viewers, where even if a thumbnail of the picture is shown on the folder icon, the user must still open the folder to reach the file. In short, the compact view lies roughly midway between the collapsed and expanded views, and tries to balance viewing and controlling the whole group against one-click access to individual items.

Because the semi-expanded view offers a convenient way to access selected items in a group (without handling all of them), the user can be given control over which items appear in the semi-expanded view and how many are shown. In one approach, the items are sorted by a predetermined criterion and displayed from the top of the sorted list; the user can change the sort criterion and the number of items displayed. Sorting by last-modified date, for example, is convenient and practical. By default, the semi-expanded view can show the n most recent documents at the top of the list, with a button for showing the next n. An alternative is to provide buttons that show the remaining documents from today, yesterday, last week, last month, and so on. In all of these cases, the order in which items are displayed is the same as the order used to select the visible items. Another approach, however, is to arrange the items in whatever way is most convenient for the user, which need not match the criterion used to select them.
For example, people are best listed alphabetically even when they are selected by "importance." The items of a compact group can be displayed as a semi-expanded list, which can optionally be expanded to show more items (not necessarily all of them). The semi-expanded list view allows the display order to differ from the selection order (if the two orders are the same, a simple list can be used). An alphabetical list of the most popular songs is one example: the user can expand part of the list to reveal less popular songs, but the songs are still brought into view according to their popularity.

When an attribute hierarchy is built, a higher level usually includes the content of all the lower levels. For example, the "Documents" cluster includes the documents of every individual document type. Similarly, the items from a given year include the items from each of its months, and each month in turn includes the items of each of its days. Any container (cluster or folder) can be treated either as an independent item, a single entity that can itself be modified, or simply as a group of items used to arrange the view.

The main function of an item browser is simply to start a search for the requested items. Drilling down through attribute clusters is only one way of doing so, however, and functionality can be improved further by allowing related items to be found across the hierarchy rather than only by searching deeper down the attribute hierarchy. In addition, the browser should allow items to be searched for in whatever way the user wishes. When users look for an item, they typically do so by association with other items. For example, a user may not know the exact date on which a document was last edited but may remember that it happened before an important meeting. The date of that meeting may be easy to find, and the most relevant query at that point is "show all documents from the same date."

An exemplary environment for implementing various aspects of the present invention is described with reference to Figure 11.
The exemplary environment 1110 includes a computer 1112. The computer 1112 includes a processing unit 1114, a system memory 1116, and a system bus 1118. The system bus 1118 couples system components including, but not limited to, the system memory 1116 to the processing unit 1114. The processing unit 1114 can be any of various available processors; dual microprocessors and other multiprocessor architectures can also be used as the processing unit 1114.

The system bus 1118 can be any of several types of bus structure, including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of available bus architectures including, but not limited to, a 16-bit bus, Industry Standard Architecture (ISA), Micro Channel Architecture (MCA), Extended ISA (EISA), Intelligent Drive Electronics (IDE), VESA Local Bus (VLB), Peripheral Component Interconnect (PCI), Universal Serial Bus (USB), Accelerated Graphics Port (AGP), the PCMCIA bus, and Small Computer System Interface (SCSI).

The system memory 1116 includes volatile memory 1120 and non-volatile memory 1122. The basic input/output system (BIOS), which contains the basic routines for transferring information between elements of the computer 1112 (such as during start-up), is stored in the non-volatile memory 1122.
By way of illustration, and not limitation, the non-volatile memory 1122 can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM), or flash memory. The volatile memory 1120 includes random access memory (RAM), which acts as external cache memory. By way of illustration, and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).

The computer 1112 also includes removable/non-removable, volatile/non-volatile computer storage media. Figure 11 shows, for example, disk storage 1124. The disk storage 1124 includes, but is not limited to, devices such as a magnetic disk drive, floppy disk drive, tape drive, Jaz drive, Zip drive, LS-100 drive, flash memory card, or memory stick. In addition, the disk storage 1124 can include storage media separately or in combination with other storage media including, but not limited to, an optical disk drive such as a compact disc ROM device (CD-ROM), a CD recordable drive (CD-R Drive), a CD rewritable drive (CD-RW Drive), or a digital versatile disc ROM drive (DVD-ROM). A removable or non-removable interface, such as interface 1126, is typically used to connect the disk storage 1124 to the system bus 1118.

It should be understood that Figure 11 describes software that acts as an intermediary between users and the basic computer resources described in the suitable operating environment 1110. Such software includes an operating system 1128. The operating system 1128, which can be stored on the disk storage 1124, controls and allocates the resources of the computer 1112. System applications 1130 take advantage of the resources managed by the operating system 1128 through program modules 1132 and program data 1134 stored either in the system memory 1116 or on the disk storage 1124. It follows that the present invention can be implemented with various operating systems or combinations of operating systems.

A user enters commands or information into the computer 1112 through input devices 1136. The input devices 1136 include, but are not limited to, a pointing device such as a mouse, trackball, stylus, or touch pad, as well as a keyboard, microphone, joystick, game pad, satellite dish, scanner, TV tuner card, digital camera, digital video camera, web camera, and the like. These and other input devices connect to the processing unit 1114 through the system bus 1118 via interface ports 1138. The interface ports 1138 include, for example, a serial port, a parallel port, a game port, and a universal serial bus (USB). Output devices 1140 use some of the same types of ports as the input devices 1136. Thus, for example, a USB port can be used to provide input to the computer 1112 and to output information from the computer 1112 to an output device 1140.
An output adapter 1142 is provided to illustrate that some output devices 1140, such as monitors, speakers, and printers, require special adapters. The output adapters 1142 include, by way of illustration and not limitation, video and sound cards that provide a means of connection between the output device 1140 and the system bus 1118. It should be noted that other devices and/or systems of devices, such as the remote computer 1144, provide both input and output capabilities.

The computer 1112 can operate in a networked environment using logical connections to one or more remote computers, such as the remote computer 1144. The remote computer 1144 can be a personal computer, a server, a router, a network PC, a workstation, a microprocessor-based appliance, a peer device, or another common network node, and typically includes many or all of the elements described in relation to the computer 1112. For brevity, only a memory storage device 1146 is shown with the remote computer 1144. The remote computer 1144 is logically connected to the computer 1112 through a network interface 1148 and then physically connected through a communication connection 1150. The network interface 1148 encompasses communication networks such as local-area networks (LAN) and wide-area networks (WAN). LAN technologies include Fiber Distributed Data Interface (FDDI), Copper Distributed Data Interface (CDDI), Ethernet/IEEE 802.3, Token Ring/IEEE 802.5, and the like. WAN technologies include, but are not limited to, point-to-point links, circuit-switching networks such as Integrated Services Digital Networks (ISDN) and variations thereon, packet-switching networks, and Digital Subscriber Lines (DSL).

The communication connection 1150 refers to the hardware/software used to connect the network interface 1148 to the bus 1118. Although the communication connection 1150 is shown inside the computer 1112 for clarity of illustration, it can also be external to the computer 1112.
The hardware/software needed for connection to the network interface 1148 includes, for exemplary purposes only, internal and external technologies such as modems (including regular telephone-grade modems, cable modems, and DSL modems), ISDN adapters, and Ethernet cards.

Figure 12 is a schematic block diagram of a sample computing environment 1200 with which the present invention can interact. The system 1200 includes one or more clients 1210. The clients 1210 can be hardware and/or software (for example, threads, processes, or computing devices). The system 1200 also includes one or more servers 1230, which can likewise be hardware and/or software (for example, threads, processes, or computing devices). The servers 1230 can, for example, house threads that perform transformations by employing the present invention. One possible communication between a client 1210 and a server 1230 takes the form of a data packet adapted to be transmitted between two or more computer processes. The system 1200 includes a communication framework 1250 that can be used to facilitate communication between the clients 1210 and the servers 1230. The clients 1210 are operably connected to one or more client data stores 1260 that can be used to store information local to the clients 1210. Similarly, the servers 1230 are operably connected to one or more server data stores 1240 that can be used to store information local to the servers 1230.

What has been described above includes examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methods for the purpose of describing the present invention, but one of ordinary skill in the art will recognize that many further combinations and permutations are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, it is intended to be inclusive in the same manner as the term "comprising," as "comprising" is interpreted when used as a transitional word in a claim.

[Brief Description of the Drawings]
Figure 1 is a schematic block diagram of a clustering system according to one aspect of the present invention.
Figure 2 is a flowchart of an automatic query clustering process according to one aspect of the present invention.
Figures 3 to 10 illustrate example user interfaces for automatic query clustering according to one aspect of the present invention.
Figure 11 is a schematic block diagram of a suitable operating environment according to one aspect of the present invention.

Figure 12 is a schematic block diagram of a sample computing environment with which the present invention can interact.

[Description of Main Reference Numerals]
100 query clustering system
110 data store
120 items
130 aggregator
140 attribute analyzer
150 cluster organizer
160 clusters
1112 computer
1114 processing unit
1116 system memory
1118 system bus
1120 volatile memory
1122 non-volatile memory
1124 disk storage
1126 interface
1128 operating system
1130 system applications
1132 program modules
1134 program data
1136 input devices
1138 interface ports
1140 output devices
1142 output adapters
1144 remote computer
1146 memory storage device
1148 network interface
1150 communication connection
1210 clients
1230 servers
1240 server data stores
1250 communication framework
1260 client data stores


Claims

1. A computerized interface for data presentation, comprising at least:
an attribute analyzer for determining an item distribution for at least two cluster attributes; and
a cluster organizer for forming new clusters based in part on the item distribution.

2. The system of claim 1, wherein the cluster attributes are associated with one or more data items, and the data items are stored in at least one of a local storage location and a remote storage location.

3. The system of claim 2, wherein the data items include documents, files, folders, images, audio files, video files, code, messages, and computer representations of external objects, the external objects including people or locations.

4. The system of claim 2, wherein the cluster attributes are associated with at least one of an item type, a creation date or time, a person related to the data item, a location, a category, and a system-, application-, administrator-, or user-defined attribute.

5. The system of claim 1, wherein the attribute analyzer determines one attribute according to an item type and then determines another attribute for a subsequent clustering.

6. The system of claim 1, wherein the cluster organizer assigns a clusterization score to different item attributes and selects the attribute with the highest score.

7. The system of claim 6, wherein the clusterization score is calculated as the product in the following equation:

$$\text{score} = n\_items_1 \times n\_items_2 \times \cdots$$

8. The system of claim 6, wherein the clusterization score is based on the binomial distribution shown below:

$$\text{score} = \frac{(n\_items)!}{(n\_items_1)! \times (n\_items_2)! \times \cdots}$$

9. The system of claim 1, further comprising a user interface for at least one of: displaying cluster results, receiving query selections, receiving attribute information, and displaying information about a data item in a cluster.

10. A computer-readable medium having computer-readable instructions stored thereon for implementing the attribute analyzer and the cluster organizer of claim 1.

11. A system for automatically clustering query results, comprising at least:
means for obtaining attributes of a plurality of items;
means for determining a score for the plurality of items based on the attributes; and
means for automatically clustering data related to the items according to the determined score.

12. A method for automatic query clustering, comprising at least:
associating one or more attributes with a plurality of data items;
determining a distribution of the data items based on the attributes; and
automatically clustering the data items based on the determined distribution.

13. The method of claim 12, wherein the distribution is determined by at least one of the following equations:

$$\text{score} = n\_items_1 \times n\_items_2 \times \cdots$$

$$\text{score} = \frac{(n\_items)!}{(n\_items_1)! \times (n\_items_2)! \times \cdots}$$

14. The method of claim 12, further comprising processing N items and M attributes.

15. The method of claim 14, further comprising at least one of: initializing a hash table, iterating over the N items, and, for each item, iterating over each attribute.

16. The method of claim 15, further comprising calculating a hash value for each attribute.

17. The method of claim 16, further comprising using the values of an associated hash table to calculate a clusterization score for each attribute.

18. The method of claim 12, further comprising automatically arranging clusters based on a predetermined threshold.

19. The method of claim 18, further comprising proposing alternative cluster groupings.

20. The method of claim 18, further comprising arranging clusters according to user-defined attributes.

21. A graphical user interface, comprising at least:
one or more data items and related attributes stored in a database;
one or more display objects created for the data items;
a component for selecting the data items and related attributes; and
a component for presenting the display objects according to an automated analysis of the attributes.

22. The interface of claim 21, further comprising control items for interacting with the attributes.

23. The interface of claim 22, wherein the attributes are used for nested queries of results.

24. The interface of claim 22, wherein the attributes include at least one of: a type, a location, a category, a person, a date, a time, and a user-defined attribute.

25. The interface of claim 22, further comprising an element for implicitly learning from user actions.

26. The interface of claim 22, further comprising at least one semi-expanded list or group.

27. The interface of claim 26, further comprising a control item for expanding the list or group.

28. The interface of claim 27, wherein at least one large cluster of attributes is presented in a compact view using the semi-expanded list.
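As an illustration of the scoring steps recited in the method claims (one hash table per attribute, a single pass over the items, then a score per attribute), the following sketch may help. It is not the claimed implementation: the function names, the representation of items as dictionaries, and the use of `collections.Counter` as the hash table are assumptions, and both score formulas are reconstructions of the garbled equations in the claims.

```python
from collections import Counter
from math import factorial, prod


def product_score(sizes):
    """Product of cluster sizes: n_items_1 * n_items_2 * ..."""
    return prod(sizes)


def multinomial_score(sizes):
    """(n_items)! / ((n_items_1)! * (n_items_2)! * ...)."""
    return factorial(sum(sizes)) // prod(factorial(s) for s in sizes)


def best_attribute(items, attributes, score=product_score):
    """Score every attribute in one pass over the items; return the best.

    Items are dicts (an assumed representation); an item lacking an
    attribute is counted in a cluster keyed by None.
    """
    tables = {a: Counter() for a in attributes}  # one hash table per attribute
    for item in items:                           # single pass over N items
        for a in attributes:                     # M attributes per item
            tables[a][item.get(a)] += 1
    scores = {a: score(list(t.values())) for a, t in tables.items()}
    return max(scores, key=scores.get), scores
```

With six items, an attribute that splits them three and three scores 9 under the product formula, while an attribute that puts all six in one cluster scores 6, so the more balanced split is preferred. Under the factorial formula the same two attributes score 20 and 1.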