CA2514165A1 - Metadata content management and searching system and method - Google Patents

Metadata content management and searching system and method

Info

Publication number
CA2514165A1
Authority
CA
Canada
Prior art keywords
metadata
terms
documents
business
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002514165A
Other languages
French (fr)
Inventor
Craig Statchuk
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cognos Inc
Original Assignee
Cognos Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cognos Inc
Priority to CA002514165A
Priority claimed from CA002545232A
Priority claimed from CA002545237A
Priority claimed from CA002545366A
Publication of CA2514165A1
Application status is Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce, e.g. shopping or e-commerce
    • G06Q30/02 Marketing, e.g. market research and analysis, surveying, promotions, advertising, buyer profiling, customer management or rewards; Price estimation or determination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Abstract

A method and system is provided for building a searchable corpus that includes taxonomy definitions (or topic hierarchies) obtained from the structure of business reporting metadata.

Description

Metadata Content Management and Searching System and Method
FIELD OF INVENTION
[0001] The present invention relates to a metadata content management and searching system and method.
BACKGROUND OF THE INVENTION

[0002] Competitive economies motivate business managers and other users to obtain maximum value from their investments in Corporate Performance Management (CPM) tools, such as Business Intelligence (BI) tools, that are used to manage business oriented data and metadata. These CPM tools provide authored reports or authored drill-through targets. However, users often encounter similar problems in finding important reports or relevant data or drilling to related content.

[0003] Traditional search technologies often provide incomplete or irrelevant results in CPM environments. There exist metadata search tools running against relational databases, but they also fall short since they do not leverage the customer's CPM tools and applications. Relying on authored drill-through targets can be problematic because, as new cubes, reports, metrics or plans are added, drill targets are not always kept up-to-date. Users can have difficulty moving seamlessly between CPM tools or applications, particularly when CPM applications are created by different individuals or departments.

[0004] It is therefore desirable to provide a mechanism that allows more effective searches of business oriented metadata context.

[0005] There exist search engines that use a full-text index combined with statistical methods to create ordered search results. An example of such a search engine is page ranking that is described in US Patent No. 6,526,440 issued to Bharat. However, these search engines are not sufficient to search complex data like business oriented metadata.

[0006] Some search engines use taxonomies to improve results. Creation of taxonomies has been carried out by a manual process or by an automated process based on advanced linguistic analysis.

[0007] However, business taxonomies are difficult and expensive to build manually. Also, linguistic analysis is often complicated and thus prone to inaccurate results.

[0008] It is therefore desirable to provide a system that manages business taxonomies automatically without the need for complicated and potentially inaccurate linguistic analysis.
SUMMARY OF THE INVENTION

[0009] It is an object of the invention to provide an improved metadata content management system that obviates or mitigates at least one of the disadvantages of existing systems. The invention uses a taxonomy management system.

[0010] In accordance with an aspect of the present invention, there is provided a method and system for building a searchable corpus that includes taxonomy definitions (or topic hierarchies) obtained from the structure of business reporting metadata.
[0011] This summary of the invention does not necessarily describe all features of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] These and other features of the invention will become more apparent from the following description in which reference is made to the appended drawings wherein:
Figure 1 is a block diagram showing a metadata content management system in accordance with an embodiment of the present invention;
Figure 2 is a block diagram showing an embodiment of the metadata content management system;

Figure 3 is a block diagram showing an embodiment of a taxonomy management system;
Figure 4 is a block diagram showing an embodiment of a synonymous management system;
Figure 5 is a block diagram showing an embodiment of an index population system;
Figure 6 is a diagram showing metadata and report values; and
Figures 7-45 are diagrams showing examples of reports and user interface displays by the metadata content management system.
DETAILED DESCRIPTION
[0013] Referring to Figure 1, a metadata content management system 10 in accordance with an embodiment of the invention is described. The metadata content management system 10 is suitably used for an enterprise or organization that has sources of business oriented information, i.e., business oriented metadata 20. The metadata content management system 10 interacts with the business oriented metadata 20, as well as one or more search tools or components 30 and user reporting applications 40 used by the organization.
[0014] An enterprise or organization typically has untapped sources of information, e.g., business oriented metadata 20 and associated values found in authored reports and reporting applications 40.
[0015] The metadata content management system 10 indexes the content of the business oriented metadata 20. It analyzes published information and the underlying metadata, specifications and key report values to create a search index that is suitable for the enterprise or organization. This information, metadata and values may be collectively called business oriented metadata in this specification. The metadata content management system 10 promotes navigation between BI tools and reporting applications 40, creating a strategic view of CPM assets. Because the metadata content management system 10 captures application context, e.g., "viewing location" or "query parameters", it enables many unique navigation options beyond traditional folder browsing and text searching.
[0016] The metadata content management system 10 enhances search and drill-through capabilities across the range of user reporting applications 40 without requiring drill-through authoring in source content. A report author simply publishes target reports and lets the metadata content management system 10 find drill locations to the published content.
[0017] The metadata content management system 10 organizes business oriented metadata content in ways that are more relevant and meaningful to users. The metadata content management system 10 also includes several personalization and administration options.
[0018] The metadata content management system 10 describes data using names and labels from actual reports. These names are often more familiar and relevant to report users. The metadata content management system 10 also provides enhanced report-to-report drilling and product-to-product navigation.
It expands the number of places where report users can "drill-to" and "drill-from" in a report. Most drilling requires no advance authoring. The metadata content management system 10 improves the capabilities of search tools. This includes the concept of 'federated' search across a variety of portal and web search indices.
[0019] User reporting applications 40 often generate authored relational and OLAP reports. Those reports provide a wealth of new metadata, including schema information, that is largely hidden from other tools and reporting applications. The metadata content management system 10 exposes this data in a standard format that can be re-used by other CPM applications and tools.
[0020] Figure 2 shows an embodiment of the metadata content management system 10. The metadata content management system 10 has a content index component 12, a navigation and drill engine 14 and tools 16. A content index is an index that catalogs individual words or terms along with their usage in indexed content. It provides term searches and links to additional data stored in the content index component 12. The content index may be an XML content index that describes each indexed item. The XML content index stores applicable metadata, metrics and planning information that improve search relevance. Also, the metadata content management system 10 uses a full-text index which may or may not be part of the metadata content management system 10. The navigation and drill engine 14 is a server component that analyses each user's "context" within their active reporting application 40. It leverages that information to provide better search and drill results. Tools 16 provide various features. They allow integration with search tools 30 and portals.
[0021] The metadata content management system 10 uses indexing so that the metadata content can be searched and organized in real-time. Indexing is normally performed by the metadata content management system 10 when the metadata content is published or updated. Indexing can be performed by a scheduled administrator task (example: nightly cron job). It can also be performed manually by an administrator or user.
[0022] A single set of index files is maintained in the content index component 12 for all users and user groups. The search and drill engine 14 may use a unique security algorithm to ensure that users see only the results they are authorized to access. The metadata content management system 10 validates its search results against the referenced reporting application. A user only sees items that he/she has permission to access. Each reporting application allows different levels of access.
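The validation step described in paragraph [0022] can be pictured as a filter applied to preliminary search results. The following is a minimal sketch only: the specification does not define the security algorithm, so the `can_access` callback, the result dictionaries and the permissions table are all hypothetical stand-ins for a call into the referenced reporting application.

```python
from typing import Callable, Iterable


def filter_authorized(results: Iterable[dict], user: str,
                      can_access: Callable[[str, str], bool]) -> list[dict]:
    """Keep only the results that the given user may open.

    `can_access(user, uri)` is a hypothetical stand-in for asking the
    referenced reporting application whether the user may view the item.
    """
    return [r for r in results if can_access(user, r["uri"])]


# Illustrative permissions table, made up for this sketch.
_ACL = {
    "alice": {"/reports/sales", "/reports/cost"},
    "bob": {"/reports/sales"},
}


def demo_can_access(user: str, uri: str) -> bool:
    return uri in _ACL.get(user, set())


hits = [{"uri": "/reports/sales"}, {"uri": "/reports/cost"}]
bob_hits = filter_authorized(hits, "bob", demo_can_access)
```

Because the single shared index is filtered per request, the same index files can serve every user while each user sees only permitted items.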
[0023] The full-text index contains an entry for each unique word (called a term) across all indexed content items (called documents). Each indexed content term contains the list of documents that have that term. Each indexed content term also contains usage statistics and the position of the term within each indexed document (where possible). The index is stored in application server flat files. This index is optimized to minimize disk reads and keep term storage as low as possible. The metadata content management system 10 may use an implementation of an existing full-text engine, e.g., the open source Apache Jakarta Lucene full-text engine.
[0024] Associated with each indexed document in the XML content index is an XML file that catalogs metadata, report values and other reporting application-specific information. The XML content index items are stored in flat files in the application server's file system. A relational database can optionally be configured to store this XML index data. "Read" activity related to XML items is low compared to full-text index items. XML records are read only after a list of preliminary search results has been made.
[0025] As shown in Figure 2, the metadata content management system 10 also has a taxonomy management system 50, a synonymous management system 60 and an index population system 70. The metadata content management system 10 provides searchable metadata and report data in the form of knowledge base documents 54 in the content index component 12 using these components 50-70.
[0026] The business taxonomy management system 50 is used for building a searchable corpus that includes business taxonomy definitions obtained from the structure of business reporting metadata. A taxonomy is a hierarchy of topics or subjects. The business taxonomy is used to classify terms and phrases. The taxonomy lets search components 30 find terms within a given subject or topic. Taxonomies improve many search engine functions, including search result relevance, refinement of search criteria and creation of related business reporting content.
[0027] An example of a taxonomy is described for a system in which the term "Cost" is used as a Measure with names: Billing Cost, Average Billing Cost, Average Billing Cost per Customer, Average Billing Cost per Product, and Actual Cost. Also, it is used as a Report Column/Heading with names: Product Cost, Planned Total Cost, and Cost of Goods Sold. In a "taxonomy-aware" system, any of these subjects can be used to help find more relevant results for the otherwise ambiguous term "Cost".
[0028] The business taxonomy management system 50 uses the structure of business metadata extracted from reports and other documents to create a living, de facto taxonomy definition of topics for a given business entity. This taxonomy defines how terms are used in the business.
[0029] The business taxonomy management system 50 can create taxonomy definition of topics automatically without human intervention. It uses a deterministic algorithm that provides reliable results without the need for complicated and potentially inaccurate linguistic analysis.
[0030] As shown in Figure 3, the business taxonomy management system 50 comprises three main components: Content Scanner 52, Knowledge-Base documents 54, and Taxonomy Engine 56. The Knowledge-Base documents 54 form part of the Content Index 12.
[0031] External components which interact with the main components are: business oriented metadata or Business Reporting Metadata 20, Full-Text Index and Search Component 32, and End-users or reporting applications 40 that provide search terms and consume taxonomy responses.
[0032] Figure 3 shows the flow of information between these components.
[0033] Metadata documents are documents that define query, layout, labeling and annotation of other content. Business Reporting Metadata 20 is metadata that exists anywhere in a business or organization. Examples of metadata documents include:
Business reporting and analysis metadata documents authored with report authoring and creation tools. Common examples of these tools include Business Intelligence suites from Business Objects, Hyperion and Cognos.
Business modeling and optimization metadata documents.

Budgeting, planning and forecasting metadata documents.
Financial consolidation metadata documents.
[0034] The Full-Text Index and Search Component 32 uses a Full-Text Index 34. The Full-Text Index 34 is a concordance of terms across all scanned or indexed documents. An entry is made for each scanned word (excepting, e.g., stop words which are too common to be useful) that lists the exact position of each occurrence of the word within the corpus of documents. From such a list, it is relatively simple to retrieve all the documents that match a query, without having to scan each document. The Full-Text Index and Search Component 32 is a typical embodiment that provides users and applications 40 with interfaces to build and search its Full-Text Index 34.
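The concordance described in paragraph [0034] is essentially a positional inverted index. The sketch below illustrates the idea under stated assumptions: the stop-word list, the tokenizer and the function names `build_index` and `search` are illustrative choices, not details taken from the specification or from any particular full-text engine.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "and", "of", "the", "to"}  # illustrative list only


def build_index(docs: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Map term -> {doc_id -> [word positions]} for every non-stop word."""
    index: dict[str, dict[str, list[int]]] = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
            if word in STOP_WORDS:
                continue  # stop words are too common to be useful
            index[word].setdefault(doc_id, []).append(pos)
    return index


def search(index: dict[str, dict[str, list[int]]], query: str) -> set[str]:
    """Return ids of documents that contain every non-stop query term."""
    terms = [t for t in re.findall(r"[a-z0-9]+", query.lower())
             if t not in STOP_WORDS]
    if not terms:
        return set()
    # Look up each term's posting list; no document is rescanned.
    return set.intersection(*(set(index.get(t, {})) for t in terms))


docs = {
    "r1": "Average Billing Cost per Customer",
    "r2": "Cost of Goods Sold",
}
index = build_index(docs)
```

Here `search(index, "cost")` consults only the posting list for "cost", which is why matching documents can be retrieved without scanning each document.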
[0035] Users or applications 40 are consumers of the system 50. Users or applications 40 may be referred to as Operators hereinafter. The user applications 40 may be web browsers.
[0036] The Content Scanner component 52 reads Business Reporting Metadata Documents 20. It builds a Knowledge-Base representation of the metadata with one or more of the following details:
Unique Document Identifier.
Document date.
The structured hierarchy of reporting elements from the source document.
Typical examples include data grouping, headings and labels. Each element provides a display name used in the reporting elements produced by the source metadata.
Database queries used in each structured reporting element.
Linkages to other structured reporting elements in this document and other Business Reporting Metadata Documents.
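As an illustration of the details listed above, the following sketch builds one Knowledge-Base document with Python's standard `xml.etree.ElementTree` module. The specification does not give an XML schema, so the element and attribute names (`kbDocument`, `element`, `displayName`, `query`, `link`) and the sample report are assumptions made purely for this example.

```python
import xml.etree.ElementTree as ET


def build_kb_document(doc_id: str, doc_date: str,
                      elements: list[dict]) -> ET.Element:
    """Build a Knowledge-Base document from nested reporting-element specs.

    Each spec is a dict with a "name" and optional "query", "links" and
    "children" keys. This layout is hypothetical, not from the patent.
    """
    root = ET.Element("kbDocument", id=doc_id, date=doc_date)

    def add(parent: ET.Element, spec: dict) -> None:
        node = ET.SubElement(parent, "element", displayName=spec["name"])
        if "query" in spec:
            ET.SubElement(node, "query").text = spec["query"]
        for target in spec.get("links", []):
            ET.SubElement(node, "link", target=target)
        for child in spec.get("children", []):
            add(node, child)  # preserve the source hierarchy

    for spec in elements:
        add(root, spec)
    return root


# Illustrative input; identifiers, date and query text are made up.
kb = build_kb_document("report-001", "2005-01-01", [
    {"name": "Measures", "children": [
        {"name": "Billing Cost", "query": "SELECT SUM(cost) FROM billing"},
        {"name": "Actual Cost", "links": ["report-002"]},
    ]},
])
xml_text = ET.tostring(kb, encoding="unicode")
```

Keeping the nesting of the source document intact is what later allows parent, sibling and child lookups on a term.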
[0037] The content scanner 52 builds or updates a Knowledge-Base Document 54 for each source metadata document 20. A knowledge base document 54 is used to store a knowledge base representation of each term in each source metadata document 20 along with references to content that uses the term.
[0038] In this system 50, a Knowledge-Base Document 54 is encoded in Extensible Markup Language (XML) and stored in system data files. In a different embodiment, any storage or encoding mechanism can be used. For example, data can be stored in database records and accessed with SQL.
[0039] Each Knowledge-Base Document 54 is consumed by the Full-Text Index and Search Component 32, which adds a reference back to the Knowledge-Base Document 54 for each term found in the document. The Full-Text Index 34 is subsequently used by a search engine, such as the Full-Text Index and Search Component 32, to retrieve Knowledge-Base Documents 54 that contain specified search terms.
[0040] The Taxonomy Engine 56 provides services to Users and other applications 40. When one or more terms are provided, the Taxonomy Engine 56 provides all indexed terms that are Parent Topic Terms, Sibling Topic Terms and Descendent Topic Terms. This taxonomy is created dynamically for each term provided.
[0041] The procedure for determining Parent Topic Terms from a given term is:
Use Text Index and Search Engine to find all Knowledge-Base Documents that contain the given term.
Find structured elements in the matching document where the given term is used.
Get all structured parent elements relative to the structured element for each matching term.
Return the list of Display Names from elements found across all matching documents.
[0042] The procedure for determining Sibling Topic Terms from a given term is:
Use Text Index and Search Engine to find all Knowledge-Base Documents that contain the given term.
Find structured elements in the matching document where the given term is used.
Get all structured sibling elements relative to the structured element for each matching term.
Return the list of Display Names from elements found across all matching documents.
[0043] The procedure for determining Descendent Child Topic Terms from a given term is:
Use Text Index and Search Engine to find all Knowledge-Base Documents that contain the given term.
Find structured elements in the matching document where the given term is used.
Get all structured child elements relative to the structured element for each matching term.
Return the list of Display Names from elements found across all matching documents.
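The three procedures above share the same shape and can be sketched together. This is a hedged illustration, not the claimed implementation: the XML layout and sample data are hypothetical, a linear scan stands in for the Text Index and Search Engine lookup, a term is matched by exact display name, and "parent elements" is interpreted here as all ancestors of the matching element.

```python
import xml.etree.ElementTree as ET

# Hypothetical Knowledge-Base corpus; the names are made up for this sketch.
KB_DOCS = [ET.fromstring("""<kbDocument id="report-001">
  <element displayName="Measures">
    <element displayName="Cost">
      <element displayName="Billing Cost"/>
      <element displayName="Actual Cost"/>
    </element>
    <element displayName="Revenue"/>
  </element>
</kbDocument>""")]


def related_terms(term: str, docs=KB_DOCS) -> dict[str, list[str]]:
    """Collect parent, sibling and child display names for a term."""
    result: dict[str, list[str]] = {"parents": [], "siblings": [], "children": []}
    for doc in docs:  # stand-in for the full-text lookup of matching documents
        parent_of = {child: parent for parent in doc.iter() for child in parent}
        for node in doc.iter("element"):
            if node.get("displayName") != term:
                continue
            # Walk upward through every enclosing reporting element.
            ancestor = parent_of.get(node)
            while ancestor is not None and ancestor.tag == "element":
                result["parents"].append(ancestor.get("displayName"))
                ancestor = parent_of.get(ancestor)
            parent = parent_of.get(node)
            if parent is not None:
                result["siblings"] += [s.get("displayName")
                                       for s in parent.findall("element")
                                       if s is not node]
            result["children"] += [c.get("displayName")
                                   for c in node.findall("element")]
    return result


cost_topics = related_terms("Cost")
```

No linguistic analysis is involved; the topic hierarchy falls out of the element nesting that the report author already created.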
[0044] Other embodiments may include building taxonomies with a "Crawler Task" that performs the functions of the Content Scanner, building the Knowledge-Base corpus with dedicated tools instead of the Content Scanner, and/or using a relational database for the Knowledge-Base corpus.
[0045] Figure 4 shows an embodiment of the synonymous management system 60. The synonymous management system 60 uses some components which are commonly used by the taxonomy management system 50. Similar internal and external components to those shown in Figure 3 are denoted with the same reference numerals.
[0046] The synonymous management system 60 is used for building a searchable index corpus that includes synonymous and exemplar terms. A typical search engine 32 uses these associations to qualify any given search term as either synonymous with that term, or an example of that term.
[0047] The synonymous management system 60 improves a search engine 32 by allowing its operators to search using synonymous terms and exemplar terms.
Synonymous terms are words or phrases that have the same meaning as other terms. Exemplar terms are words or phrases that are examples of other terms.
[0048] The synonymous management system 60 builds a corpus of exemplar and synonymous terms from business oriented content and then allows these terms to be used with any full-text search engine to extend the domain, relevance and quality of content returned from search queries.
[0049] Traditionally, search engines provide support for similar features using some type of thesaurus for synonymous terms. They use a combination of thesaurus and taxonomy components to provide support for example terms.
Search engines also use these components to improve results by generating better queries that are refinements of operator input.
[0050] Creation of thesauri and taxonomies is: a) a manual process or b) an automated process based on advanced linguistic analysis. Each of these systems is potentially expensive to maintain and can produce inconsistent results.
[0051] The synonymous management system 60 uses the structure of business metadata extracted from reports and other documents to create a living, de facto thesaurus and example term corpus for a given business entity. Processing is completed automatically without human intervention. The synonymous management system 60 uses a deterministic algorithm that provides reliable results without the need for complicated and potentially inaccurate linguistic analysis.
[0052] As shown in Figure 4, the synonymous management system 60 comprises a Content Scanner 52, Knowledge-Base documents 54 and a Synonym and Example Engine 62. It interacts with external components including Business Reporting Metadata 20, a Full-Text Index and Search Component 32, and End-users or applications 40, which may be collectively called Operators 40 in this specification.
[0053] Figure 4 shows the flow of information between these components.
[0054] The synonymous management system 60 also interacts with a Word Stemming Component 64. The word stemming component 64 may be available software that is capable of normalizing words to their base form. It removes pluralization, capitalization, punctuation and common stop words to produce unique base terms where possible. Examples of "stemming" are: "Horses" is normalized to "horse"; "GEEse" is normalized to "goose"; "The Days of Specialists" is normalized to "day specialist"; and "Functions aren't comments" is normalized to "function not comment". Stemming may or may not be a part of the synonymous management system 60. It serves to reduce knowledge base and index sizes. It can also improve system performance.
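The four normalization examples in paragraph [0054] can be reproduced with a deliberately naive sketch. The lookup tables and the trailing-"s" rule below are assumptions chosen only to cover those examples; a production stemming component would use a proper stemming or lemmatization library rather than these hand-written rules.

```python
import re

# Tiny illustrative tables, not a real linguistic resource.
STOP_WORDS = {"a", "an", "are", "of", "the"}
IRREGULAR_PLURALS = {"geese": "goose"}
CONTRACTIONS = {"aren't": "are not"}


def _singular(word: str) -> str:
    """Naive singularization: irregular table first, then strip one 's'."""
    if word in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[word]
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(text: str) -> str:
    """Lower-case, expand contractions, drop punctuation and stop words."""
    lowered = text.lower()
    for contraction, expansion in CONTRACTIONS.items():
        lowered = lowered.replace(contraction, expansion)
    words = re.findall(r"[a-z]+", lowered)
    return " ".join(_singular(w) for w in words if w not in STOP_WORDS)
```

Applying `normalize` to both indexed terms and query terms means that variants such as "Horses" and "horse" share one index entry, which is how stemming reduces knowledge base and index sizes.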
[0055] Users or applications 40 are consumers of the synonymous management system 60.
[0056] The synonymous management system 60 identifies all content in the metadata documents 20 that needs to be processed. The set of all required content is given to the Content Scanner 52.
[0057] The Content Scanner 52 proceeds to produce Knowledge Base documents 54 from all content that it reads. Subsequent processes can be run to update the Knowledge Base documents 54.
[0058] The Content Scanner 52 reads Business Reporting Metadata Documents 20. It builds a Knowledge-Base representation of the metadata and stores it as knowledge base documents 54, as described above.
[0059] The Knowledge-Base Document associations are provided on demand by the Synonym and Example engine 62 to the Full-Text Index 34. The Full-Text Index 34 ultimately uses synonyms and examples to provide better searches to its users.
[0060] The Synonym and Example Engine 62 is further described in detail. The Synonym and Example Engine 62 improves the indexing capabilities of a standard full-text indexing system 32, 34. The synonym and example engine 62 creates logical associations between terms that are used to efficiently answer the queries, "What terms are synonymous with Term A?"; "Is Term B synonymous with Term A?", "What terms are examples of Term A?", and "Is Term B an example of Term A?".
[0061] By combining terms into phrases and querying each term independently with one or more of the queries from above, the following questions can be answered: "Is Phrase B synonymous with Phrase A?", and "Is Phrase B an example of Phrase A?".
[0062] All of these systems can optionally use word stemming to improve performance and accuracy.
[0063] Synonym processing is carried out as follows. Synonyms for a given term are found by searching the Knowledge Base 54 for documents that contain the given term. A list of documents with the term is retrieved for further processing.
The XML elements in each matching document are then read to find Term Isomorphisms (equivalent structures) for the given search term. Synonyms are retrieved from text in XML elements. Common embodiments include display headings, labels, sibling elements and links.
[0064] Example processing is carried out as follows. Examples of a given term are found by searching the Knowledge Base 54 for documents that contain the given term. A list of documents with the term is retrieved for further processing.
The XML elements in each matching document are then read to find Term Isomorphisms (equivalent structures) for the given search term. Examples are retrieved from text in XML elements. Common embodiments include query results, prompt value pick-lists, child elements and links.
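The synonym and example lookups can be sketched over a single Knowledge-Base document. Everything specific here is an assumption for illustration: the `label` and `value` child elements stand in for "labels" and "prompt value pick-list" entries, the sample data is invented, and the direct scan replaces the Knowledge Base search step.

```python
import xml.etree.ElementTree as ET

# Hypothetical Knowledge-Base document; names and values are made up.
KB = ET.fromstring("""<kbDocument id="report-002">
  <element displayName="Region">
    <label>Territory</label>
    <label>Sales Area</label>
    <value>Americas</value>
    <value>Europe</value>
  </element>
</kbDocument>""")


def _collect(term: str, tag: str, doc: ET.Element) -> list[str]:
    """Gather text of `tag` children under elements named `term`."""
    found: list[str] = []
    for node in doc.iter("element"):
        if node.get("displayName", "").lower() == term.lower():
            found += [child.text for child in node.findall(tag)]
    return found


def synonyms(term: str, doc: ET.Element = KB) -> list[str]:
    """What terms are synonymous with `term`? (labels, in this sketch)"""
    return _collect(term, "label", doc)


def examples(term: str, doc: ET.Element = KB) -> list[str]:
    """What terms are examples of `term`? (pick-list values, in this sketch)"""
    return _collect(term, "value", doc)


def is_synonym(candidate: str, term: str) -> bool:
    return candidate.lower() in (s.lower() for s in synonyms(term))


def is_example(candidate: str, term: str) -> bool:
    return candidate.lower() in (e.lower() for e in examples(term))
```

The four functions correspond to the four queries listed in paragraph [0060]; a search for "Europe" could then be widened to, or ranked alongside, content indexed under "Region".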
[0065] In a different embodiment, the Content Scanner 52 may be combined with a Full-Text Index Scanner that indexes terms, determines synonyms and determines example terms in one integrated component. A sophisticated embodiment of a Full-Text Index and Search Service may integrate itself with the Synonym and Example component or engine 62.
[0066] Figure 5 shows an embodiment of the index population system 70. The index population system 70 is used for populating an external search engine 38 so that referenced content can be found by that search engine 38. The index population system 70 makes it easy to populate such search engines 38 with references to content so that the content itself can be found when appropriate queries are provided by an operator or reporting applications 40.
[0067] Adding content references to an external index is complicated as there are hundreds of search engine choices available. No viable standards exist to allow promotion of content to all of these search engines. Each search engine potentially requires a different method for populating its index with content, organizing content, rating search results, and adding security to search results.
[0068] Traditionally, programmers use APIs to populate indexes directly. Most APIs are specific to a particular search engine, thereby making it difficult to target multiple search engines.
[0069] Search engines themselves routinely use "crawlers" to roam through the Internet and intranets looking for content to index. Programmers can write "software adapters" to help crawlers understand different types of content.
For example, adapters are written for Word and PDF documents. Like search engine APIs, these adapters are normally specific to a limited number of search engines.

[0070] Related indexing standards include OWL and RDF. As of this date, neither has the richness or flexibility required to adequately index complex data like BI metadata.
[0071] The index population system 70 uses Index Business Cards 76 to create references to targeted content 22. The index business cards 76 may be standard HTML files. These files 76 allow the targeted content 22 to be easily indexed and subsequently found by search engines 38. Each Index Business Card 76 contains summaries of referenced content instances. These summaries include: terms, topic hierarchies, report metadata, related information and URIs needed to show the content.
[0072] The information of the Index Business Cards 76 is provided in formats that are easily consumed by different search engines. This information is not specific to any single search engine 38.
[0073] Redundant presentation of data using different formats is used in an Index Business Card 76 to increase the number of search engines that can effectively consume its content.
[0074] Security restrictions may also be applied to referenced content and they are reflected in each business card. This allows external search engines 38 to apply a similar security restriction to the lists of results that they show.
[0075] The index population system 70 comprises a Card Generator 72, and a file system 74 containing index business cards 76. The card generator 72 is a component that reads referenced content details and produces Index Business Card content references. The Index Business Cards 76 are files that provide index data for each content instance. These files 76 are placed on the File System 74 so that they are subsequently found by Search Crawlers 36.
[0076] The index population system 70 interacts with external components including content 22, a security provider 24, one or more search crawlers 36, one or more search engines 38 and operators 40. The content 22 is a collection of original content instances. Other embodiments include an index corpus of content instances. The security provider 24 provides knowledge of, or a method of, determining security access for each content instance.
[0077] The search crawlers 36 are search engines that index content by "crawling" through content. Examples include Google Web Server, Google Desktop Search, MSN Web Search, MSN Desktop Search and other Enterprise Search tools. The search engines 38 are related search engines that accept queries and provide search results over the index corpus built by the crawler 36.
[0078] The operators 40 are operators who issue search requests against the Search Engine 38, view results and navigate to referenced content.
[0079] The file system 74 is a file system for storing Index Business Card content references, and may be an external component of the index population system 70. The file system 74 may be Web servers.
[0080] Figure 5 shows the flow of information between components.
[0081] The index population system 70 identifies all Content that needs to be indexed. The set of all required content is given to the Card Generator 72.
The Card Generator 72 proceeds to produce one or more Index Business Cards 76 to represent each content instance. The format of each Index Business Card is variable. Each card may contain HTML, XML, RDF-XML and plain-text. The intention is to provide Search Crawlers 36 with the maximum amount of usable information.
[0082] The card generator 72 gives primary importance to individual terms present in the referenced content. A normalized list of these terms is placed in the Index Business Card 76. A list of related topics is added along with a list of related concepts and subjects. XML and RDF-XML are normally used.
[0083] The card generator 72 may also add additional site-specific and index-engine-specific terms, topics, concepts and subjects.

[0084] URIs are added to provide viewing or execution references to content instances. Examples include URLs, file paths and application paths with required parameters.
[0085] Index Business Cards 76 may also include display text which is used to direct an operator 40 to the referenced content 22 when the business card 76 is displayed.
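The card contents described in paragraphs [0082] to [0085] can be illustrated with a minimal Python sketch. The element names (`card`, `terms`, `topics`, `uri`, `display`), the normalization rule and the example URI are assumptions made for illustration only; the specification does not define a card schema.

```python
import re
import xml.etree.ElementTree as ET


def normalize_terms(text):
    """Lower-case the text, split on non-alphanumerics, drop duplicates, keep order."""
    seen = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if word not in seen:
            seen.append(word)
    return seen


def build_card(content_text, topics, uri, display_text):
    """Return an XML string for one hypothetical Index Business Card."""
    card = ET.Element("card")  # element names are illustrative assumptions
    terms_el = ET.SubElement(card, "terms")
    for term in normalize_terms(content_text):
        ET.SubElement(terms_el, "term").text = term
    topics_el = ET.SubElement(card, "topics")
    for topic in topics:
        ET.SubElement(topics_el, "topic").text = topic
    ET.SubElement(card, "uri").text = uri            # viewing/execution reference
    ET.SubElement(card, "display").text = display_text  # text shown to the operator
    return ET.tostring(card, encoding="unicode")


card_xml = build_card(
    "Sales Revenue by Salesperson: Sales Year, Staff Name",
    ["Sales", "Staff"],
    "http://hostname/reports/sales-revenue",  # placeholder URI
    "Sales Revenue by Salesperson",
)
```

A crawler that reads such a file would index the normalized terms and topics and would present the display text with a link to the URI.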
[0086] The security restriction applied to each content instance is retrieved from the Security Provider 24 and applied to the Index Business Card 76 using the appropriate security method. Examples include LDAP, Active Directory, UNIX file security and Windows NT file security.
[0087] When the Card Generator processing is complete, all generated Index Business Cards 76 are placed on the accessible file system 74 so that they can be found by Search Crawlers 36.
[0088] Once consumed by a crawler 36, referenced content instances are available to users on the related search engine 38. Operators 40 searching for content subsequently find Index Business Cards 76 and are redirected to the target content 22.
[0089] In a different embodiment, Business Cards 76 may be placed on Web Servers. Business Cards 76 may include RDF-XML. The set of Content Instances may be stored in an index corpus which is subsequently used by the Card Generator 72 as the source for creating Index Business Cards 76.
[0090] Examples of operations of the metadata content management system 10 are now described.
[0091] As shown in Figure 6, metadata and associated values are produced by several CPM tools. Metadata export is expected from metadata modeling tools. While reports are authored in reporting applications, new hierarchies and data definitions are created. These hierarchies and data definitions are useful for drilling and searching. In addition, this data is often more recognizable to end-users since this is the text they see in reports and applications.
[0092] The term Extended Metadata describes the metadata created by these different authoring and processing phases, and the term Extended Report Data refers to values created in a similar fashion.
[0093] The metadata content management system 10 leverages these extended metadata and report data, i.e., new BI data, to provide searching and drilling that was previously unavailable in existing systems.
[0094] Examples of extended metadata added by the authoring process include: Dimension names, Dimension levels, Category names, Alternate category names, Cube hierarchies, Table and record names, Group names, Parent/child relationships between categories, groups or tables, Authored drill target names, and Framework Manager entities, including: packages, namespaces, query items, query sources and all relevant authored relationships. Examples of extended authored report values include: Items related by one or more dimensions, categories, measures, groups or tables; Calculated values; and Annotations.
[0095] For example, a BI tool may provide a crosstab showing dimension, category and measure names. These names represent extended metadata. These names may or may not match table/column names in a star schema or other relational model. Yet each of these names represents an important potential target for drilling or searching. Values stored in a cube, including calculated values, represent extended data or values. They are a valuable target for searching. Like extended metadata, many of these values are not found in any other data store.
[0096] Another reporting tool may provide a report with columns. In such a report, each column heading represents extended metadata. The report grouping, e.g., by country, represents another form of extended metadata. Report values themselves represent extended report data. They offer important linking and search targets.
[0097] In these cases, the extended metadata names are the same as those viewed by the report user. For this reason, extended metadata names are often most relevant and recognizable to the report user. These names may or may not match the names used in the underlying database.
[0098] Authored links, like those anchored to the column name "Sales Rep Name", provide additional summary information about a linked report. The metadata content management system 10 indexes this information to further increase search relevance about the destination content.
[0099] Research related to data searching and linking technologies commonly identifies two basic types of data: structured data and unstructured data. Structured data is defined by a formal schema and is typically searched with OLAP, SQL and XML utilities. Unstructured data is normally found in documents and static web pages and is searched using free-form queries with web tools like Google.
[00100] The metadata content management system 10 offers searching solutions over both types of data. Structured Data Searches are used to implement report-to-report drilling. This includes listing and selecting from multiple targets. Full-Text Searches are used to find reports for unstructured user queries, for example, searches launched from IIS or Portal text search tools.
[00101] The metadata content management system 10 maintains a searchable database of reports, knowledge based documents 54 or the content index 12, that indexes the key elements of each report. This database is optimized for efficient searching of metadata names and hierarchies. It also offers searching of text. This information is used to provide the drilling features of the metadata content management system 10. The metadata content management system 10 populates full-text search engines (like the Google Intranet "Search Appliance") with information about each report. This allows these search engines to find relevant content.
[00102] Searching functions and the user interface of the metadata content management system 10 are now described. Search functions are launched internally in reporting applications 40 using a specified user interface. Alternatively, a search may be requested by any reporting application 40 with a URL, e.g., http://hostname/crn/xxxxxx?c=search&q=p1\p2\p3...&e=y&u=y&r=g&back=backURL; where q=p1\p2\p3... represents any number of arbitrary search terms separated by "\"; e=y shows the search edit field with the current terms (for refining the search); u=y launches hyperlinked results in a separate window when clicked; r=g shows grouped results (the default is list results); and back=backURL provides a return address.
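As a minimal sketch, a calling application could assemble the search URL from the parameters c, q, e, u, r and back as follows. The function name, the base URL (including the elided "xxxxxx" path segment, which is kept as a placeholder) and the use of percent-encoding for the "\" separator are assumptions; the specification shows the separator unencoded.

```python
from urllib.parse import urlencode


def build_search_url(base, terms, edit=False, new_window=False,
                     grouped=False, back=None):
    """Assemble a search URL from the documented c, q, e, u, r and back parameters."""
    params = [("c", "search"), ("q", "\\".join(terms))]  # terms separated by "\"
    if edit:
        params.append(("e", "y"))      # show the search edit field
    if new_window:
        params.append(("u", "y"))      # open hyperlinked results in a separate window
    if grouped:
        params.append(("r", "g"))      # grouped results instead of list results
    if back is not None:
        params.append(("back", back))  # return address
    return base + "?" + urlencode(params)


url = build_search_url("http://hostname/crn/xxxxxx",
                       ["2004", "United States"], grouped=True)
```

Percent-encoding is applied here so that spaces and backslashes in search terms survive transport; a deployment that expects the raw form would skip that step.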
[00103] Search results are typically shown in list format by default, sorted by relevance score, as shown in Figure 35. The user can click a name to launch the matching item, click a group button to show Grouped Results, or click an arrow button in the frame on the left to show Related Subjects. The result display also has the Search string edit field, which is optional.
[00104] Grouped Results shows search results organized by match criteria with inner sort by relevance score, as shown in Figure 36. The user can click name to launch matching item, click a list button to show List Results, or click an arrow button in frame on left to show Related Subjects.
[00105] The Subjects pane shows how the current search terms are used in the index, as shown in Figure 37. The user can click subject name to refine current search with matching item, or click an arrow button to hide this pane.
[00106] In order to provide Drill-Through functions, the metadata content management system 10 may provide an HTML User Interface that can be launched via a URL, such as http://hostname/crn/xxxxxx?d=<xml/>&u=y&r=g&back=backURL; where d represents an "XMLEncoded" XML specification providing source content; u specifies how hyperlinked results are opened, with value 'n' (default) opening them in the same window and value 'y' opening them in a separate window; r=g shows grouped results (the default is list results); and back provides a "URLEncoded" return address.
[00107] Drill-through functions may also be launched internally in reporting applications 40.
[00108] Examples of Drilling and Linking applications with the metadata content management system 10 are now described.
[00109] Meta drilling is feasible when a match can be found between an item selected in one application and the metadata exported from another report.
The metadata content management system 10 significantly expands the number of possible metadata values. Therefore, the number of potential drill targets is increased.
[00110] As shown in Figure 7, the dimension name "Channels" in the PowerPlay cube on the left is used to match a column name in the report on the right. This drill example illustrates that by using the metadata content management system 10, drill targets can be determined dynamically at run-time.
This differs from existing CPM tools where drill targets need to be specified when a report is being authored.
[00111] The metadata content management system 10 meta drilling means that applications 40 can drill in any direction. Metadata searches allow the user to drill in "non-traditional" ways. For example, it is now possible to drill from reporting applications. As shown in Figure 8, the column name "Channels" in a report on the left is used to match a dimension name (or any star schema element) in a cube on the right. Optionally, the selected "Independent" channel name on the left can be used to link to a category name on the right.

[00112] Meta drilling allows content to be linked using metadata only.
Report values and measures are not needed. As such, these reports can be efficiently indexed when they are initially published. Indexing needs to be updated only when a report specification changes. No authoring is needed to drill in either direction.
[00113] Meta drilling often results in multiple drill targets. The metadata content management system 10 lists the hierarchy of matches and allows a user to pick an appropriate target, as shown in Figure 9. The metadata content management system 10 lists matching reports by report type and then sorts them by a calculated relevance rating.
[00114] The metadata content management system 10 calculates search relevance by first creating search criteria at the drill source location and then comparing these criteria with the resulting list of matching items.
[00115] Source metadata and values are used to create a search specification. For example: when drilling from an OLAP report, filter information, including the current crosstab dimension and category filters plus dimensions currently being displayed, are used to create a search specification.
[00116] Consider drilling from the report shown in Figure 10. Drilling from the Mass Marketer category builds a search specification that includes the term "look for item Mass Marketer within Channels". Drilling from the GO Sport Line category builds a search specification that includes the term "look for item GO Sport Line within Products". Drilling from the intersection of the Mass Marketer and GO Sport Line categories builds a search specification that includes both of the terms above.
[00117] If any additional filters are active in the crosstab, for example Years = 2003 or Location = California within USA, then the related filter terms will be added to the search criteria.
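The construction of a search specification from the drill source can be sketched as follows. The dictionary layout with `item` and `within` keys is a hypothetical representation chosen to mirror the "item X within Y" terms above; the specification does not define a concrete format.

```python
def build_search_spec(selected, filters):
    """Build "item X within Y" terms from the drill source.

    selected and filters each map a dimension name to the chosen category.
    Selected cells come first, followed by any active crosstab filters.
    """
    spec = [{"item": cat, "within": dim} for dim, cat in selected.items()]
    spec += [{"item": cat, "within": dim}
             for dim, cat in filters.items() if dim not in selected]
    return spec


# Drilling from the intersection of two categories with one active filter.
spec = build_search_spec(
    {"Channels": "Mass Marketer", "Products": "GO Sport Line"},
    {"Years": "2003"},
)
```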

[00118] The search is submitted to the metadata content management system 10. The metadata content management system 10 calculates report relevance by comparing the number of matched terms with those found in each result item.
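One simple way to express this comparison is the fraction of search terms that an item's index entry contains. The formula and the function names below are assumptions for illustration; the specification does not state how the score is computed.

```python
def relevance(search_terms, item_terms):
    """Fraction of the search terms that also appear in the item's indexed terms."""
    wanted = {t.lower() for t in search_terms}
    found = {t.lower() for t in item_terms}
    if not wanted:
        return 0.0
    return len(wanted & found) / len(wanted)


def rank(search_terms, items):
    """items maps a report name to its indexed terms; returns names, best first."""
    return sorted(items,
                  key=lambda name: (-relevance(search_terms, items[name]), name))


items = {
    "Product List": ["Products"],
    "Channel Sales": ["Mass Marketer", "Channels", "Years"],
}
ordered = rank(["Mass Marketer", "Channels"], items)
```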
[00119] Value drilling and searching allows target report values and measures to be searched. This means that search criteria can include value ranges within metadata constraints. As shown in Figure 11, the Order Number value "160" in a report on the left is used to match the same Order Number value in a report on the right that is generated by a different reporting tool.
[00120] Value searching allows content to be linked using metadata and associated report values/measures from actual reports. This type of drilling extends the number of reports that can be matched at the cost of using more index storage. It is also much slower than searching only metadata. As such, value searching is not a good choice for drilling. It is better suited for ad hoc searching where speed is less of a concern.
[00121] Examples shown thus far have concentrated on report-to-report "drilling". The metadata content management system 10 can also perform full-text searches against metadata and values.
[00122] One of the easiest ways to increase the visibility of Cognos applications in any enterprise is to expose reports through the user's standard search tools. The metadata content management system 10 allows enterprise search tools to be used to expose BI content to report users. Full text search engines use proprietary technology to index content. The metadata content management system 10 is responsible for "pushing" index values to each supported engine. Search indexes are maintained by a search server associated with each search engine. Storage requirements are dependent on the amount of information provided by the metadata content management system 10. Configuration options control how much information is "pushed" to these servers. The metadata content management system 10 maintains its own index that can be used standalone or in conjunction with the search engines. The result is fast, relevant and predictable searches.
[00123] The metadata content management system 10 may also allow applications to create lists of "see also" links that show related content.
[00124] The metadata content management system 10 facilities are exposed as WSDL compliant Web Services.
[00125] Another example is described using a report generated using Cognos PowerPlay to describe how a list of the metadata content management system 10 search results is produced. In this example, a user wants to find related CPM content. They initiate their "search" by launching a dynamic "drill" from inside Cognos PowerPlay. The user presses the metadata content management system 10 "Drill" button, or enters terms in the metadata content management system 10 "search bar". Figure 12 is an example from PowerPlay 7.3 showing the "Drill" button. Alternatively, as shown in Figure 13, the user can type terms in the text search tool or browser SearchBar.
[00126] The navigation and search engine 14 accepts the request and builds a "Source Context". This is carried out as follows. The navigation and search Engine 14 starts the actual index search. When drilling from a reporting application like PowerPlay, the engine 14 extracts current filter values, view settings and all visible category information to create a "Source Context". Figure 14 shows an example of "Drill" from PowerPlay and the resultant Source Context.
[00127] A "Source Context" is also built for "text" searches. The metadata content management system 10 uses the taxonomy management system 50. Individual terms are inspected to see if they match subjects in the metadata content management system subject and term hierarchy, i.e., a "taxonomy". When a match is found, terms are placed under their respective matching subjects. This allows terms to be treated like categories within OLAP dimensions.

[00128] The metadata content management system 10 also maintains a list of "aliases" for terms and subjects. It uses the synonymous management system 60. This further extends the number of possible matches, particularly in enterprise environments, where several names can be used for the same thing.
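The way a text search could be turned into a Source Context using a taxonomy and an alias list can be sketched as follows. The in-memory dictionaries are hypothetical stand-ins for the taxonomy management system 50 and the alias list; the sample subjects and terms are illustrative only.

```python
def build_source_context(terms, taxonomy, aliases):
    """Group free-text terms under matching taxonomy subjects.

    taxonomy maps a subject to the set of terms known under it;
    aliases maps an alternate name to its canonical term.
    Returns (context, unmatched) where context maps subject -> matched terms.
    """
    context = {}
    unmatched = []
    for raw in terms:
        term = aliases.get(raw.lower(), raw.lower())  # resolve alias first
        subject = next((s for s, known in taxonomy.items() if term in known), None)
        if subject is None:
            unmatched.append(raw)
        else:
            context.setdefault(subject, []).append(term)
    return context, unmatched


taxonomy = {"Location": {"united states", "canada"}, "Years": {"2004", "2005"}}
aliases = {"usa": "united states"}
context, unmatched = build_source_context(["USA", "2004", "backpack"],
                                          taxonomy, aliases)
```

Terms placed under a subject can then be handled like categories within a dimension, while unmatched terms remain available for plain full-text matching.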
[00129] Figure 15 shows an example of the SearchBar "Text" and the resultant Source Context.
[00130] The Navigation and Search Engine 14 now proceeds to process drill requests and text searches using the same algorithm. Unique words or terms are extracted from the "Source Context". These terms are passed to the full-text search engine 32, which returns a list of documents. Documents are sorted by number of term "hits" (i.e., documents with the most occurrences of the given terms are returned first). Multiple terms are automatically processed. Documents with multiple term "hits" are sorted to the top. Single term hits then follow.
[00131] The number of documents returned may be limited by the "results page size", e.g., usually a number between 10 and 20. The metadata content management system 10 may return 2x the "results page size".
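The ordering described in paragraph [00130] can be sketched with a small in-memory ranking function. This is an illustrative stand-in, not the full-text search engine 32 itself; whitespace tokenization and the tie-breaking by total occurrences are assumptions.

```python
def rank_by_hits(terms, documents):
    """Order documents so multi-term matches come first, then by total occurrences.

    documents maps a document name to its text. Documents with no hits are omitted.
    """
    scored = []
    for name, text in documents.items():
        words = text.lower().split()
        counts = [words.count(t.lower()) for t in terms]
        distinct = sum(1 for c in counts if c > 0)  # how many terms matched
        total = sum(counts)                          # how often they occurred
        if total:
            scored.append((-distinct, -total, name))
    return [name for _, _, name in sorted(scored)]


documents = {
    "a": "sales target sales",
    "b": "sales sales sales sales",
    "c": "weather",
}
ranked = rank_by_hits(["sales", "target"], documents)
```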
[00132] The metadata content management system 10 is optimized to quickly return the "next page" in a result set. Each page requires approximately the same amount of time to process, regardless of relative page number.
[00133] The metadata content management system 10 applies a security check to the intermediate list of results returned thus far. Batch (or grouped) security queries are sent to the appropriate target reporting application 40. Denied items are removed from the intermediate results. If the number of result items falls below the current page size, additional full-text searches are performed until a complete page of results is built.
[00134] The metadata content management system 10 may "score" the intermediate results to improve their relevance. The XML Content Index entry for each item is retrieved (either from a flat file or the linked relational database). The original "Source Context" is used to establish the user's position within the calling application, or to establish the subject/term relationship of the request.
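The security filtering with page refill described in paragraph [00133] can be sketched as a loop. The callables `search` and `is_allowed_batch` are hypothetical interfaces standing in for the full-text search engine and the batch security query; fetching twice the page size per round follows the 2x figure mentioned in paragraph [00131].

```python
def secure_page(search, is_allowed_batch, page_size):
    """Build one page of results the caller is permitted to see.

    search(offset, count) returns up to count ranked items starting at offset.
    is_allowed_batch(items) returns one boolean per item (a batched security query).
    """
    page, offset, batch = [], 0, 2 * page_size
    while len(page) < page_size:
        candidates = search(offset, batch)
        if not candidates:
            break  # index exhausted; return a short page
        allowed = is_allowed_batch(candidates)
        page.extend(item for item, ok in zip(candidates, allowed) if ok)
        offset += len(candidates)
    return page[:page_size]


results = [f"r{i}" for i in range(10)]
page = secure_page(
    lambda offset, count: results[offset:offset + count],
    lambda items: [int(item[1:]) >= 5 for item in items],  # deny r0..r4
    page_size=2,
)
```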
[00135] Several queries and optimizations are applied over the XML content. These operations allow the metadata content management system 10 to ultimately select:
Cubes with best dimension and category matches. Dynamic filters that navigate the user into a matching cube are automatically generated.
Reports with best matching of columns, groupings and/or values.
Prompted reports with the best matching prompts. Prompt answers are automatically generated.
Any item with a significant number of matching terms. Priority is given to those terms listed in the metadata content management system "Enterprise Taxonomy" of subjects/terms.
[00136] The metadata content management system 10 returns a page of results. An example is shown in Figure 16. The results can be sorted by relevance (score) or by groups (for example, reports with matching prompts or Cubes with matching dimensions).
[00137] An example of reporting by example is now described. In this example, a product manager wants to know who provides customer support for a particular product at different US retailers. The product manager gives search terms to the metadata content management system 10. She does not remember the exact product name spelling, and she only remembers its short form from other reports she has seen. She uses the term USA for United States and she is not particular about spacing or capitalization. In each case, she is providing examples of report metadata, not the metadata itself. On her first try, she types:
"USA CM backpack staff details".
[00138] The metadata content management system 10 takes her request and matches metadata:
USA = United States = Location
CM backpack = Canyon Mule Climber Backpack = Product
[00139] The terms "staff details" did not match any metadata. Instead, they matched a Report Detail template for an authored report. With the majority of information collected, the metadata content management system 10 simply asks for her business role as shown in Figure 17.
[00140] She selects the Product Manager role and clicks Finish to get her answer. The metadata content management system 10 builds and runs a report, as shown in Figure 18.
[00141] She looks at the report and notices the information is correct except that all reporting years are included. She wants only year 2004. She returns to the original search page and adds "2004" so that the search terms are "USA CM backpack staff details 2004".
[00142] The metadata content management system 10 builds and runs a new report as shown in Figure 19. The product manager looks at the answer and is satisfied.
[00143] In this example, the metadata content management system 10 matched metadata based on unstructured terms, which may be aliased or even misspelled. This matching is not a linguistic exercise; it relies on the indexing data structure. For each metadata model Framework Manager element, the metadata content management system 10 indexes the information shown in Figure 20. This means that for each indexed Term there exist zero or more Aliases and zero or more Examples. For example, consider a Framework Model element called "Product". Aliases are obtained from actual published report column headings and titles that use the FM element Product. Indexed Alias values include things like Product List, Product Names, and Prod. Nam. Examples are obtained by running actual queries to get values for this element. Indexed Example values will typically include things like Star-Lite Tent, RayBan Sun Screen, and Elvis Retro Sunglasses. From the values shown in this example, we can determine that a user has entered a Product when they type "star-lite tent", "Prod Nam" or "Elvis sunglasses".
[00144] Hyper-Dimensional Navigation uses a Bar to find content quickly and easily. When a product manager wants to know who provides customer support for a particular product at different US retailers, she enters the two phrases that she feels are most prominent and intends to narrow her search from there. She types: "2004 United States".
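The Term, Alias and Example index described in paragraph [00143] can be sketched as a small lookup structure. The class name, the punctuation-stripping normalization and the token-subset matching rule are assumptions for illustration; this sketch tolerates differences in case, punctuation and omitted words, but does not attempt correction of genuinely misspelled words.

```python
def normalize(text):
    """Lower-case, strip punctuation and split into word tokens."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch == " ")
    return kept.split()


class TermIndex:
    """Maps each indexed term to its normalized aliases and example values."""

    def __init__(self):
        self.entries = {}  # term -> list of token lists (term, aliases, examples)

    def add(self, term, aliases=(), examples=()):
        phrases = [term, *aliases, *examples]
        self.entries[term] = [normalize(p) for p in phrases]

    def lookup(self, user_text):
        """Return the term whose indexed phrase best covers the typed tokens, or None."""
        tokens = normalize(user_text)
        best, best_score = None, 0.0
        for term, phrases in self.entries.items():
            for words in phrases:
                if tokens and all(t in words for t in tokens):
                    score = len(tokens) / len(words)  # prefer tighter matches
                    if score > best_score:
                        best, best_score = term, score
        return best


index = TermIndex()
index.add("Product",
          aliases=["Product List", "Product Names", "Prod. Nam"],
          examples=["Star-Lite Tent", "RayBan Sun Screen", "Elvis Retro Sunglasses"])
index.add("Location", aliases=["Country"], examples=["United States"])
```

With this index, the inputs "star-lite tent", "Prod Nam" and "Elvis sunglasses" from the example above all resolve to the term Product.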
[00145] Initial search results show matching reports (as expected) with 2 new frames entitled: Hyper-Dimensional Topics and Related Topics, as shown in Figure 21.
[00146] Hyper-Dimensional Topics show the number of reports filtered by 2004 and United States. Other enterprise-wide dimensions are shown with the number of reports that contain some reference to that dimension. Clicking on any listed item shows a context menu that lists children and parents across all indexed content.
[00147] Related Topics shows parent and sibling dimensions that are related to current filters.
[00148] To find reports related to retailers and staff, she clicks the Retailer and Staff items in the Hyper-Navigation bar. This effectively searches for reports that deal with 2004 (selected previously), United States (selected previously), Retailer and Staff, creating a "topic crosstab". The new search results show the number of reports matching the selected criteria, as shown in Figure 22. The exact report she is looking for is listed first under the heading Matching Reports.
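The "topic crosstab" can be sketched as a set intersection over indexed report topics, with counts of the remaining topics among the matching reports. The report names and topic sets below are invented sample data, and the function is an assumption about how such counts could be produced.

```python
from collections import Counter


def topic_crosstab(reports, selected):
    """Find reports that carry every selected topic and count their other topics.

    reports maps a report name to the set of topics it references.
    Returns (matching report names, Counter of other topics among them).
    """
    wanted = set(selected)
    matching = sorted(name for name, topics in reports.items() if wanted <= topics)
    related = Counter(t for name in matching for t in reports[name] - wanted)
    return matching, related


reports = {
    "Retail Staff 2004": {"2004", "United States", "Retailer", "Staff"},
    "US Sales 2004": {"2004", "United States", "Products"},
    "Canada Staff": {"2004", "Canada", "Staff"},
}
matching, related = topic_crosstab(reports, ["2004", "United States"])
narrowed, _ = topic_crosstab(reports, ["2004", "United States", "Retailer", "Staff"])
```

Each additional clicked topic narrows the matching set, and the counts indicate how many reports remain under every other topic.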
[00149] An example of hyper-dimensional report creation is now described.
Hyper-Dimensional Navigation can be also used to create reports. This example shows a sales manager who wants to compare 2005 actual sales against his forecast for Projection TVs in US and Canada with detail breakdown by Order Method.
[00150] The sales manager begins at the top level of his enterprise Navigation Bar. It shows the most frequently referenced reporting categories across the entire enterprise. While this navigation bar has literally hundreds of topics, only the top seven items for the "Sales Manager Role" are shown in Figure 23.
[00151] As an alternative to typing search terms, e.g., "2004 United States Canada Projection TV", he simply clicks the down arrow symbol next to the related "hyper dimensional topics" shown. Under Products, he selects Projection TVs. Under Years he selects 2005. Under Distribution Channel he selects Order Method. Under Location he selects both United States and Canada. Search results show matching reports as shown in Figure 24.
[00152] He sees that no reports match his search criteria. Now he checks the topic Plan Versus Actual and clicks Create report to answer his question.
The metadata content management system 10 creates a report (using the Sales Manager business role that he selected last time) as shown in Figure 25.
[00153] Hyper-Dimensional navigation provides a concise definition of report items needed. Business Role also helps narrow the choices without asking too many questions.
[00154] Now report extensions are described. Consider the regional sales manager for GO Sporting Goods. As part of his quarterly expense activity, he needs to know: "Which salespeople exceeded their target last year?". This user goes to his "usual" set of reports that have been authored for him. Finding nothing that would appear to directly answer his question, he opens a report entitled Sales Revenue by Salesperson. He sees it has some of the information needed as shown in Figure 26.
[00155] Verifying that the information shown is correct and appropriate, he selects the columns of interest - in this case: Sales Year, Staff Name and Actual Revenue - and right-clicks to see available options.

[00156] As shown in Figure 27, he selects Show related content to launch the metadata content management system 10 and find all related reports. He sees the following search results as shown in Figure 28.
[00157] Seeing that none of the reports match his needs, he decides to extend his current report by clicking Extend Report with Related Data. He is asked to describe the extra information he wishes to see as shown in Figure 29.
He types search terms "Sales Target Exceeded" and clicks Finish.
[00158] A component of the metadata content management system 10 is started to look for models and reporting examples that match the terms "Sales", "Target", or "Exceeded". The metadata content management system 10 creates a new report specification from the model and metadata elements found. The report is run and the results are shown in Figure 30. The generated report has new columns: "Sales target" and "Exceeds target". The sales manager gets the numbers he was looking for.
[00159] Advanced users can launch Query Studio or Report Studio to fine tune the generated report. The user can bookmark this report for later.
[00160] The metadata content management system 10 performed this as follows. An element with a display title "Sales Target" is found in the model used most often by this user. It is deemed to be compatible with the original report's list-style format. Similarly, a calculation named "exceeds target" is found in a report that shows both actual revenue and sales target elements. This calculation is also deemed compatible. The metadata content management system 10 assembles the new report and displays the answer.
[00161] Visual Report Construction is now described. Visual Report Construction is a simple idea: a user views elements in two reports that he wants in a single report. Using a drag-and-drop gesture, he drags elements together. Visually it is a trivial drag-and-drop gesture. Under the covers, it is a variant of the Report Extension example shown previously.

[00162] In this example, consider again the regional sales manager for GO Sporting Goods. As part of his quarterly expense activity, he needs to know: "Which salespeople exceeded their target last year?". Once again he goes to his "usual" set of reports and opens Sales Revenue by Salesperson. He sees it has some of the information needed as shown in Figure 31. He opens another report, and sees it has some of the information he wants.
[00163] Dynamic Details are now described. Simple relational list reports are often created as authored drill-through targets in dimensional reporting tools, like PowerPlay. This example shows how the metadata content management system 10 can automatically produce these reports without authoring.
[00164] In this example, a product manager wants to know who provides customer support for a particular product at different US retailers. She is familiar with PowerPlay. She navigates to where she thinks the answer will be found as shown in Figure 32.
[00165] Realizing that this cube lacks the detail she needs, she clicks the PowerPlay drill-through button. PowerPlay recognizes that no authored drill-through actions exist for the selected cells, so it passes the request to the metadata content management system 10 component.
[00166] Having sufficient knowledge about the source cube location from the selected cells (Year=2004, Product=Canyon Mule Climber Backpack, Location=United States), the metadata content management system 10 asks her for two simple clarifications (which can optionally be defaulted in the future): "What business role are you currently performing?" and "What kind of details are you looking for?". She sees the page shown in Figure 33. From the custom options shown, she selects Product Manager and Staff Details. She clicks Finish to get her answer.
[00167] The metadata content management system 10 creates and runs a detail report that matches criteria in her chosen role using the original selected cell data as query values. She sees the results as shown in Figure 34. She verifies the report content by looking at title and column names. She sees her answer.
[00168] The metadata content management system 10 performed this example as follows. Filter information is extracted from the source PowerPlay report. It is combined with the caller's selected job role and detail template to create a report with appropriate content. Roles are defined by an administrator and users themselves. Detail templates are created by authoring real reports in Report Studio and optionally creating templates of the metadata content management system 10 when saving and updating.
[00169] Administration functions and user interface of the metadata content management system 10 are now described. The administration functions of the metadata content management system 10 are available only to users with administration capability.
[00170] Index Properties are viewed via a link on CM item property pages, as shown in Figure 38.
[00171] The item Type is Folder. The Open Folder link lets the user open the folder in a new window, as shown in Figure 39. Last Indexed shows the date/time of the most recent indexing on the folder itself (contained items will have different dates/times). Re-Index initiates indexing of the folder, then all of its content (this is a "deep index" operation: all contained folders, subfolders and items are indexed). Disable Indexing stops further indexing of this folder and all contained folders, subfolders and items. When first disabled, all contained index entries are removed. The indexing user name can be set as "inherited" (default for new items) or to a specific user. XML index data is shown (read only). OK saves changes and returns to the Index tab. Cancel returns without making any changes.

[00172] Figure 40 shows an example of Item Property Pages. Properties for content items are the same as folders with minor label changes. It may also show a Model Index tab for a metadata model related to this report.
[00173] Figure 41 shows an example of Model Property Pages. Properties for model items are the same as folders with minor label changes. The Model tab is shown by itself for Model list items. It is shown with an Indexing Options tab for reporting application content items (e.g., ReportNet content items).
[00174] General administration functions of the metadata content management system 10 can be launched from an application, such as Cognos Connection, using a link in a launch bar. This option may be shown only to users with Content Administration capabilities.
[00175] The opening page shows links available to open Terms, Alias, and Configuration functions.
[00176] The Terms tab shows usage for each term in the full-text index, as shown in Figure 42. The administrator can click a letterlnumber tab to move the Terms list to the first entry starting with that letter, type partial or complete Search text and click the Find button to search for a matching entry in the Terms list, click a term in the Terms window to show Documents and Subjects containing that term, and click a listed document name (in the Documents or Subjects window) to see its index information in a pop-up window.
[00177] The Alias tab allows synonyms to be created for common terms or phrases, as shown in Figure 43. It also allows Series 7 Names to be mapped to reporting application model (e.g., ReportNet model) item names.
[00178] Equivalent Phrases are initially built by linking indexed framework metadata models with corresponding labels in cubes (e.g., PowerPlay cubes) and reports (e.g., ReportNet reports). Referring elements in each application are used to build a list of common Phrases. The Alias tab lets the administrator edit, add and delete equivalence associations between these phrases. Phrases are shown by selecting the Show Equivalent Phrases drop-down item. By default, phrases may be displayed in alphabetic order.
[00179] The administrator can click a letter/number tab to move the Phrases list to the first entry starting with that letter, type partial or complete Search text and click the Find button to search for a matching entry in the Phrases list, and click an entry in the Phrases window to show Equivalent Phrases. All synonyms are listed and checked in both windows as appropriate.
[00180] Checking an additional entry in the Phrases window causes it to be added and checked in the Equivalent Phrases window. Unchecking an entry in the Phrases window causes it to be unchecked (but still displayed) in the Equivalent Phrases window. Unchecking the selected entry in the Equivalent Phrases window causes all linked entries to be unchecked in both windows, after the user confirms the unchecking.
[00181] The administrator can type a term in the Link new term: edit box, and click the Enter button to directly add a new synonym. If the term exists, the term will be listed and checked in both windows as appropriate. If the term does not exist in a document, the user will be prompted to confirm the addition and the term will be added and checked in both windows.
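The checking and unchecking behaviour of the Alias tab can be modelled as groups of equivalent phrases. The sketch below is one possible reading, assuming that equivalence is transitive (all linked phrases share one group); the `AliasTable` class and its method names are hypothetical.

```python
from typing import Dict, Set


class AliasTable:
    """Hypothetical store of equivalence associations between phrases."""

    def __init__(self) -> None:
        # Every phrase in a group maps to the same shared set object.
        self._group_of: Dict[str, Set[str]] = {}

    def equivalents(self, phrase: str) -> Set[str]:
        """Return a copy of the group containing `phrase` (itself if unlinked)."""
        return set(self._group_of.get(phrase, {phrase}))

    def link(self, selected: str, other: str) -> None:
        """Checking `other` while `selected` is highlighted merges their groups."""
        merged = self.equivalents(selected) | self.equivalents(other)
        for phrase in merged:
            self._group_of[phrase] = merged

    def unlink(self, selected: str, other: str) -> None:
        """Unchecking `other` removes it from the group of `selected`."""
        group = self._group_of.get(selected)
        if group is None or other == selected or other not in group:
            return
        group.discard(other)
        self._group_of.pop(other, None)
        if len(group) == 1:  # a group of one is no longer an association
            self._group_of.pop(selected, None)

    def unlink_all(self, selected: str) -> None:
        """Unchecking the selected entry itself dissolves its whole group."""
        for phrase in list(self._group_of.get(selected, ())):
            self._group_of.pop(phrase, None)
```

Adding a synonym through the Link new term: edit box would correspond to a call to `link` with the highlighted phrase and the typed term.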
[00182] Other sort orders may include: most common, least common, in most reports and in least reports. In these modes, the letter/number tabs are hidden and no letter/number separators are shown in the Phrases window.
[00183] Figure 44 shows an example of mapping of Dimension and Category names to metadata model item names (e.g., ReportNet Framework Model Item names). Mapped names are shown by selecting the Show Model Names drop-down item. By default, phrases may be displayed in alphabetic order. Other sort orders may include: most common, least common, in most reports and in least reports.
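The sort orders listed above can be sketched as sort keys over per-phrase usage statistics. This is an illustration only: the `Usage` record, the interpretation of "most common" as total occurrences, and the interpretation of "in most reports" as the number of distinct reports are assumptions, not details given in the specification.

```python
from typing import Dict, List, NamedTuple


class Usage(NamedTuple):
    occurrences: int   # assumed: total times the phrase appears in the index
    report_count: int  # assumed: number of distinct reports containing it


def sort_phrases(usage: Dict[str, Usage], order: str = "alphabetic") -> List[str]:
    """Return phrase names in the requested order; ties fall back to alphabetic."""
    keys = {
        "alphabetic": lambda p: p.lower(),
        "most common": lambda p: (-usage[p].occurrences, p.lower()),
        "least common": lambda p: (usage[p].occurrences, p.lower()),
        "in most reports": lambda p: (-usage[p].report_count, p.lower()),
        "in least reports": lambda p: (usage[p].report_count, p.lower()),
    }
    return sorted(usage, key=keys[order])
```

Only the alphabetic order produces runs of entries sharing a first letter, which is consistent with hiding the letter/number tabs in the other modes.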

[00184] The administrator can click a letter/number tab to move the Model Item Name list to the first entry starting with that letter, type partial or complete Search text and click the Find button to search for a matching entry in the Model Item Name list, select (highlight) an entry in the Model Item Name window to show equivalent Names, which causes all linked items to be listed and checked, uncheck an entry in the Name window, which causes it to be unlinked from the highlighted Model Item Name, and click the Add Name toolbar button or the Add Names button to open the Add Name dialog.
[00185] The Add Name dialog shows all indexed dimension and category names, as shown in Figure 45. The administrator can click a letter/number tab to move the Name list to the first entry starting with that letter, type partial or complete Search text and click the Find button to search for a matching entry in the Name list, and check desired equivalent names.
[00186] The metadata content management system 10 finds content using simple terms and phrases. Multi-faceted navigation aids refine searches using business terminology related specifically to the customer's enterprise. When content does not exist, the metadata content management system 10 seamlessly creates a "made-to-order" report, with the help of the customer's enterprise business terminology.
[00187] Thus, the metadata content management system 10 allows users to find relevant BI content using simple term or phrase searches. With the help of multi-faceted navigation aids, which use business terminology related specifically to a particular enterprise or organization, the users can refine searches to zero in on the answers they need. When content does not exist for particular search criteria, the metadata content management system 10 seamlessly creates a "made-to-order" report. Business terminology from the user's enterprise or organization helps the users refine their report to meet their requirements. Report creation is transparent.

[00188] The metadata content management system 10 allows users to search their business-oriented metadata and reporting applications. It adds a flexible "dynamic drilling" family of features, hyper-dimensional navigation, to the users' applications. The "context awareness" feature allows seamless navigation to relevant related content. A search-oriented interface allows reports to be found and run more easily and efficiently. It can also use role-based reporting components in reporting applications. The metadata content management system 10 finds answers in existing reports and creates custom reports as needed. The metadata content management system 10 provides dynamic report construction. The metadata content management system 10 creates reports directly from search terms.
[00189] The hyper-dimensional navigation lets users navigate their cubes and reports at one time, using the dimension metaphor of their reporting applications. Fast searching combined with intrinsic knowledge of key enterprise reporting elements allows the metadata content management system 10 to build multi-dimensional hierarchies on-the-fly. This dynamic structure is used to show a hyper-dimensional view of all enterprise content that can be navigated like a PowerPlay cube.
[00190] The metadata content management system of the present invention may be implemented by any hardware, software or a combination of hardware and software having the above described functions. The software code, instructions and/or statements, either in its entirety or a part thereof, may be stored in a computer readable memory. Further, a computer data signal representing the software code, instructions and/or statements may be embedded in a carrier wave and transmitted via a communication network. Such a computer readable memory and a computer data signal and/or its carrier are also within the scope of the present invention, as well as the hardware, software and the combination thereof.
[00191] While particular embodiments of the present invention have been shown and described, changes and modifications may be made to such embodiments without departing from the scope of the invention. For example, the elements of the metadata content management system are described separately; however, two or more elements may be provided as a single element, or one or more elements may be shared with other components in one or more computer systems.

Claims (16)

What is claimed is:
1. A business taxonomy management system comprising:

a content scanner for reading source metadata documents containing business reporting metadata, and for building a knowledge base representation of the metadata;

knowledge base documents containing the knowledge base representation for the metadata;

a taxonomy engine for indexing terms in the knowledge base documents.
2. The business taxonomy management system as claimed in claim 1 wherein the content scanner builds the knowledge base representation of the metadata including a unique document identifier, a document date, a structured hierarchy of reporting elements from the source metadata document, one or more database queries used in each structured reporting element, and/or linkages to other structured reporting elements in the source metadata document and other business reporting metadata documents.
3. The business taxonomy management system as claimed in claim 1 wherein each knowledge base document is encoded in Extensible Markup Language (XML) and stored in a system data file.
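By way of illustration only, a knowledge base document of the kind recited in claims 2 and 3 could be encoded as follows. This is a minimal sketch: the element and attribute names (`knowledgeBaseDocument`, `element`, `query`, `link`) and the input dictionary layout are hypothetical and are not taken from the specification.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_kb_document(doc_id: str, doc_date: str,
                      elements: List[Dict], links: List[str]) -> str:
    """Encode one knowledge base document as an XML string."""
    root = ET.Element("knowledgeBaseDocument", id=doc_id, date=doc_date)

    def add(parent: ET.Element, node: Dict) -> None:
        # One <element> per structured reporting element, nested to keep the hierarchy.
        el = ET.SubElement(parent, "element", name=node["name"])
        for query in node.get("queries", []):
            ET.SubElement(el, "query").text = query
        for child in node.get("children", []):
            add(el, child)

    for node in elements:
        add(root, node)
    for target in links:
        # Linkage to another reporting element or metadata document.
        ET.SubElement(root, "link", target=target)
    return ET.tostring(root, encoding="unicode")
```

The returned string could then be written to a system data file.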
4. The business taxonomy management system as claimed in claim 1 wherein the taxonomy engine provides a taxonomy for indexed terms that are parent topic terms, sibling topic terms or descendent topic terms.
5. The business taxonomy management system as claimed in claim 1 wherein the taxonomy engine creates the taxonomy dynamically for each term provided in the knowledge base documents.
6. A method for managing business taxonomy comprising the steps of:

reading each business reporting metadata document containing business reporting metadata;

building a knowledge base representation for the metadata;

building one or more knowledge base documents containing the knowledge base representation for the metadata;

indexing terms in the knowledge base documents; and

providing a taxonomy for each indexed term.
7. The method as claimed in claim 6 wherein the taxonomy providing step provides a taxonomy for each of indexed terms that are parent topic terms, sibling topic terms or descendent topic terms.
8. The method as claimed in claim 6 wherein the taxonomy providing step comprises the step of determining parent topic terms from a given term by:

finding knowledge base documents that contain the given term;

finding structured elements in the matching documents where the given term is used;

getting all structured parent elements relative to the structured element for each matching term; and

returning a list of names from elements found across all matching documents.
9. The method as claimed in claim 6 wherein the taxonomy providing step comprises the step of determining sibling topic terms from a given term by:

finding knowledge base documents that contain the given term;

finding structured elements in the matching documents where the given term is used;

getting all structured sibling elements relative to the structured element for each matching term; and

returning a list of names from elements found across all matching documents.
10. The method as claimed in claim 6 wherein the taxonomy providing step comprises the step of determining child topic terms from a given term by:

finding knowledge base documents that contain the given term;

finding structured elements in the matching documents where the given term is used;

getting all structured child elements relative to the structured element for each matching term; and

returning a list of names from elements found across all matching documents.
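By way of illustration only, the steps recited in claims 8 to 10 can be sketched over XML knowledge base documents. The sketch assumes that each structured element is an `element` node with a `name` attribute (hypothetical names) and that "parent" means the immediate parent; the function `related_terms` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List


def related_terms(documents: Iterable[str], term: str) -> Dict[str, List[str]]:
    """Collect parent, sibling and child topic names for `term` across documents."""
    parents, siblings, children = set(), set(), set()
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        # ElementTree has no parent pointers, so build a child -> parent map.
        parent_of = {child: parent for parent in root.iter() for child in parent}
        for el in root.iter("element"):
            if el.get("name") != term:
                continue  # not a structured element where the term is used
            children.update(c.get("name") for c in el
                            if c.tag == "element" and c.get("name"))
            parent = parent_of.get(el)
            if parent is None:
                continue
            if parent.tag == "element" and parent.get("name"):
                parents.add(parent.get("name"))
            siblings.update(s.get("name") for s in parent
                            if s.tag == "element" and s is not el and s.get("name"))
    return {
        "parents": sorted(parents),
        "siblings": sorted(siblings),
        "children": sorted(children),
    }
```

Documents that do not contain the term contribute nothing, so the returned lists cover only matching documents.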
11. A synonymous management system comprising:

a content scanner for reading source metadata documents containing business reporting metadata, and for building a knowledge base representation of the metadata;

knowledge base documents containing the knowledge base representation for the metadata;

a synonymous and example engine for building a searchable index corpus that includes synonymous and exemplar terms in the knowledge base documents.
12. The synonymous management system as claimed in claim 11 wherein the synonymous and example engine identifies terms in the metadata documents, locates documents with the terms, finds term isomorphisms for the given search terms, and creates logical associations between terms.
13. A method for managing synonymous and examples for business oriented metadata comprising the steps of:

reading each business reporting metadata document containing business reporting metadata;

searching knowledge base documents for a given term;

creating a list of documents with the given term;

finding term isomorphisms for the given search term; and

creating logical associations between the terms.
14. An index population system comprising:

a card generator for creating references to targeted content and producing index business card content references to represent content instances, and index business cards for containing summaries of referenced content instances for populating an external search engine with references to content so that the content can be found by the external search engine.
15. The index population system as claimed in claim 14 wherein the summaries include one or more of terms, topic hierarchies, report metadata, related information and URIs needed to show the content.
16. A method for populating indexes to one or more external search engines, the method comprising the steps of:

reading business oriented content;

creating references to targeted content;

producing index business card content references to represent content instances; and

generating index business cards for containing summaries of referenced content instances.
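By way of illustration only, an index business card carrying the summary fields listed in claim 15 could be generated as follows. The `IndexBusinessCard` class, its field names, and the choice of JSON as the card format are assumptions made for this sketch; this section does not specify a card format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Iterable, List


@dataclass
class IndexBusinessCard:
    """Hypothetical summary of one referenced content instance."""
    uri: str                      # URI needed to show the content
    terms: List[str]              # indexed terms found in the content
    topic_hierarchy: List[str] = field(default_factory=list)
    report_metadata: Dict[str, str] = field(default_factory=dict)
    related: List[str] = field(default_factory=list)


def generate_cards(content_items: Iterable[Dict]) -> List[str]:
    """Produce one serialized card per content instance for an external engine."""
    return [
        json.dumps(asdict(IndexBusinessCard(**item)), sort_keys=True)
        for item in content_items
    ]
```

Each serialized card would then be exposed to the external search engine so that the referenced content can be found through it.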
CA002514165A 2005-07-29 2005-07-29 Metadata content management and searching system and method Abandoned CA2514165A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA002514165A CA2514165A1 (en) 2005-07-29 2005-07-29 Metadata content management and searching system and method

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
CA002514165A CA2514165A1 (en) 2005-07-29 2005-07-29 Metadata content management and searching system and method
CA002545232A CA2545232A1 (en) 2005-07-29 2006-04-28 Method and system for creating a taxonomy from business-oriented metadata content
CA002545237A CA2545237A1 (en) 2005-07-29 2006-04-28 Method and system for managing exemplar terms database for business-oriented metadata content
CA002545366A CA2545366A1 (en) 2005-07-29 2006-04-28 Method and system for populating an index corpus to a search engine
US11/494,974 US7885918B2 (en) 2005-07-29 2006-07-28 Creating a taxonomy from business-oriented metadata content
US11/494,936 US7873670B2 (en) 2005-07-29 2006-07-28 Method and system for managing exemplar terms database for business-oriented metadata content

Publications (1)

Publication Number Publication Date
CA2514165A1 (en) 2007-01-29

Family

ID=37696155

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002514165A Abandoned CA2514165A1 (en) 2005-07-29 2005-07-29 Metadata content management and searching system and method

Country Status (1)

Country Link
CA (1) CA2514165A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8126912B2 (en) * 2008-06-27 2012-02-28 Microsoft Corporation Guided content metadata tagging for an online content repository
WO2010108261A1 (en) * 2009-03-21 2010-09-30 Matthew Oleynik Systems and methods for research database management
US8694535B2 (en) 2009-03-21 2014-04-08 Matthew Oleynik Systems and methods for research database management

Legal Events

Date Code Title Description
FZDE Dead