GB2503549A - Automatically associating tags with files in a computer system using search keywords. - Google Patents

Automatically associating tags with files in a computer system using search keywords.

Info

Publication number
GB2503549A
Authority
GB
United Kingdom
Prior art keywords
file
files
user
tag
tags
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1307488.5A
Other versions
GB201307488D0 (en
Inventor
Joseph Saib
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AppSense Ltd
Original Assignee
AppSense Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AppSense Ltd
Publication of GB201307488D0
Publication of GB2503549A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/164 File meta data generation

Abstract

Disclosed is a method of automatically associating tags with files in a computer system. The method comprises receiving a search request from a user containing a search keyword; retrieving results including files responsive to the search request for presentation to the user; receiving file information and access information about the files, wherein the access information indicates whether the files have been previously accessed by the user; selecting at least one eligible file from the files based on the access information and the file information; identifying a tag based on the search keyword, the access information, and the file information; tagging the eligible file with the tag by associating the tag with the eligible file; and storing the association of the tag with the eligible file. The method may include retrieving results from a file system controlled by a second user.

Description

SYSTEMS AND METHODS FOR AUTOMATICALLY
ASSOCIATING TAGS WITH FILES IN A COMPUTER SYSTEM
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is related to the following applications, filed herewith and hereby incorporated by reference: "Systems and Methods for Providing Data-Driven Document Suggestions", US Application No. 13/457,136, and "Systems and Methods for Mining Organizational Data To Form Social Networks", US Application No. 13/457,158.
BACKGROUND
Technical Field
[0002] Disclosed systems and methods relate to automatically associating tags with files in a computer system.
Description of the Related Art
[0003] Files in a computer system are often retrieved using search as a method for identifying the file, particularly in large collections of files where specifically identifying the file by file storage location becomes difficult. Search has limitations as well, stemming from its use of text strings to identify files, as many files contain identical or similar strings of text. This causes a large number of results to be returned for certain searches performed by a standard text search engine.
[0004] Some search systems attempt to mitigate the problem of ineffective textual searches by adding additional information to the files in the form of "tags." Tags are short strings of text that are assigned to individual files or chunks of content such as metadata.
More than one tag may be assigned to a file. The assigned tags are chosen informally and personally by the user of the system, and do not necessarily relate to the file's location in a hierarchical storage system. Tagging was popularized by its use on the Web by websites such as Flickr and weblogs using the WordPress content management system.
[0005] However, tags still require a user to manually designate and apply tags. As with other types of metadata, the disadvantage of using tags is that the tags must be applied. Further, tags are often idiosyncratic and specific to a user. It is difficult to automatically assign tags based on the contents of a document because such automatically-generated tags may not correspond to a user's specific preferences.
Additionally, in a corporate environment, it may be impractical to apply tags to a large number of potential documents.
[0006] Therefore, there is a need in the art to provide alternative tagging systems for use on intranets and other networks. In particular, there is a need in the art to provide systems and methods that allow different users in an organization to perform tagging of documents on an intranet document storage system.
[0007] Accordingly, it is desirable to provide methods and systems that overcome these and other deficiencies of the related art.
SUMMARY
[0008] In accordance with the disclosed subject matter, systems and methods are provided for automatically associating tags with files in a computer system.
[0009] The disclosed subject matter includes a method for automatically associating tags with files in a computer system, the method comprising: receiving a search request from a user containing a search keyword; retrieving results responsive to the search request for presentation to the user, including one or more files; receiving file information and access information about the one or more files during a tagging process; selecting, during the tagging process, at least one eligible file and at least one tag based on the search keyword; and tagging the at least one eligible file with the at least one tag by associating each tag with each eligible file, wherein the selection of the at least one eligible file is based on the search keyword and the results responsive to the search request, thereby not requiring the user to manually associate files and tags.
[0010] In accordance with the disclosed method, the access information indicates whether the user has previously opened, copied, modified, or shared the one or more files.
The tag comprises at least one of the search keyword and a related term derived from the search keyword. The file information comprises one of a filename and a storage location.
The file information comprises a file location that is similar to that of the one or more files. Retrieving the results includes the one or more files responsive to the search request from a file system controlled by a second user for presentation to the user. Identifying the at least one tag is performed by evaluating numeric scores representing relevance.
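By way of illustration only, the method outlined above may be sketched as follows. The `FileRecord` structure, the substring search, and the eligibility rule (the searching user has previously accessed the file) are assumptions made for this sketch and are not prescribed by the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical record combining file information and access information."""
    filename: str
    location: str
    accessed_by: set = field(default_factory=set)  # users who opened, copied, modified, or shared
    tags: set = field(default_factory=set)


def search(files, keyword):
    """Return files whose filename or location contains the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [f for f in files if kw in f.filename.lower() or kw in f.location.lower()]


def select_eligible(results, user):
    """Assumed eligibility rule: the user has previously accessed the file."""
    return [f for f in results if user in f.accessed_by]


def tag_from_search(files, user, keyword):
    """Run the search, select eligible files, and store the keyword as a tag on each."""
    results = search(files, keyword)
    eligible = select_eligible(results, user)
    for f in eligible:
        f.tags.add(keyword.lower())
    return results, eligible
```

In this sketch the association is stored on the record itself; an embodiment could equally keep it in a separate metadata store.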
[0011] The disclosed subject matter includes a system for providing document tagging in a communications network, the system comprising: one or more interfaces configured to provide communication with a server via a communication network; and a processor, in communication with the one or more interfaces, and configured to run a module stored in memory that is configured to: receive a search request from a user containing a search keyword; retrieve results including one or more files responsive to the search request for presentation to the user; receive file information and access information about the one or more files, wherein the access information indicates whether the one or more files has been previously accessed by the user; select at least one eligible file from the one or more files based on at least one of the access information and the file information; identify at least one tag based on at least one of the search keyword, the access information, and the file information; tag the eligible file with the tag by associating the tag with the eligible file; and store the association of the tag with the eligible file.
[0012] In accordance with the disclosed system, the access information indicates whether the user has previously opened, copied, modified, or shared the one or more files.
The tag comprises at least one of the search keyword and a related term derived from the search keyword. The file information comprises one of a filename and a storage location.
The file information comprises a file location that is similar to that of the one or more files. The processor is configured to retrieve the results including the one or more files responsive to the search request from a file system controlled by a second user for presentation to the user. The processor is configured to identify the at least one tag by evaluating numeric scores representing relevance.
[0013] The disclosed subject matter includes a non-transitory computer-readable medium having executable instructions operable to cause a device to: receive a search request from a user containing a search keyword; retrieve results including one or more files responsive to the search request for presentation to the user; receive file information and access information about the one or more files, wherein the access information indicates whether the one or more files has been previously accessed by the user; select at least one eligible file from the one or more files based on at least one of the access information and the file information; identify at least one tag based on at least one of the search keyword, the access information, and the file information; tag the eligible file with the tag by associating the tag with the eligible file; and store the association of the tag with the eligible file.
[0014] In accordance with the disclosed medium, the access information indicates whether the user has previously opened, copied, modified, or shared the one or more files.
The tag comprises at least one of the search keyword and a related term derived from the search keyword. The file information comprises one of a filename and a storage location.
The file information comprises a file location that is similar to that of the one or more files. The device is operable to retrieve the results including the one or more files responsive to the search request from a file system controlled by a second user for presentation to the user. The device is also operable to identify the at least one tag by evaluating numeric scores representing relevance.
[0015] There has thus been outlined, rather broadly, the features of the disclosed subject matter in order that the detailed description thereof that follows may be better understood, and in order that the present contribution to the art may be better appreciated. There are, of course, additional features of the disclosed subject matter that will be described hereinafter and which will form the subject matter of the claims appended hereto.
[0016] In this respect, before explaining at least one embodiment of the disclosed subject matter in detail, it is to be understood that the disclosed subject matter is not limited in its application to the details of construction and to the arrangements of the components set forth in the following description or illustrated in the drawings. The disclosed subject matter is capable of other embodiments and of being practiced and carried out in various ways. Also, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting.
[0017] As such, those skilled in the art will appreciate that the conception, upon which this disclosure is based, may readily be utilized as a basis for the designing of other structures, methods and systems for carrying out the several purposes of the disclosed subject matter. It is important, therefore, that the claims be regarded as including such equivalent constructions insofar as they do not depart from the spirit and scope of the disclosed subject matter.
[0018] These together with the other objects of the disclosed subject matter, along with the various features of novelty which characterize the disclosed subject matter, are pointed out with particularity in the claims annexed to and forming a part of this disclosure. For a better understanding of the disclosed subject matter, its operating advantages and the specific objects attained by its uses, reference should be had to the accompanying drawings and descriptive matter in which there are illustrated preferred embodiments of the disclosed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] Various objects, features, and advantages of the disclosed subject matter can be more fully appreciated with reference to the following detailed description of the disclosed subject matter when considered in connection with the following drawings, in which like reference numerals identify like elements.
[0020] FIG. 1 is a network connectivity diagram of a networked system in accordance with some embodiments of the invention.
[0021] FIG. 2 is a flow diagram of automatically tagging documents in accordance with certain embodiments of the invention.
[0022] FIG. 3 illustrates a block diagram of a client device in accordance with certain embodiments of the invention.
[0023] FIG. 4 illustrates a block diagram of a server device in accordance with certain embodiments of the invention.
DETAILED DESCRIPTION
[0024] In the following description, numerous specific details are set forth regarding the systems and methods of the disclosed subject matter and the environment in which such systems and methods may operate, etc., in order to provide a thorough understanding of the disclosed subject matter. It will be apparent to one skilled in the art, however, that the disclosed subject matter may be practiced without such specific details, and that certain features, which are well known in the art, are not described in detail in order to avoid complication of the subject matter of the disclosed subject matter. In addition, it will be understood that the examples provided below are only for examples, and that it is contemplated that there are other systems and methods that are within the scope of the disclosed subject matter.
[0025] Users of present-day computer systems often use arbitrary textual keywords called tags as metadata for documents or arbitrary content objects. These tags help describe the document and allow it to be found later by browsing or searching. Tags do not need to be related to the content of the document; instead, they are chosen informally by the user of the system to facilitate understanding and retrieval. For this reason, often a document will have multiple tags, and several of these tags may be different synonyms for the same general concept. This reduces the need for a user to subsequently remember the exact phrasing used in the document for purposes of later retrieval.
[0026] Tags became popular as a result of their use on various websites. They possess many advantages. One advantage of tags is that they can be simply visualized to facilitate browsing and retrieval. For any arbitrary number of documents, all tags used by those documents can be listed to provide a simple visualization. Often, multiple documents share the same tag, and an additional layer of information can thus be made visible by increasing the size of the text for tags that are presented more than once in the collection of documents. The resultant visualization is called a tag cloud; it is easy to create, visually appealing, and allows a user to browse a collection of documents.
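As a non-limiting illustration, a tag cloud of the kind described above may be computed by counting how many documents carry each tag and scaling a text size accordingly. The function name, the linear scaling, and the default point sizes are assumptions of this sketch.

```python
from collections import Counter


def tag_cloud(documents, min_size=10, max_size=30):
    """Map each tag to a text size that grows with the number of documents using it.

    documents maps a document name to an iterable of its tags.
    """
    counts = Counter(tag for tags in documents.values() for tag in set(tags))
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for tag, n in counts.items():
        # Linear interpolation between min_size and max_size;
        # every tag gets min_size when all counts are equal.
        scale = (n - low) / span if span else 0.0
        sizes[tag] = round(min_size + scale * (max_size - min_size))
    return sizes
```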
[0027] Another advantage of tagging in a document system is that unlike a file system that organizes files hierarchically, there is no explicit information about the meaning of each tag. Another advantage is that multiple tags may be applied to the same document.
In storage systems where documents are organized in hierarchical file systems, it may be difficult or impossible to create an organization that is useful for all users. Tagging allows multiple users' organizational systems to coexist, facilitating document retrieval for all users.
[0028] Tagging thus depends heavily on users to identify appropriate and relevant tags. This can be considered a disadvantage of tagging. Often, applying tags to files is a repetitive and menial process. For example, if a user wishes to apply a tag to more than one file, the process may require repeating the tagging process for each file. If the files or the tags differ in one or more ways, the user must spend considerable attention applying the correct tags to each document. If more than one copy of the document exists on the system, the user may be required to tag all copies of the document. Further, it is likely that not all documents in the system are tagged. This may result in the user constantly encountering documents that need to be tagged, and tagging these documents may cause repeated interruptions to the user's workflow.
[0029] Additionally, although many users may spend the time needed to designate and apply such tags, not all users may do so. It is therefore one objective of this invention to allow users to apply tags with less effort, thereby increasing the number of users that apply tags, and improving the ability of all users to retrieve files as a result.
[0030] The present disclosure describes a method for automatically inferring tags for documents in a storage system. In some embodiments, the system may use information from other files, or other information, to infer tags and to retrieve documents based on the tags, without the user explicitly assigning the tags to each retrieved document. In some embodiments, the system may interpret a search based on search terms to automatically assign tags to documents based on the search terms. In some embodiments, the system may use tags assigned by other users on the system to infer tags for the current user.
[0031] Tags constitute metadata associated with files. In some embodiments, other metadata associated with the file may be used to infer tags. For example, tags may automatically be assigned to a document based on the modification date, creation date, or other date associated with a document. Tags may automatically be assigned to a document based on the contents of the document, such as in a system that performs statistical analysis of text in a document. Tags may automatically be assigned to a document based on the file type of a document, such as tags for a photo, video, or other document. Tags may automatically be assigned to a document based on where the document is stored in a hierarchical document storage system. In each of these examples, the tags may be assigned by the system during periods of user inactivity, or the tags may be dynamically inferred at the time a user performs a search, without explicit assignment of the tags to the document and storage of the tags in a metadata store associated with the document. A subsequent search will return both documents with inferred tags and documents with explicit tags.
[0032] For example, Joe may have pictures of his dog, Bubbles, stored in a directory titled "Pictures of my dog." The documents in this directory may automatically be tagged "dog," "Bubbles," "pictures," "photos," and "my dog," based on the information that can be gleaned from the directory structure. If the photos were taken recently, they could be tagged with the date, or with a human-readable tag related to the date, such as "last week."
[0033] These tags could also be inferred at the time that Joe searches for these pictures. The system may retrieve results for a search, and then attempt to identify inferred tags based on the search terms, characteristics of the files retrieved by the search, or other factors. The system may perform an additional search based on the inferred tags, thereby retrieving additional documents that match the user's implied criteria. The inferred tags may be saved for future use by the system in a metadata store, associated with the documents retrieved by the search. In some embodiments, the user may be given the opportunity to confirm/accept or reject the association of the inferred tags to the retrieved documents. A subsequent search for the same search terms will return documents with the inferred tags, in addition to documents retrieved by the search based on their contents. In some embodiments, this subsequent search may be performed at the same time as the initial search, thereby augmenting all searches with the results of searching on inferred tags.
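The directory-based and date-based inference in the example of Joe's pictures may be sketched as follows. This is a minimal illustration only: the stop-word list, the seven-day window for the "last week" tag, and the function name are assumptions, and the sketch does not attempt to derive tags such as "Bubbles" that are not present in the path.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath

STOP_WORDS = {"of", "my", "the", "a", "an", "and"}  # assumed, illustrative list


def infer_tags(path, modified, now):
    """Infer tags from the parent directory names and the modification date."""
    tags = set()
    for part in PurePosixPath(path).parent.parts:
        if part == "/":
            continue  # skip the filesystem root
        words = part.lower().split()
        tags.update(w for w in words if w not in STOP_WORDS)
    if timedelta(0) <= now - modified <= timedelta(days=7):
        tags.add("last week")  # human-readable tag related to the date
    return tags
```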
[0034] In some embodiments, tags explicitly assigned by the user may be handled differently than tags implicitly inferred by the system. For example, these tags may be given different weights. In some embodiments, a weight may be assigned to each inferred tag based on a level of confidence of the system in the specific inferred tag. This may be useful when some tags are based on predictive analysis of the contents of the documents, as such analysis may not always be correct. The level of confidence of the system in a specific inferred tag may consist of numeric weights, coefficients or scores, and may be based on one or more factors, including the specific data or metadata used to identify the tag, the user or users whose action was used to identify the tag, etc.
[0035] In some embodiments, the system may take into account documents tagged by other users. The documents may exist on a user's local machine or on a network server, or on a cloud store, or external to the network server, or on more than one of the above, or elsewhere. A search on these documents may incorporate the searching user's own tags, and may also incorporate tags assigned by other users. The search may also incorporate inferred tags, both from the searching user or from other users. The following method may be used to provide this functionality.
[0036] In some embodiments, a tag server is located on a network, and is accessible to two or more users on a system. The tag server includes a correlation module that serves to correlate tags assigned to documents by one user with tags assigned to the same documents by another user. The correlation module may also provide suggestions for inferred tags to one user based on tags assigned by, or inferred from tags assigned by, the other user. The tag server may also communicate tags assigned by one user to the other user. Storage of tags and search capability may be offered at the tag server, as well as on the computers of the two respective users.
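One possible form of the correlation module is sketched below for illustration. It counts how often a tag of one user and a tag of another user appear on the same document, and suggests the second user's tags wherever the first user's correlated tag is present. The co-occurrence counting, the `min_count` threshold, and the function names are assumptions of this sketch rather than features recited in the disclosure.

```python
from collections import defaultdict


def correlate(tags_a, tags_b):
    """Count how often a tag from user A and a tag from user B share a document."""
    pairs = defaultdict(int)
    for doc in tags_a.keys() & tags_b.keys():
        for ta in tags_a[doc]:
            for tb in tags_b[doc]:
                pairs[(ta, tb)] += 1
    return dict(pairs)


def suggest(tags_a, tags_b, min_count=2):
    """Suggest to user A the tags of user B that co-occur with A's tags often enough."""
    related = defaultdict(set)
    for (ta, tb), n in correlate(tags_a, tags_b).items():
        if n >= min_count and ta != tb:
            related[ta].add(tb)
    suggestions = {}
    for doc, tags in tags_a.items():
        extra = set()
        for t in tags:
            extra |= related.get(t, set())
        extra -= set(tags)  # do not suggest tags the document already has
        if extra:
            suggestions[doc] = extra
    return suggestions
```

Note that a suggestion may reach a document the other user never tagged, which corresponds to an inferred tag in the sense used above.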
[0037] Leveraging tags assigned by a large number of users within an organization has the ability to dramatically reduce manual assignment of tags by each individual user.
This method may thus be adapted for use in a system where users are connected to each other in a social network. This social network may automatically be derived from a directory server, as described in U.S. Patent Application No. 13/457,158, "Systems and Methods for Mining Organizational Data to Form Social Networks," filed April 26, 2012, which is hereby incorporated by reference.
[0038] In some embodiments, tags may be automatically assigned or inferred based on information from the public Internet or on the organizational intranet. The Internet contains a number of resources, such as: websites that provide news stories related to search terms, such as Google News (http://news.google.com/); websites that provide web searches for search terms, such as Google (http://www.google.com/); websites that provide dictionary definitions and thesaurus entries, such as Dictionary.com (http://www.dictionary.com); and websites that provide lexical databases for common words, such as WordNet (http://wordnet.princeton.edu/); or any other suitable website.
[0039] These public resources may be accessed to enhance the system's understanding of search terms by obtaining lists of related words and phrases to be used as tags, either inferred or explicitly applied. For example, a search for a document with the term "transport" may be augmented by searching on WordNet for the search term "transport," which results in several additional terms being returned, including "conveyance," "carry," "shipping," "transmit," "transfer," "ship," and others. These terms could be incorporated as inferred tags at a tag server.
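The augmentation of search terms with related terms may be sketched as follows. The in-memory `THESAURUS` mapping is a hypothetical stand-in for a lexical resource such as WordNet; its single entry reproduces the "transport" example given above, and no actual WordNet interface is shown.

```python
# Hypothetical stand-in for a lexical suggestion service such as WordNet.
# The entry below reproduces the "transport" example from the description.
THESAURUS = {
    "transport": ["conveyance", "carry", "shipping", "transmit", "transfer", "ship"],
}


def expand_terms(search_terms, thesaurus=THESAURUS):
    """Return the search terms followed by their related terms, without duplicates."""
    expanded = []
    for term in search_terms:
        for candidate in [term.lower()] + thesaurus.get(term.lower(), []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

The returned list keeps the user's own term first, so an embodiment could weight it above the related terms.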
[0040] Retrieved results are parsed into tags, which may be used as inferred tags or explicitly assigned to documents, and the tags may be associated with the files retrieved by the original search terms in a centralized database at a network server.
[0041] Other resources may be available on the local intranet, such as equivalents of the resources described above, as well as other search and textual analysis tools that have access to databases that are privately maintained by a corporation or organization. Such databases may contain username and organizational information, like a directory server or lightweight directory access protocol (LDAP) server; may index private data and provide search services, such as a search appliance provided by Google or intranet search software provided by Autonomy, Verity, Endeca, or Microsoft SharePoint; or may provide access to customer relationship management (CRM) databases such as Siebel and PeopleSoft databases, enterprise resource planning (ERP) databases such as SAP databases, archives of email and instant messaging traffic, and other systems. By performing a search for a set of search terms requested by a user in one or more of these intranet databases and reformatting or parsing the results, a set of term suggestions may be obtained that relate to the user's needs in the context of the organization.
[0042] User interaction may be incorporated to improve the accuracy of the term suggestion system, in some embodiments. For example, in the example above for the term "transport," the user may be presented with a list of documents that match the initial search term, with the additional terms suggested by WordNet listed alongside each document. The user may click the terms that apply to each document to indicate which terms should be used as tags for that document. As another example, when showing a user results for the search term "greyhound," the user may be presented with a grid of images that are returned by the search, but are not yet tagged with the term "greyhound." Once the user selects or clicks one or more of the displayed images in the grid, the selected documents may be explicitly tagged with the term.
[0043] FIG. 1 is a network connectivity diagram of a networked system in accordance with some embodiments of the invention. Network system 100 is a client/server system, in which at least one client 101 (e.g., devices 101-1, 101-2, ... 101-n), tagging server 102, and file server 103 communicate via a communication network 104. Tagging server 102 communicates with lexical suggestion server 105 via communication network 104.
Device 101 is a mobile device or user-operated device associated with a user. Device 101 can be any suitable device, including desktop computers, mobile computers, tablet computers, and cellular phones, including smartphones (e.g., Apple iPhones, RIM BlackBerry devices, or Android-based smartphones). Users use device 101 to perform searches for documents and files on file server 103. When searches are performed on file server 103, device 101 also communicates with tagging server 102, which analyzes the search terms for the search and explicitly or implicitly assigns tags according to the disclosed embodiments of the invention.
[0044] File server 103 retrieves the requested documents and files, and sends them to device 101. File server 103 may be a standard Microsoft Windows file server, web server, WebDAV server, or other file server, in which case tagging server 102 may provide proxy capability to intercept requests for files before they are sent to the file server. File server 103 directly sends the files to the devices 101, in some embodiments. Tagging server 102 may provide search capability, and this search capability may be provided via a webpage served by tagging server 102, or via an enterprise search application such as Autonomy, or a document management system such as iManage WorkSite, or another system, according to some embodiments.
[0045] Tagging server 102 communicates with lexical suggestion server 105, which may be a website such as WordNet, or a local lexical analysis tool that analyzes textual data based on the corpus of documents available within the organization, or another system that can receive terms as input. The lexical suggestion server 105 is within communication network 104, which may be the public Internet, or may be a private intranet controlled by an organization or company, or may be another network.
[0046] Tags are identified based on relevance, which is calculated using numerical scores that represent the relative likelihood that a search relates to a file based on one of several factors. When potential tags are identified, the tags may be added to a list of potential tags, or they may be applied explicitly or implicitly to the file immediately.
Additionally, the tagging system correlates new potential tags with tags that already exist on the system. When possible, tags can be reused.
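For illustration, relevance scoring and tag reuse may be sketched as below. The weighted-sum formula, the factor names, the threshold, and the case-insensitive matching used to decide that an existing tag can be reused are all assumptions of this sketch; the disclosure states only that numerical scores are used and that tags are reused when possible.

```python
def relevance(factors, weights):
    """Weighted sum of factor scores; the weights are assumed, illustrative values."""
    return sum(weights.get(name, 0.0) * value for name, value in factors.items())


def normalize(tag):
    """Canonical form used to match a candidate against tags already on the system."""
    return tag.strip().lower()


def select_tags(candidates, weights, existing_tags, threshold=0.5):
    """Keep candidates scoring at or above the threshold, reusing existing tags on a match."""
    existing = {normalize(t): t for t in existing_tags}
    selected = {}
    for tag, factors in candidates.items():
        score = relevance(factors, weights)
        if score >= threshold:
            # Reuse the spelling of a tag already on the system if one matches.
            selected[existing.get(normalize(tag), tag)] = score
    return selected
```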
[0047] In some embodiments of the invention, tagging server 102 may incorporate a social networking server, and tags may be shared among users in a social network, tags may be applied based on actions taken by other users in the social network and their relationship status with the searching user or viewing user or file owner, and tags may be implicitly applied or suggested based on tags that are explicitly or implicitly applied by other users. Further information about the social network aspects of the invention may be understood from U.S. Patent Application No. 13/457,158, "Systems and Methods for Mining Organizational Data to Form Social Networks," filed April 26, 2012, which is hereby incorporated by reference.
[0048] FIG. 2 is a flow diagram of identifying documents for suggestion in accordance with certain embodiments of the invention. Flow diagram 200 shows the following steps. At step 201, a tagging server receives a search request containing search terms. At step 202, search results are retrieved. The search results include files and/or documents. These files or documents are then tagged with the search terms at step 203. In some embodiments of the invention, the searching user is requested to confirm tagging of the search results. The tagging can be on a per-document basis, on a subset of the search results, or on the entire search results. In other embodiments of the invention, search results are automatically tagged without user intervention, thus allowing the tagging system to rapidly increase the number of tagged documents in the system.
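Step 203, with and without user confirmation, may be sketched as follows. The `confirm` callback and the dictionary used as a tag store are hypothetical constructs introduced for this illustration; passing no callback corresponds to the embodiments in which search results are tagged automatically.

```python
def tag_results(results, search_terms, tag_store, confirm=None):
    """Tag each search result with the search terms (step 203).

    confirm is an optional callable that takes a document name and returns True
    when the user accepts tagging; when it is None, tagging is automatic.
    Returns the list of documents that were tagged.
    """
    tagged = []
    for doc in results:
        if confirm is not None and not confirm(doc):
            continue  # the user rejected tagging for this document
        tag_store.setdefault(doc, set()).update(t.lower() for t in search_terms)
        tagged.append(doc)
    return tagged
```

Confirmation on a subset of the results or on the entire result set can be expressed through the same callback, for example by having it return True for every member of a user-selected set.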
[0049] At step 204, additional potential tags are identified from file metadata. This metadata may include: filename, file path, file creation date, file modification date, file owner, file creator, user who last accessed the file, title, date sent, date received, subject, file size, file type, file comments, or other metadata.
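A simplified sketch of step 204 is given below. It derives potential tags from a metadata mapping; the key names (`filename`, `path`, `owner`, `created`), the three-letter minimum word length, and the choice of which fields yield tags are assumptions made for this example.

```python
import re
from datetime import date


def tags_from_metadata(metadata):
    """Derive potential tags from a metadata mapping; the key names are assumptions."""
    tags = set()
    # Split the filename into a stem and an extension (the extension may be absent).
    stem, _, extension = metadata.get("filename", "").rpartition(".")
    text = " ".join([stem or extension, metadata.get("path", "")])
    # Alphabetic words of three or more letters from the filename and the path.
    tags.update(w.lower() for w in re.findall(r"[A-Za-z]{3,}", text))
    if stem and extension:
        tags.add(extension.lower())  # file type, e.g. "xls"
    if "owner" in metadata:
        tags.add(metadata["owner"].lower())
    created = metadata.get("created")
    if isinstance(created, date):
        tags.add(str(created.year))
    return tags
```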
[0050] At step 205, access information for the file in the search results is retrieved and used to determine and identify potential tags. This information may include file access time, file modification time, user who last accessed the file, file access history, file search history (i.e., whether the file was searched for or appeared as a result in a search), or other information. Although information originating from or pertaining to other users of the system may be used at every step of chart 200, access information pertaining to other users is particularly pertinent to identifying potential tags. At this step, access information about other files may be incorporated. For example, if another file in the same directory as the current file is noted to have been accessed, whether by the same user or by a different user, the current file may have an increased likelihood of being tagged with the directory or with one or more tags related to the other file, or with other tags.
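The directory example in step 205 might be sketched as follows. The scoring rule (one point per access to a sibling file), the function name `directory_tag_scores`, and the shape of the access log are assumptions chosen for illustration; the disclosure does not specify how the increased likelihood is computed.

```python
import posixpath
from collections import Counter


def directory_tag_scores(current_path, access_log, tags_by_path):
    """Score candidate tags for current_path from accesses to sibling files.

    access_log is a list of (user, path) events; tags_by_path maps a path to
    the tags already on that file. Each access to another file in the same
    directory adds one point to the directory name and to that file's tags.
    """
    directory = posixpath.dirname(current_path)
    scores = Counter()
    for _user, path in access_log:
        if path != current_path and posixpath.dirname(path) == directory:
            scores[posixpath.basename(directory)] += 1
            for tag in tags_by_path.get(path, ()):
                scores[tag] += 1
    return scores


log = [
    ("alice", "/proj/apollo/a.txt"),
    ("bob", "/proj/apollo/a.txt"),
    ("alice", "/proj/zeus/b.txt"),
]
scores = directory_tag_scores(
    "/proj/apollo/c.txt", log, {"/proj/apollo/a.txt": {"launch"}}
)
```

Here accesses by both the same user and a different user count equally; a weighting by user relationship would be one possible variation.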
[0051] At step 206, the file is reviewed for tags based on content. This may include a lexical analysis step of text within the file, image analysis if the file is an image, optical character recognition (OCR) of an image file, transcription of an audio recording, or analysis of other types of files.
[0052] At step 206, textual information derived from the contents of the file may be passed to a lexical analysis server, such as lexical suggestion server 105 (shown in FIG. 1). The lexical analysis server may provide additional potential tags. Potential tags are correlated with other tags on the system and identified for implicit or explicit application to the file.
[0053] At step 207, lexical analysis is performed on the search terms. This analysis is similar to the analysis performed at step 206, and may include the use of lexical suggestion server 105, as described above.
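The correlation of potential tags with tags that already exist on the system can be approximated with a short Python sketch. It uses `difflib.get_close_matches` from the Python standard library as a simple stand-in for a lexical suggestion server; the function name `correlate_tags`, the 0.8 similarity cutoff, and the sample tags are assumptions.

```python
import difflib


def correlate_tags(candidates, existing_tags, cutoff=0.8):
    """Map each candidate to a close existing tag, or keep it as a new tag."""
    result = {}
    for cand in candidates:
        norm = cand.strip().lower()
        # Reuse an existing tag when one is sufficiently similar.
        match = difflib.get_close_matches(norm, existing_tags, n=1, cutoff=cutoff)
        result[cand] = match[0] if match else norm
    return result


mapping = correlate_tags(["Budgets", "forecast"], ["budget", "finance"])
```

In this sketch a candidate such as "Budgets" is folded into the existing tag "budget", while "forecast" has no close match and becomes a new tag.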
[0054] At step 208, if user intervention is required, the tagging server solicits the information. This may be performed at a webpage, using an interactive text terminal, using a graphical user interface (GUI), using a touch interface, or using audio or other sensory feedback, as appropriate. User interaction may be accomplished at the client device. The user may indicate whether or not he wishes tags to be explicitly or implicitly applied to the file. Alternatively, this step may be performed after potential tags are identified for several or all files.
[0055] At step 209, this process is applied to the next file in the search results.
[0056] If the user has indicated that one or more tags should be applied, or if no user intervention is required, tags are applied at the tagging server, at step 210. This step may entail a local tagging server on a client device or a remote tagging server on a remote device or server device. When a local tagging server is used, it is configured to communicate with a remote tagging server in some embodiments, to synthesize, aggregate, and standardize tags across multiple users and client devices.
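The per-file loop of steps 204 to 210 can be summarized in a small Python sketch. The function `process_results` and its callback parameters (`identifiers`, `apply_tag`, `ask_user`) are hypothetical names; the sketch only shows the control flow of identifying tags, optionally soliciting the user, and applying the approved tags.

```python
def process_results(results, identifiers, apply_tag, ask_user=None):
    """Steps 204-210 as a loop: identify, optionally confirm, then apply."""
    for file_id in results:                       # step 209 advances the loop
        candidates = set()
        for identify in identifiers:              # steps 204 to 207
            candidates |= set(identify(file_id))
        if ask_user is not None:                  # step 208
            candidates = {t for t in candidates if ask_user(file_id, t)}
        for tag in sorted(candidates):            # step 210
            apply_tag(file_id, tag)


applied = []
process_results(
    ["a.txt", "b.txt"],
    identifiers=[lambda f: [f.split(".")[0]], lambda f: ["doc"]],
    apply_tag=lambda f, t: applied.append((f, t)),
    ask_user=lambda f, t: t != "doc",   # the user rejects the tag "doc"
)
```

Omitting `ask_user` corresponds to the embodiment in which no user intervention is required and all identified tags are applied.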
[0057] FIG. 3 is a block diagram of a client device in accordance with certain embodiments of the invention. Block diagram 300 shows client device 101, which includes processor 302, memory 303, document suggestion module 304, tagging service 305, local tag storage 306, and search service 307. Client device 101 is connected to tagging server 308 via interface 309 and fileserver 310 via interface 311. Interface 309 and interface 311 may be the same physical interface.
[0058] In some embodiments of the invention, client device 101 can include additional modules, fewer modules, or any other suitable combination of modules that perform any suitable operation or combination of operations. The memory 303 can be a non-transitory computer readable medium, flash memory, a magnetic disk drive, an optical drive, a programmable read-only memory (PROM), a read-only memory (ROM), or any other memory or combination of memories. The software runs on a processor 302 capable of executing computer instructions or computer code. The processor 302 might also be implemented in hardware using an application specific integrated circuit (ASIC), programmable logic array (PLA), field programmable gate array (FPGA), or any other integrated circuit.
[0059] At least interfaces 309 and 311 provide an input and/or output mechanism to communicate over a network. The interfaces 309 and 311 enable communication with servers, as well as other network nodes in the communication network. The interfaces 309 and 311 are implemented in hardware to send and receive signals in a variety of mediums, such as optical, copper, and wireless, and in a number of different protocols some of which may be non-transient. The interfaces 309 and 311 may be the same interface.
[0060] The client device 101 can include user equipment of a cellular network. The user equipment communicates with one or more radio access networks and with wired communication networks. The user equipment can be a cellular phone having phonetic communication capabilities. The user equipment can also be a smart phone providing services such as word processing, web browsing, gaming, e-book capabilities, an operating system, and a full keyboard. The user equipment can also be a tablet computer providing network access and most of the services provided by a smart phone. The user equipment operates using an operating system such as Symbian OS, Apple iOS, RIM BlackBerry OS, Windows Mobile, Linux, HP WebOS, and Android. The screen may be a touch screen that is used to input data to the mobile device, in which case the screen can be used instead of the full keyboard. The user equipment can also keep global positioning coordinates, profile information, or other location information.
[0061] The client device 101 also includes any platforms capable of computations and communication. Non-limiting examples can include televisions (TVs), video projectors, set-top boxes or set-top units, digital video recorders (DVR), computers, netbooks, laptops, and any other audio/visual equipment with computation capabilities. The client device 101 is configured with one or more processors 302 that process instructions and run software that may be stored in memory. The processor 302 also communicates with the memory and interfaces to communicate with other devices. The processor 302 can be any applicable processor such as a system-on-a-chip that combines a CPU, an application processor, and flash memory. The client device 101 can also provide a variety of user interfaces such as a keyboard, a touch screen, a trackball, a touch pad, and/or a mouse.
The client device 101 may also include speakers and a display device in some embodiments.
[0062] When searching for one or more documents using search terms, or when assigning tags manually, tagging service 305 may perform several functions. These functions include: identifying tags, identifying whether user intervention is required, analyzing files and their contents to identify tags, correlating one set of tags with another set of tags to improve retrievability and consistency, and other functions. Tagging service 305 receives requests to assign tags as well. When tags are assigned, they are stored in local tag storage 306. Storing may occur in the form of an association between a file and a tag. It may also occur in the form of associations between a file and a plurality of tags. Other associations may also be contemplated, in some embodiments. Local tag storage 306 may be synchronized with tagging server 308, periodically or on an as-needed basis or at other times. Search service 307 provides a user interface for searching for one or more files, and interfaces with tagging service 305. Search service 307 or tagging service 305 communicates with fileserver 310 to retrieve requested documents.
Document suggestion module 304, if present, is used in conjunction with search service 307 to provide document suggestions, as described in U.S. Patent Application No. 13/457,136 "Systems and Methods for Providing Data-Driven Document Suggestions," filed April 26, 2012 and hereby incorporated by reference.
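The storage of file-to-tag associations and their synchronization with a tagging server could look roughly like the following Python sketch. The class `LocalTagStorage`, its in-memory dictionaries, and the `send` callback that stands in for the network call to the tagging server are all hypothetical; the disclosure does not specify a storage layout or a synchronization protocol.

```python
from collections import defaultdict


class LocalTagStorage:
    """Hypothetical in-memory analogue of local tag storage 306."""

    def __init__(self):
        self._tags = defaultdict(set)   # file path -> set of tags
        self._pending = []              # associations not yet sent to the server

    def associate(self, path, *tags):
        """Store an association between one file and one or more tags."""
        for tag in tags:
            if tag not in self._tags[path]:
                self._tags[path].add(tag)
                self._pending.append((path, tag))

    def tags_for(self, path):
        """Return a copy of the tags associated with a file."""
        return set(self._tags.get(path, ()))

    def synchronize(self, send):
        """Push pending associations through send, then clear the queue."""
        sent = list(self._pending)
        for path, tag in sent:
            send(path, tag)
        self._pending.clear()
        return len(sent)


store = LocalTagStorage()
store.associate("/docs/plan.txt", "budget", "forecast")
store.associate("/docs/plan.txt", "budget")       # duplicate, ignored
received = []
count = store.synchronize(lambda p, t: received.append((p, t)))
```

Queuing only new associations keeps a periodic or as-needed synchronization small, which is one possible design choice among many.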
[0063] FIG. 4 is a block diagram of a server device, in accordance with some embodiments. Server device 102 includes processor 402, memory 403, multi-user tagging service 405, multi-user tag storage 406, search service 407, and document suggestion module 408. Server 102 communicates with client device 101 (not shown) via interface 404. Server 102 may communicate with file server 103 via interface 410. Server 102 may communicate with intranet 411 via interface 412. Server 102 may communicate with Internet 413 via interface 414.
[0064] As described for block diagram 400, a multi-user tagging service 405, a multi-user tag storage 406, and a search service 407 are provided. The operation of these modules is similar to the operation of the analogous modules in block diagram 300, but their function is performed across multiple users and is performed on any and all files available to server 102, which may be a superset of the files available to each local device. Multi-user tagging service 405 thus corresponds to tagging service 305 and provides tagging services for one or more users and uses tags that are used by all users; multi-user tag storage 406 corresponds to local tag storage 306 and stores tags used by all users; and search service 407 corresponds to search service 307 and provides search for documents stored on behalf of all users or that are made accessible on the network.
Additionally, tags may be requested from multi-user tag service 405 by client devices 101, in order to provide consistent tags throughout an organization consisting of many client devices that provide and include tagging services. Additionally, more resources for identifying tags, particularly based on lexical analysis, may be available on intranet 411 and Internet 413, as described elsewhere herein. Document suggestion module 408 is also optionally provided for document suggestions.
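One way a shared service could give client devices consistent tags is sketched below in Python. The class `MultiUserTagService`, the normalization rule (ignoring case and punctuation), and the first-spelling-wins policy are assumptions made for illustration only.

```python
from collections import Counter


class MultiUserTagService:
    """Hypothetical sketch of a shared service in the role of service 405."""

    def __init__(self):
        self._usage = Counter()   # canonical tag -> number of requests
        self._canonical = {}      # normalized key -> canonical spelling

    @staticmethod
    def _key(tag):
        # Assumed normalization: case, spaces, and punctuation are ignored.
        return "".join(ch for ch in tag.lower() if ch.isalnum())

    def request_tag(self, proposed):
        """Return the organization-wide spelling for a proposed tag."""
        canonical = self._canonical.setdefault(self._key(proposed), proposed)
        self._usage[canonical] += 1
        return canonical


service = MultiUserTagService()
first = service.request_tag("Q3-Budget")    # first client defines the spelling
second = service.request_tag("q3 budget")   # second client receives the same tag
third = service.request_tag("roadmap")
```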
[0065] Processor 402 performs processing for one or more modules as disclosed in this specification. Memory 404 provides temporary storage of data as required by the processor 402. The memory 404 can be a non-transitory computer readable medium, flash memory, a magnetic disk drive, an optical drive, a programmable read-only memory (PROM), a read-only memory (ROM), or any other memory or combination of memories.
The software runs on a processor 402 capable of executing computer instructions or computer code. The processor 402 may also be implemented in hardware using an application specific integrated circuit (ASIC), programmable logic array (PLA), field programmable gate array (FPGA), or any other integrated circuit.
[0066] Although processor 402 performs each of the functions described in the flow diagram of FIG. 2, multiple sub-modules may exist within either the software or hardware of server 102 that provide supporting functionality.
[0067] The server 102 can operate using an operating system (OS) software. In some embodiments, the OS software is based on a Linux software kernel and runs specific applications in the server such as monitoring tasks and providing protocol stacks. The OS software allows server resources to be allocated separately for control and data paths. For example, certain packet accelerator cards and packet services cards are dedicated to performing routing or security control functions, while other packet accelerator cards/packet services cards are dedicated to processing user session traffic. As network requirements change, hardware resources can be dynamically deployed to meet the requirements in some embodiments.
[0068] The server's software can be divided into a series of tasks that perform specific functions. These tasks communicate with each other as needed to share control and data information throughout the server 102. A task can be a software process that performs a specific function related to system control or session processing. Three types of tasks operate within the server 102 in some embodiments: critical tasks, controller tasks, and manager tasks. The critical tasks control functions that relate to the server's ability to process calls such as server initialization, error detection, and recovery tasks.
The controller tasks can mask the distributed nature of the software from the user and perform tasks such as monitoring the state of subordinate manager(s), providing for intra-manager communication within the same subsystem, and enabling inter-subsystem communication by communicating with controller(s) belonging to other subsystems. The manager tasks can control system resources and maintain logical mappings between system resources.
[0069] Individual tasks that run on processors in the application cards can be divided into subsystems. A subsystem is a software element that either performs a specific task or is a culmination of multiple other tasks. A single subsystem includes critical tasks, controller tasks, and manager tasks. Some of the subsystems that run on the server 102 include a system initiation task subsystem, a high availability task subsystem, a shared configuration task subsystem, and a resource management subsystem.
[0070] The system initiation task subsystem is responsible for starting a set of initial tasks at system startup and providing individual tasks as needed. A high availability task subsystem works in conjunction with the recovery control task subsystem to maintain the operational state of the server 102 by monitoring the various software and hardware components of the server 102. A recovery control task subsystem is responsible for executing a recovery action for failures that occur in the server 102 and receives recovery actions from the high availability task subsystem. Processing tasks are distributed into multiple instances running in parallel so if an unrecoverable software fault occurs, the entire processing capabilities for that task are not lost. User session processes can be sub-grouped into collections of sessions so that if a problem is encountered in one sub-group users in another sub-group will not be affected by that problem.
[0071] Shared configuration task subsystem can provide the server 102 with an ability to set, retrieve, and receive notification of server configuration parameter changes and is responsible for storing configuration data for the applications running within the server 102. A resource management subsystem is responsible for assigning resources (e.g., processor and memory capabilities) to tasks and for monitoring the task's use of the resources.
[0072] In some embodiments, the server 102 can reside in a data center and form a node in a cloud computing infrastructure. The server 102 can also provide services on demand. A module hosting a client is capable of migrating from one server to another server seamlessly, without causing program faults or system breakdown. The server 102 in the cloud can be managed using a management system.
[0073] It is to be understood that the disclosed subject matter is not limited in its application to the details of construction and to the arrangements of the components set forth in the following description or illustrated in the drawings. The disclosed subject matter is capable of other embodiments and of being practiced and carried out in various ways. For example, while this disclosure discusses search in detail, other methods for retrieving documents may also provide embodiments that are in accordance with the invention, such as retrieval via browsing, retrieval using a hierarchical file structure, retrieval using a tag cloud, etc. Also, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting.
[0074] As such, those skilled in the art will appreciate that the conception, upon which this disclosure is based, may readily be utilized as a basis for the designing of other structures, methods, and systems for carrying out the several purposes of the disclosed subject matter. It is important, therefore, that the claims be regarded as including such equivalent constructions insofar as they do not depart from the spirit and scope of the disclosed subject matter.
[0075] Although the disclosed subject matter has been described and illustrated in the foregoing exemplary embodiments, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the details of implementation of the disclosed subject matter may be made without departing from the spirit and scope of the disclosed subject matter, which is limited only by the claims which follow.

Claims (24)

  1. 1. A method for automatically associating tags with files in a computer system, the method comprising: receiving a search request from a user containing a search keyword; retrieving results including one or more files responsive to the search request for presentation to the user; receiving file information and access information about the one or more files, wherein the access information indicates whether the one or more files has been previously accessed by the user; selecting at least one eligible file from the one or more files based on at least one of the access information and the file information; identifying at least one tag based on at least one of the search keyword, the access information, and the file information; tagging the eligible file with the tag by associating the tag with the eligible file; and storing the association of the tag with the eligible file.
  2. 2. The method of claim 1, wherein the access information indicates whether the user has previously opened, copied, modified, or shared the one or more files.
  3. 3. The method of claim 1 or 2, wherein the tag comprises at least one of the search keyword and a related term derived from the search keyword.
  4. 4. The method of any of claims 1 to 3, wherein the file information comprises one of a filename and a storage location.
  5. 5. The method of any preceding claim, wherein the one or more files has been previously accessed by the user, and wherein the file information comprises a file location that is similar to that of the one or more files.
  6. 6. The method of any preceding claim, further comprising retrieving the results including the one or more files responsive to the search request from a file system controlled by a second user for presentation to the user.
  7. 7. The method of any preceding claim, wherein identifying the at least one tag is performed by evaluating numeric scores representing relevance.
  8. 8. A system for providing document tagging in a communications network, the system comprising: one or more interfaces configured to provide communication with a server via communication network; and a processor, in communication with the one or more interfaces, and configured to run a module stored in memory that is configured to: receive a search request from a user containing a search keyword; retrieve results including one or more files responsive to the search request for presentation to the user; receive file information and access information about the one or more files, wherein the access information indicates whether the one or more files has been previously accessed by the user; select at least one eligible file from the one or more files based on at least one of the access information and the file information; identify at least one tag based on at least one of the search keyword, the access information, and the file information; tag the eligible file with the tag by associating the tag with the eligible file; and store the association of the tag with the eligible file.
  9. 9. The system of claim 8, wherein the access information indicates whether the user has previously opened, copied, modified, or shared the one or more files.
  10. 10. The system of claim 8 or 9, wherein the tag comprises at least one of the search keyword and a related term derived from the search keyword.
  11. 11. The system of claim 8, 9 or 10, wherein the file information comprises one of a filename and a storage location.
  12. 12. The system of any of claims 8 to 11, wherein the one or more files has been previously accessed by the user, and wherein the file information comprises a file location that is similar to that of the one or more files.
  13. 13. The system of any of claims 8 to 12, wherein the processor is configured to retrieve the results including the one or more files responsive to the search request from a file system controlled by a second user for presentation to the user.
  14. 14. The system of any of claims 8 to 13, wherein the processor is configured to identify the at least one tag by evaluating numeric scores representing relevance.
  15. 15. A computer-readable medium having executable instructions operable to cause a device to: receive a search request from a user containing a search keyword; retrieve results including one or more files responsive to the search request for presentation to the user; receive file information and access information about the one or more files, wherein the access information indicates whether the one or more files has been previously accessed by the user; select at least one eligible file from the one or more files based on at least one of the access information and the file information; identify at least one tag based on at least one of the search keyword, the access information, and the file information; tag the eligible file with the tag by associating the tag with the eligible file; and store the association of the tag with the eligible file.
  16. 16. The medium of claim 15, wherein the access information indicates whether the user has previously opened, copied, modified, or shared the one or more files.
  17. 17. The medium of claim 15 or 16, wherein the tag comprises at least one of the search keyword and a related term derived from the search keyword.
  18. 18. The medium of claim 15, 16 or 17, wherein the file information comprises one of a filename and a storage location.
  19. 19. The medium of any of claims 15 to 18, wherein the one or more files has been previously accessed by the user, and wherein the file information comprises a file location that is similar to that of the one or more files.
  20. 20. The medium of any of claims 15 to 19, wherein the executable instructions are operable to cause the device to retrieve the results including the one or more files responsive to the search request from a file system controlled by a second user for presentation to the user.
  21. 21. Computer software which, when executed by a computer, is arranged to perform a method according to any of claims 1 to 7.
  22. 22. A computer readable medium substantially as described hereinbefore with reference to the accompanying drawings.
  23. 23. An apparatus substantially as described hereinbefore with reference to the accompanying drawings.
  24. 24. A method substantially as described hereinbefore with reference to the accompanying drawings.
GB1307488.5A 2012-04-26 2013-04-25 Automatically associating tags with files in a computer system using search keywords. Withdrawn GB2503549A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/457,150 US20130290323A1 (en) 2012-04-26 2012-04-26 Systems and methods for automatically associating tags with files in a computer system

Publications (2)

Publication Number Publication Date
GB201307488D0 GB201307488D0 (en) 2013-06-12
GB2503549A true GB2503549A (en) 2014-01-01

Family

ID=48626829

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1307488.5A Withdrawn GB2503549A (en) 2012-04-26 2013-04-25 Automatically associating tags with files in a computer system using search keywords.

Country Status (2)

Country Link
US (1) US20130290323A1 (en)
GB (1) GB2503549A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170167559A1 (en) * 2015-12-11 2017-06-15 Hyundai Motor Company Structure of mounting bracket

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013185329A1 (en) * 2012-06-14 2013-12-19 Nokia Corporation Method and apparatus for associating interest tags with media items based on social diffusions among users
US20140280188A1 (en) * 2013-03-15 2014-09-18 Perforce Software, Inc. System And Method For Tagging Filenames To Support Association Of Information
US9870422B2 (en) * 2013-04-19 2018-01-16 Dropbox, Inc. Natural language search
US11238056B2 (en) * 2013-10-28 2022-02-01 Microsoft Technology Licensing, Llc Enhancing search results with social labels
WO2015078754A1 (en) * 2013-11-29 2015-06-04 Koninklijke Philips N.V. Document management system for a medical task
US11645289B2 (en) 2014-02-04 2023-05-09 Microsoft Technology Licensing, Llc Ranking enterprise graph queries
US9870432B2 (en) 2014-02-24 2018-01-16 Microsoft Technology Licensing, Llc Persisted enterprise graph queries
US11657060B2 (en) 2014-02-27 2023-05-23 Microsoft Technology Licensing, Llc Utilizing interactivity signals to generate relationships and promote content
US10757201B2 (en) 2014-03-01 2020-08-25 Microsoft Technology Licensing, Llc Document and content feed
US10394827B2 (en) 2014-03-03 2019-08-27 Microsoft Technology Licensing, Llc Discovering enterprise content based on implicit and explicit signals
US10255563B2 (en) 2014-03-03 2019-04-09 Microsoft Technology Licensing, Llc Aggregating enterprise graph content around user-generated topics
US10061826B2 (en) 2014-09-05 2018-08-28 Microsoft Technology Licensing, Llc. Distant content discovery
KR101611388B1 (en) * 2015-02-04 2016-04-11 네이버 주식회사 System and method to providing search service using tags
CN109155152B (en) * 2016-05-16 2023-12-12 皇家飞利浦有限公司 Clinical report retrieval and/or comparison
CN106648592A (en) * 2016-09-29 2017-05-10 郑州云海信息技术有限公司 Method for rapidly retrieving target files under linux
CN107577777A (en) * 2017-09-12 2018-01-12 北京奇艺世纪科技有限公司 A kind of file reference method, apparatus and electronic equipment
US10866963B2 (en) 2017-12-28 2020-12-15 Dropbox, Inc. File system authentication
US11481377B2 (en) * 2018-10-30 2022-10-25 Microsoft Technology Licensing, Llc Compute-efficient effective tag determination for data assets

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110295593A1 (en) * 2010-05-28 2011-12-01 Yahoo! Inc. Automated message attachment labeling using feature selection in message content
US8386509B1 (en) * 2006-06-30 2013-02-26 Amazon Technologies, Inc. Method and system for associating search keywords with interest spaces

Also Published As

Publication number Publication date
US20130290323A1 (en) 2013-10-31
GB201307488D0 (en) 2013-06-12

Similar Documents

Publication Publication Date Title
US20130290323A1 (en) Systems and methods for automatically associating tags with files in a computer system
US11163777B2 (en) Smart content recommendations for content authors
US9288285B2 (en) Recommending content in a client-server environment
US20190005025A1 (en) Performing semantic graph search
US20130290347A1 (en) Systems and methods for providing data-driven document suggestions
US20190259040A1 (en) Information aggregator and analytic monitoring system and method
US10235476B2 (en) Matching objects using match rules and lookup key
US9129009B2 (en) Related links
JP2022184964A (en) Systems and methods for direct in-browser markup of elements in internet content
US11765176B2 (en) Method, apparatus, and computer program product for managing access permissions for a searchable enterprise platform
US20130166678A1 (en) Smart Suggestions Engine for Mobile Devices
US9639627B2 (en) Method to search a task-based web interaction
US10635725B2 (en) Providing app store search results
US20130346405A1 (en) Systems and methods for managing data items using structured tags
RU2743932C2 (en) Method and server for repeated training of machine learning algorithm
US10725618B2 (en) Populating contact information
US10108610B1 (en) Incremental and preemptive machine translation
US10108611B1 (en) Preemptive machine translation
US9692804B2 (en) Method of and system for determining creation time of a web resource
US11176312B2 (en) Managing content of an online information system
US10664332B2 (en) Application programming interfaces for identifying, using, and managing trusted sources in online and networked content
US9251125B2 (en) Managing text in documents based on a log of research corresponding to the text
US20200089714A1 (en) Method and server for indexing web page in index
US20130290372A1 (en) Systems and methods for associating tags with files in a computer system
US11461422B2 (en) Page personalization

Legal Events

Date Code Title Description
732E Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977)

Free format text: REGISTERED BETWEEN 20160602 AND 20160608

732E Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977)

Free format text: REGISTERED BETWEEN 20190523 AND 20190529

WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)