PERSON SEARCH UTILIZING ENTITY EXPANSION
BACKGROUND
[0001] Locating content regarding a specific person on the Internet can be challenging. There are many factors that make "people search" difficult: most names are not unique. In any given area there may be several individuals with the same name. Additionally, the web presence of any given person may be low such that search results for that person will be dominated by results referring to a better known individual with the same name.
SUMMARY
[0002] The following Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. The Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
[0003] According to aspects of the disclosed subject matter, a search query is received from a computer user, the search query identifying a person for which content (or references to content) is sought. Upon receiving the search query from a computer user, related entity data is obtained from at least one related entity source for the identified person. Related entity data comprises at least one of a related entity (or entities) or a category associated with the identified person. An expanded search query is generated according to the search query from the computer user and the related entity data. Search results are obtained according to the expanded search query and a search results presentation is generated and returned to the computer user in response to the search query.
[0004] According to further aspects of the disclosed subject matter, a computer-readable medium bearing computer-executable instructions is presented. When executed on a computing system comprising at least a processor executing the instructions retrieved from the medium, the computing system is configured to carry out a method for responding to a search query from a user. More particularly, in response to receiving a search query from a computer user, where the search query identifies a person for which content (or references to content) is sought, related entity data is obtained from at least one related entity source for the identified person. An expanded search query is generated according to the search query from the computer user and the related entity data. Search results are obtained according to the expanded search query and a search results presentation is generated and returned to the computer user in response to the search query.
[0005] According to still further aspects of the disclosed subject matter, a computer system for responding to a search query for content related to a person is presented. The computer system comprises a processor and a memory, wherein the processor executes instructions stored in the memory as part of or in conjunction with additional components to respond to a search query for content related to a person. These additional components include (by way of illustration and not limitation) a query topic identification component, a related entity retrieval component, an expanded query generator, a search results retrieval component, and a search results presentation generator. In operation, the query topic identification component is configured to determine the identity of a person from the search query for which related content is sought. The related entity retrieval component obtains related entity data corresponding to the identified person from a related entity source. After obtaining related entity data, the expanded query generator generates an expanded query from the search query for content related to the identified person and from the related entity data. According to various embodiments, the related entity data comprises at least one of a related entity or a category associated with the identified person of the search query. The search results retrieval component obtains search results from a content store according to the expanded search query. Thereafter, the search results presentation generator generates a search results presentation according to the search results referencing content corresponding to the identified person and returns the search results presentation to the computer user.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] The foregoing aspects and many of the attendant advantages of the disclosed subject matter will become more readily appreciated as they are better understood by reference to the following description when taken in conjunction with the following drawings, wherein:
[0007] Figure 1 is a block diagram of a networked environment suitable for implementing aspects of the disclosed subject matter;
[0008] Figure 2 is a flow diagram illustrating an exemplary routine for providing improved results in response to a search query regarding content for a particular person through query expansion;
[0009] Figure 3 is a flow diagram illustrating an exemplary routine for generating an expanded search query according to aspects of the disclosed subject matter;
[0010] Figures 4 and 5 illustrate elements of expanded search queries; and
[0011] Figure 6 is a block diagram illustrating exemplary components of a search engine configured to provide improved results in response to a search query from a computer user.
DETAILED DESCRIPTION
[0012] For purposes of clarity, the use of the term "exemplary" in this document should be interpreted as serving as an illustration or example of something, and it should not be interpreted as an ideal and/or a leading illustration of that thing. An entity corresponds to an abstract or tangible thing that includes, by way of illustration and not limitation: a person, a place, a group, a concept, an activity, and the like.
[0013] Turning to Figure 1, Figure 1 is a block diagram illustrating an exemplary networked environment 100 suitable for implementing aspects of the disclosed subject matter, particularly in regard to providing improved search results to a computer user in response to a search query regarding a person. The exemplary networked environment 100 includes one or more user computers, such as user computers 102-106, connected to a network 108, such as the Internet, a wide area network or WAN, and the like. User computers include, by way of illustration and not limitation: desktop computers (such as desktop computer 104); laptop computers (such as laptop computer 102); tablet computers (such as tablet computer 106); mobile devices (not shown); game consoles (not shown); personal digital assistants (not shown); and the like. User computers may be configured to connect to the network 108 by way of wired and/or wireless connections. For purposes of illustration only, the exemplary networked environment 100 illustrates the network 108 as being located between the user computers 102-106 and the search engine 110, and again between the search engine 110 and the network sites 112-116. This illustration, however, should not be construed as suggesting that these are separate networks.
[0014] Also connected to the network 108 are various networked sites, including network sites 110-116. By way of example and not limitation, the networked sites connected to the network 108 include a search engine 110 configured to respond to search queries from computer users, news sources 112 and 114 which host various news articles and content, a social networking site 116, and the like. A computer user, such as computer user 101, may navigate via a user computer, such as user computer 102, to these and other networked sites to access content, including news content.
[0015] According to aspects of the disclosed subject matter, the search engine 110 is configured to provide search results (typically in the form of references to content available on the network 108) in response to a search query from a computer user. In particular, in response to receiving a search query from a computer user for information regarding a particular person, the search engine 110 identifies content related to the identified person according to information in its content store, generates a search results presentation based on at least some of the identified content, and provides the search results presentation to the computer user.
[0016] Figure 1 also illustratively includes a social network site 116 and various news sources, including news sites 112-114. As will be readily appreciated, a social network site 116 is an online site/service that provides a platform in which a computer user can establish a profile describing various aspects of the user, build relationships and social networks with other computer users, groups, and the like. In a social network site 116, a computer user can establish or indicate various interests, activities, and backgrounds with those in his/her social network. Indeed, those skilled in the art will appreciate that a computer user is often able to indicate a preference or an interest in a particular entity on a social networking service as might be hosted by social networking site 116, whether that entity is a person, a place, a group, a concept, an activity, and the like. Though only one social network site 116 is included in the illustrative network environment 100, this is merely illustrative and should not be viewed as limiting upon the disclosed subject matter. In an actual embodiment, there may be any number of social network sites connected to the network 108.
[0017] As is known in the art, the search engine 110 is configured to communicate (directly or indirectly through service calls and/or web crawlers) with multiple content sources, including news sites 112 and 114, social networking site 116, and other sites such as blogs and registries (not shown) to obtain information regarding the content that is available at each network site. Information regarding available content may also be pushed to the search engine from various services and/or networking sites. This information is stored (typically as references to the content) in a content store such that the search engine can obtain content from this content store in order to respond to a search query from a computer user, such as computer user 101. The search engine 110 may also obtain information regarding any given individual from search query logs, network browsing histories, purchase histories, and the like. This information and the content obtained from the various network sites is typically indexed according to key words and phrases such that the information may be quickly identified and accessed. Further, in addition to information that is stored in the search engine's content store, a search engine 110 may also be configured to obtain information from other network sites when responding to a search query. For example, according to aspects of the disclosed subject matter, when responding to a search query, the search engine 110 may obtain data from one or more social networking sites, such as social network site 116, as relevant information to return to the requesting computer user and/or as information to assist the search engine in identifying relevant information to return to the requesting computer user.
[0018] To further illustrate aspects of the disclosed subject matter, reference is now made to Figure 2. Figure 2 is a flow diagram of an exemplary routine for providing improved results in response to a search query regarding content corresponding to a particular person through query expansion. Beginning at block 202, the search engine 110 receives a search query from a computer user, such as computer user 101, the search query requesting content corresponding to a particular person.
[0019] As will be readily appreciated, a search query is typically (though not exclusively) a text string. For example, a search query for content relating to a person may be "Bruce Wayne". Accordingly, as there may be several individuals who have the same name, at block 204, the search engine attempts to uniquely identify the person who is the subject matter of the search query. According to aspects of the disclosed subject matter, the search engine attempts to uniquely identify the person for which content is requested according to at least general information and specific information relating to the requesting computer user. The general information includes, by way of illustration and not limitation: popularity of search queries corresponding to a person with the name identified in the search query; trending popularity of a person with the name identified in the search query; other terms and/or phrases in the search query (e.g., "Bruce Wayne Seattle" or "Bruce Wayne Microsoft"); an image representative of the person; and the like. Specific information relating to the requesting computer user may include, by way of illustration and not limitation: current location; prior search query history; current and former workplaces; current and former educational institutions that were attended; social networks; preferences (both explicitly and implicitly identified); general graph connectivity between the requesting computer user and potential subjects of a search query as well as the number of mutual friends; physical distance between the requesting user and the potential subjects; location of friends; former locations; and the like. Typically, though not exclusively, the search engine 110 may, at least internally, associate a globally unique identifier with the person who is the subject matter of the search query. Moreover, once the person who is the subject matter of the search query is identified, the search engine 110 may use the associated globally unique identifier in obtaining, or reranking, search results in response to the search query.
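The identification attempted at block 204 can be illustrated with a minimal sketch. The signal names, the weights, and the linear scoring below are assumptions made for illustration only; the disclosure does not specify how general and user-specific signals are combined.

```python
from dataclasses import dataclass

# Hypothetical weights; how signals are actually combined is not specified.
WEIGHTS = {
    "popularity": 1.0,        # general signal
    "query_term_match": 2.0,  # general signal (other terms in the query)
    "mutual_friends": 0.5,    # user-specific signal
    "same_city": 1.5,         # user-specific signal
}

@dataclass
class Candidate:
    person_id: str         # globally unique identifier
    name: str
    popularity: float      # assumed to be normalized to 0..1
    attributes: frozenset  # lowercased terms such as employer or city
    mutual_friends: int
    city: str

def score(candidate, extra_terms, user_city):
    """Combine general and user-specific signals into one illustrative score."""
    matched = sum(1 for t in extra_terms if t.lower() in candidate.attributes)
    return (WEIGHTS["popularity"] * candidate.popularity
            + WEIGHTS["query_term_match"] * matched
            + WEIGHTS["mutual_friends"] * candidate.mutual_friends
            + WEIGHTS["same_city"] * (candidate.city == user_city))

def identify_person(candidates, extra_terms, user_city):
    """Return the unique identifier of the best-scoring candidate."""
    best = max(candidates, key=lambda c: score(c, extra_terms, user_city))
    return best.person_id
```

Under these assumed weights, a query such as "Bruce Wayne Microsoft" from a user in Seattle would favor a lesser-known local candidate over a better-known person of the same name.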
[0020] Of course, the order presented in blocks 202 and 204 should be viewed as illustrative and not limiting upon the disclosed subject matter. Under various conditions, the identity of a person for whom content is sought may be known prior to submitting/receiving a search request. For example, auto-suggest search recommendations may indicate a particular person as one of the auto-suggestions and, typically, that suggested person's unique identity is known. Alternatively, another service may submit a search request for a person that uniquely identifies the person to the search service such that the identity of the person need not be determined. Accordingly, while one embodiment is disclosed in regard to blocks 202 and 204 of Figure 2, this is illustrative of one embodiment, and is not limiting upon the disclosed subject matter.
[0021] In regard to the search request identifying a person for whom content is sought, there may also be times in which the name of that person is not known but some information is provided that may lead to uniquely identifying that person. For example, the computer user may not know the name of the general manager of the Seattle Seahawks, but in submitting the text "general manager of the Seattle Seahawks" the computer user often sufficiently identifies the person for whom content is sought that, in block 204, the identity of the person can be determined.
[0022] At block 206, after having identified the person who is the subject matter of the search query, the search engine 110 obtains related entity data corresponding to the identified person. According to aspects of the disclosed subject matter, related entity data includes entities related to the identified person. A related entity is an entity with which the identified person is related for some reason. While some of the reasons may be known, others may be unknown and implied according to statistical similarities. For example, assume that the identified person is an employee of Company A and is a member of Workgroup Z. Related entities to the identified person, based on this employment relationship, would typically include "Company A" and "Workgroup Z". Other related entities arising from this same employment relationship may include fellow co-workers. Still other entities, based on this same employment relationship, may also include other (previous) workgroups, past and present co-workers, and the like. In furtherance of the example above, the identified person may also be an alumnus of a particular university. Hence, the university may be a related entity to the identified person, as well as the particular college in the university where the identified person studied, the degree that was awarded, academic achievements of the identified person, fellow students, and the like. Still further, assuming that the identified person also has a passion for gardening, the identified person may be a member of a local master gardeners society and, as a result, the local master gardeners society may be a related entity to the identified person as well as fellow members of the society.
[0023] According to aspects of the disclosed subject matter, the search engine 110 obtains related entity data from one or more related entity sources. The search engine 110 may host or store various information regarding the identified person in a user profile store (e.g., the user profile store 628 of Figure 6) and, therefore, be one of the related entity sources. For example, the search engine 110 may store user profile information corresponding to the computer user. This user profile information may be based on explicitly identified information (from the identified person) as well as implicitly identified information (such as information derived from search queries, browsing history, and the like.) Social networking sites, such as social networking site 116, represent additional related entity sources. As indicated above, a social networking site enables a person, such as the identified person of the search query, to establish relationships and social networks with other entities (including people, organizations, activities, causes, and the like.) Of course, there may be a variety of related entity sources, each of which hosts information that may indicate a relationship between the identified person and other entities, and the search engine 110 can be configured to obtain related entity data from any number of these related entity sources.
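As a minimal sketch of obtaining related entity data from several related entity sources, the following merges the results of each source and drops duplicates. The source functions are hypothetical in-memory stand-ins for a user profile store and a social networking site, and the (relationship, entity) tuple shape is an assumption rather than a format defined in this disclosure.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A related entity is modeled as (relationship, entity name); illustrative only.
RelatedEntity = Tuple[str, str]
Source = Callable[[str], Iterable[RelatedEntity]]

def collect_related_entities(person_id: str,
                             sources: Dict[str, Source]) -> List[RelatedEntity]:
    """Query every related entity source and merge results, dropping duplicates."""
    seen = set()
    merged = []
    for fetch in sources.values():
        for entity in fetch(person_id):
            if entity not in seen:
                seen.add(entity)
                merged.append(entity)
    return merged

# Hypothetical stand-ins for real related entity sources.
def profile_store(person_id):
    return [("employer", "Company A"), ("work group", "Workgroup Z")]

def social_site(person_id):
    return [("employer", "Company A"), ("member of", "Master Gardeners Society")]

related = collect_related_entities(
    "person-1", {"profile": profile_store, "social": social_site})
```

Here "Company A" is reported by both sources but appears only once in the merged list.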
[0024] It should be appreciated that the related entity information that is hosted by each of the related entity sources may comprise information that the identified person wishes to keep private. To resolve this, according to aspects of the disclosed subject matter, the search engine identifies the requesting computer user and, if identified, can attempt to use the permissions afforded to the requesting computer user in obtaining the related entity information. In various embodiments, a computer user is required to authenticate himself or herself in order to access information regarding the identified person. Other requirements may include, by way of illustration and not limitation, that the requesting computer user be logged into one or more services in order to access and/or view content that would otherwise be restricted.
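One way to picture the use of the requesting computer user's permissions is the sketch below. The three visibility levels and the friend-set check are assumptions for illustration; actual related entity sources may enforce access in entirely different ways.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass(frozen=True)
class EntityRecord:
    entity: str
    visibility: str  # hypothetical levels: "public", "friends", "private"

def visible_entities(records: List[EntityRecord],
                     requester_id: Optional[str],
                     friends_of_person: Set[str]) -> List[str]:
    """Return only the related entities the requesting user may see."""
    allowed = {"public"}
    # An unauthenticated requester (None) only ever sees public data.
    if requester_id is not None and requester_id in friends_of_person:
        allowed.add("friends")
    return [r.entity for r in records if r.visibility in allowed]
```

In this sketch, entities marked "private" are never used for query expansion, regardless of who is asking.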
[0025] As suggested in regard to the examples above, a related entity source may associate one or more categories to an individual (such as the identified person of a search query). Accordingly, the related entity data obtained from the related entity sources may also include category data. Category data (both in regard to the set of potential relationships defined by the category as well as the actual relationships of a person per a category) may be advantageously used in expanding a received search query (as discussed in greater detail below.) In the example above, a related entity source may have associated various categories with the identified person including "Employee", "Alumnus", and "Gardener". Moreover, each of the related entity sources may maintain category information that defines what is meant to be associated with the category. This category information often includes a list of potential, though not necessarily required, relationships that may exist between a first entity belonging to a specific category (such as the identified person) and other entities. The "Employee" category may define a set of potential relationships as including "employer", "work group", "current manager", "direct reports", "co-worker" and the like. Correspondingly, each entity that is categorized as an "Employee" could then have relationships with other entities as defined by the set of potential relationships. Of course, while a category defines a set of potential relationships, an entity of that category is not required to be related to other entities based on each and every potential relationship. Further still, a given entity, such as an entity corresponding to the identified person of a search query, may be associated with a plurality of categories. In addition to defined categories, categories may also be inferred. For example, an employee may be interested in former work performed previously at a company such that an inferred category is "co-worker".
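The distinction between the potential relationships that a category defines and the actual relationships of a given person can be sketched as follows. The "Employee" relationship list comes from the description above; the "Alumnus" list and the sample person data (including "State University") are hypothetical.

```python
# Potential relationships per category, as a related entity source might define them.
CATEGORY_RELATIONSHIPS = {
    "Employee": ["employer", "work group", "current manager",
                 "direct reports", "co-worker"],
    "Alumnus": ["university", "college", "degree", "fellow student"],  # assumed
}

# Actual relationships of one person (hypothetical data).
# Not every potential relationship needs to be populated.
person_relationships = {
    "Employee": {"employer": ["Company A"], "work group": ["Workgroup Z"]},
    "Alumnus": {"university": ["State University"]},
}

def category_data(person_rels, definitions):
    """Split each category's potential relationships into populated and unpopulated."""
    result = {}
    for category, actual in person_rels.items():
        potential = definitions.get(category, [])
        result[category] = {
            "populated": {rel: actual[rel] for rel in potential if rel in actual},
            "unpopulated": [rel for rel in potential if rel not in actual],
        }
    return result

data = category_data(person_relationships, CATEGORY_RELATIONSHIPS)
```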
[0026] At block 208, a search model is identified/determined to apply to the expanded search query. This search model includes information for weighting various elements (terms and phrases) of the expanded search query to improve search results. Applying a search model to the expanded search query recognizes, at least in part, that not all query terms of the expanded search query are equal, i.e., some query terms are more important in identifying relevant search content for the identified person than others. Typically, though not exclusively, favoring/weighting employment-related query terms or education-related query terms provides improved search results when the relevancy of the various search results (or, more accurately stated, the content referenced by the search results) are presented to a particular user. According to various embodiments, selection of a search model may be based on information regarding the requesting computer user. For example, if it is known that the requesting computer user is in college then an education model may be selected. Alternatively, selection of a search model may be made according to information regarding the identified person, from information available to the search engine 110 or external sources including from the related entity data. In yet additional embodiments, selection of a search model may be made according to information regarding both the requesting computer user as well as the identified person of the search query.
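A minimal sketch of search model selection at block 208 follows. The model names, the weight values, and the precedence (requester information first, then the identified person's categories) are assumptions for illustration; the disclosure leaves the selection policy open.

```python
# Hypothetical search models: per-category weights applied to expanded query terms.
SEARCH_MODELS = {
    "education": {"Alumnus": 2.0, "Employee": 1.0},
    "employment": {"Employee": 2.0, "Alumnus": 1.0},
    "default": {},
}

def select_search_model(user_profile: dict, person_categories: set) -> str:
    """Pick a model from requester information first, then from the identified person."""
    if user_profile.get("in_college"):
        return "education"
    if "Employee" in person_categories:
        return "employment"
    if "Alumnus" in person_categories:
        return "education"
    return "default"

def term_weight(model_name: str, category: str) -> float:
    """Weight for query terms derived from the given category under the given model."""
    # Categories the model does not mention get a neutral weight of 1.0.
    return SEARCH_MODELS[model_name].get(category, 1.0)
```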
[0027] At block 210, an expanded search query is generated according to the determined search model for the identified person. Generating an expanded search query is discussed in greater detail in regard to Figure 3. More particularly, Figure 3 is a flow diagram illustrating an exemplary routine 300 for generating an expanded search query according to related entity data obtained from related entity sources. At block 302, the identified person and filter elements of the received search query are included as an initial section of the expanded search query. While this may entail simply copying the received search query into the initial section, the initial search query may not necessarily simply be copied. Often a requesting computer user may misspell the name of the person that is sought or any one of the identifying filter elements associated with the person. For example, a received search query may be "Bruse Wayn Microsoft", in an effort to find content corresponding to "Bruce Wayne" who works at "Microsoft". If it can be determined that the name (or one or more filter elements) is misspelled, it would be less productive to include the original search query in the expanded search query. Hence, in block 204 of routine 200, the person is identified. Correction to the filter elements may also be made (though not explicitly called out in routines 200 and 300.)
[0028] In addition to including the query terms of the search query into the expanded search query, query terms are derived from the obtained related entity data and included/incorporated in the expanded search query. In particular, at block 304, the related entities (related to the identified person) from the obtained related entity data are included in a related entities section of the expanded search query in accordance with the determined search model. At block 306, query terms derived from the category data, including both the category (as an entity) and category entities (as described below), are included in a category entities section of the expanded search query according to the search model. Thereafter, at block 308, the expanded search query is returned and the routine 300 terminates.
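The flow of routine 300 can be sketched as below. The `ExpandedQuery` container and the function signature are hypothetical, and the search model weighting is omitted for brevity; only the three sections (initial, related entities, category entities) and the block numbers come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ExpandedQuery:
    initial: List[str] = field(default_factory=list)            # block 302
    related_entities: List[str] = field(default_factory=list)   # block 304
    category_entities: List[str] = field(default_factory=list)  # block 306

def generate_expanded_query(person_name: str, filters: List[str],
                            related: List[str],
                            categories: Dict[str, List[str]]) -> ExpandedQuery:
    query = ExpandedQuery()
    # Block 302: the (already corrected) person name and filter elements.
    # Name parts are joined with "." to express the adjacency preference.
    query.initial = [person_name.replace(" ", ".")] + list(filters)
    # Block 304: entities related to the identified person.
    query.related_entities = list(related)
    # Block 306: each category itself plus its category entities.
    for category, entities in categories.items():
        query.category_entities.append(category)
        query.category_entities.extend(entities)
    # Block 308: return the expanded query.
    return query

q = generate_expanded_query("Bruce Wayne", ["Microsoft"],
                            ["Research"], {"Employee": ["Workgroup"]})
```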
[0029] To better illustrate the above-described sections of the expanded search query, reference is made to Figure 4. Figure 4 illustrates an exemplary expanded search query 400 corresponding to the example above, i.e., for the person "Bruce Wayne". For this example, it is assumed that this identified person, "Bruce Wayne", was associated with only one category, Employee. As shown in the expanded search query 400, the initial section 402 includes the original search query text 404, "Bruce.Wayne", as well as alternative names related to the identified person, in this case "Batman Dark.Knight Matches.Malone Caped.Crusader". Of course, not all computer users will have access rights to all information. In the example above, not all people might know of the alternative names that might uniquely reference "Bruce Wayne". However, when the requesting computer user has full rights, such information may be useful to obtain improved results. Regarding the operator 406 "." between the two names of the search query, this is representative of an exemplary convention to indicate that the two names, "Bruce" and "Wayne", should be viewed as preferring "Bruce" occurring next to "Wayne" in that order, though it is not mandatory that they occur together or that both must occur - only that it is highly preferred. Of course, this convention (as well as the other operators in this Figure) is illustrative only and should not be viewed as limiting upon the disclosed subject matter. Other syntactical conventions include (by way of illustration and not limitation): the operator 408 "inbody:" indicating to the search engine 110 that it should match a document when any one of the words/terms between the parentheses is found in the body of the content; a "noalter:" operator that indicates that the spelling of the terms should not be modified; and a "norelax:" operator that indicates that the terms are important and may not be dropped in matching content. The operator 410 indicates to a search engine a concatenation of other search operators and/or tokens.
[0030] The expanded search query 400 also includes a related entity section 412 that includes the related entities to the identified person of the search query, such as text 416 "Research". Still further included in the expanded search query is a category entities section 414 that includes the category entities of category "Employee". As mentioned above, the category entities section 414 includes the category ("Employee") as well as the category entities such as text 418 "Workgroup". These entries optionally help produce results based on how the computer user likely knows the identified person, in this case "Bruce Wayne". As can be seen, the expanded search query for a particular person takes a search query, such as "Bruce Wayne" and expands the query with related entities as well as category entities to better identify content corresponding to the identified person. Regarding the operator "rankonly:", this operator operates to let the ranking of a document go up as a matching token/value is found in the document, such as "Research". It operates such that the specified terms are not required to be found in a resulting document but, if found, will result in the document being ranked as more relevant. The operator "word:" operates to match on a document if one or more of the tokens in the parentheses, such as "Workgroup", is found in the document. In a sense, the operator "word:" operates as a type of max (or maximum value) operator, comparing each token between the parentheses to the document and returning the single maximum value of the rank of the tokens. Specifically, if more than one token matches, only the value of the greatest matching token is returned. A "norank:" operator (not shown) would require that the specified tokens (identified between the enclosing parentheses) be present in a results document but would not affect the ordering or relevance of the document in the overall results. In combination with the operator "rankonly:", the rank of a document is increased if any one or more of the tokens is found.
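The operator semantics described above can be approximated with the sketch below. Each function returns a pair (matches, rank contribution). The function names, the per-term boost, and the per-token scores are assumptions for illustration and are not the syntax or scoring of any particular search engine.

```python
def rankonly(doc_tokens, terms, boost=1.0):
    """Optional terms: never filter a document, but raise its rank per term found."""
    return True, boost * sum(1 for t in terms if t in doc_tokens)

def word(doc_tokens, term_scores):
    """Match if any token is present; contribute only the single highest token score."""
    found = [score for term, score in term_scores.items() if term in doc_tokens]
    return bool(found), max(found, default=0.0)

def norank(doc_tokens, terms):
    """Required terms: filter on presence but contribute nothing to the rank."""
    return all(t in doc_tokens for t in terms), 0.0
```

For a document containing the tokens "bruce", "wayne", and "research", `rankonly` with the terms "research" and "gotham" still matches and adds one boost, while `norank` with the terms "bruce" and "gotham" rejects the document because "gotham" is absent.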
[0031] While the expanded queries 400 and 500 generally include textual tokens (such as "Bruce.Wayne"), it should be appreciated that this is illustrative and should not be viewed as limiting upon the disclosed subject matter. In alternative embodiments, one or more of the tokens in an expanded search query could be specific identifiers that identify the sought-for person and/or related entities. For example, expanded search query 500 includes an operator 510 that includes a Facebook numerical identifier ("740049358") as well as an operator 512 that includes a Facebook user identifier ("t-drake"). Of course, any particular sources of identifiers may be used and Facebook identifiers are illustrative only.
[0032] As suggested above, an identified person may be associated with more than one category. Hence, while the expanded search query 400 of Figure 4 describes information from a single category, it is for illustration. Similarly, Figure 5 illustrates an exemplary expanded search query 500 corresponding to the example above, i.e., for the identified person "Bruce Wayne", but in this example includes information from two categories, Employer and Education. As can be seen, the expanded search query 500 includes the initial section 502 as well as related entities section 504 and category entities section 506. As can be seen in the related entities section 504 and category entities section 506, as more related entities are found for the identified person and as more information corresponding to various categories for the identified person is obtained, the expanded search queries become more detailed and encompassing to assist the search engine to identify content corresponding to the identified person of the search query.
[0033] At block 212, search results are obtained according to the expanded search query. Obtaining search results according to a search query, in this case a search query with expanded terms according to related entities and categories, is known in the art. According to aspects of the disclosed subject matter, search results are obtained according to the query terms from the received search query and optionally according to the query terms derived from the related entity data. Stated differently, the query terms of the expanded search query that are derived from the related entity data are intended to expand the scope of content/search results that correspond to the identified person, but these query terms that are derived from the related entity data are not mandatory terms. In this manner (i.e., that the query terms derived from the related entity data are "optional"), the expanded search query expands the scope of content that potentially relates to the identified person rather than narrowing the scope of content, as would occur if those query terms were not optional.
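The effect of treating the derived query terms as optional rather than mandatory can be seen in the sketch below. The in-memory document set, the whitespace tokenization, and the hit-count ranking are assumptions for illustration; all terms are assumed to be lowercase.

```python
def search(documents, query_terms, expansion_terms, expansion_optional=True):
    """Match on the original query terms; use expansion terms to rank or to filter."""
    results = []
    for doc_id, text in documents.items():
        tokens = set(text.lower().split())
        if not all(t in tokens for t in query_terms):
            continue  # terms from the received search query are always required
        hits = sum(1 for t in expansion_terms if t in tokens)
        if not expansion_optional and hits < len(expansion_terms):
            continue  # mandatory expansion terms would narrow the result set
        results.append((hits, doc_id))
    # More expansion-term hits rank first; ties fall back to document id.
    results.sort(key=lambda r: (-r[0], r[1]))
    return [doc_id for _, doc_id in results]

docs = {
    "a": "Bruce Wayne funds Gotham charity",
    "b": "Bruce Wayne joins Research workgroup at Microsoft",
    "c": "Clark Kent reports",
}
optional = search(docs, ["bruce", "wayne"], ["research", "workgroup"])
mandatory = search(docs, ["bruce", "wayne"], ["research", "workgroup"],
                   expansion_optional=False)
```

With optional expansion terms, both documents naming the person are returned and the one mentioning the related entities ranks first; with mandatory expansion terms, document "a" is dropped.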
[0034] At block 214, a search results presentation is generated, at least in part, according to the obtained search results. Typically, one or more search results pages are generated according to the obtained search results, with those results scoring the highest being presented in the first pages of the presentation. At block 216, after generating the search results presentation, at least a portion of the presentation is returned to the requesting computer user in response to the search query. According to various embodiments, the results that are returned to the requesting computer user are organized according to the various categories of information regarding the subject person. Thereafter, the routine 200 terminates.
[0035] While not displayed in routine 200, additional steps may be taken after the results are returned to the computer user. By way of illustration and not limitation, one or more processes on the computer user's device may monitor the computer user's activity with regard to the results provided, e.g., which references (hyperlinks) the computer user followed, which were avoided, how long the computer user spent with some content vs. other content, and the like. By monitoring the computer user's activity and submitting it to the search engine, inferences may be made regarding specific people and/or entities such that subsequent queries may take these inferences into account. Indeed, some or all of the inferences, both for and against specific results, may be used to form the search models discussed above.
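One possible way to summarize such monitored activity before submitting it to the search engine is sketched below. The event format (`url`, `followed`, `dwell_seconds`) and the aggregation are assumptions made for this sketch; how inferences are actually drawn from the activity is not specified here.

```python
from collections import defaultdict


def summarize_activity(events):
    """Aggregate per-URL signals from hypothetical activity events.

    Each event is a dict with 'url', 'followed' (bool), and 'dwell_seconds'
    (float; 0.0 when the reference was not followed).
    """
    summary = defaultdict(lambda: {"followed": 0, "skipped": 0, "dwell_seconds": 0.0})
    for event in events:
        entry = summary[event["url"]]
        if event["followed"]:
            entry["followed"] += 1
            entry["dwell_seconds"] += event["dwell_seconds"]
        else:
            entry["skipped"] += 1
    return dict(summary)


events = [
    {"url": "https://example.com/a", "followed": True, "dwell_seconds": 42.0},
    {"url": "https://example.com/b", "followed": False, "dwell_seconds": 0.0},
    {"url": "https://example.com/a", "followed": True, "dwell_seconds": 8.0},
]
activity = summarize_activity(events)
```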
[0036] Regarding routines 200 and 300, while these routines are expressed in regard to discrete steps, these steps should be viewed as being logical in nature and may or may not correspond to any actual and/or discrete steps of a particular implementation. Nor should the order in which these steps are presented in the various routines be construed as the only order in which the steps may be carried out. Moreover, while these routines include various novel features of the disclosed subject matter, other steps (not listed) may also be carried out in the execution of the routines. Further, those skilled in the art will appreciate
that logical steps of these routines may be combined together or be comprised of multiple steps. Steps of routines 200 and 300 may be carried out in parallel or in series, or pre-computed. Often, but not exclusively, the functionality of the various routines is embodied in software (e.g., applications, system services, libraries, and the like) that is executed on computer hardware and/or systems as described below in regard to Figure 6. In various embodiments, all or some of the various routines may also be embodied in hardware modules, including systems on a chip, on a computer system.
[0037] While many novel aspects of the disclosed subject matter are expressed in routines embodied in applications (also referred to as computer programs), apps (small, generally single or narrow purposed, applications), and/or methods, these aspects may also be embodied as computer-executable instructions stored by computer-readable media, also referred to as computer-readable storage media. As those skilled in the art will recognize, computer-readable media can host computer-executable instructions for later retrieval and execution. When the computer-executable instructions stored on the computer-readable storage devices are executed, they carry out various steps, methods and/or functionality, including those steps, methods, and routines described above in regard to routines 200 and 300. Examples of computer-readable media include, but are not limited to: optical storage media such as Blu-ray discs, digital video discs (DVDs), compact discs (CDs), optical disc cartridges, and the like; magnetic storage media including hard disk drives, floppy disks, magnetic tape, and the like; memory storage devices such as random access memory (RAM), read-only memory (ROM), memory cards, thumb drives, and the like; cloud storage (i.e., an online storage service); and the like. For purposes of this disclosure, however, computer-readable media expressly excludes carrier waves and propagated signals.
[0038] Turning now to Figure 6, Figure 6 is a block diagram illustrating exemplary components of a search engine configured to provide improved results in response to a search query from a computer user. As shown in Figure 6, the search engine 110 includes a processor 602 (or processing unit) and a memory 604 interconnected by way of a system bus 610. As those skilled in the art will appreciate, memory 604 typically (but not always) comprises both volatile memory 606 and non-volatile memory 608. Volatile memory 606 retains or stores information so long as the memory is supplied with power. In contrast, non-volatile memory 608 is capable of storing (or persisting) information even when a power supply is not available. Generally speaking, RAM and CPU cache memory
are examples of volatile memory whereas ROM and memory cards are examples of non-volatile memory.
[0039] The processor 602 executes instructions retrieved from the memory 604 in carrying out various functions, particularly in responding to search queries with improved results through query expansion. The processor 602 may be comprised of any of various commercially available processors such as single-processor, multi-processor, single-core, and multi-core units. Moreover, those skilled in the art will appreciate that the novel aspects of the disclosed subject matter may be practiced with other computer system configurations, including but not limited to: mini-computers; mainframe computers; personal computers (e.g., desktop computers, laptop computers, tablet computers, etc.); handheld computing devices such as smartphones, personal digital assistants, and the like; microprocessor-based or programmable consumer electronics; game consoles; and the like.
[0040] The system bus 610 provides an interface for the various components to inter-communicate. The system bus 610 can be of any of several types of bus structures that can interconnect the various components (including both internal and external components). The search engine 110 further includes a network communication component 612 for interconnecting the search engine 110 with other computers (including, but not limited to, user computers such as user computers 102-106, other network sites including network sites 112-116) as well as other devices on a computer network 108. The network communication component 612 may be configured to communicate with other devices and services on an external network, such as network 108, via a wired connection, a wireless connection, or both.
[0041] The search engine 110 also includes a query topic identification component 614 that is configured to identify the subject matter of the search query, such as a person identified in the search query, as described above. Also included in the search engine 110 is a related entity retrieval component 616. The related entity retrieval component 616 obtains related entity data corresponding to related entities of the identified person (or, more generally, related entities of the subject matter of the search query). As previously mentioned, the related entity data includes related entities, categories associated with the identified person, as well as category data corresponding to the associated categories. The related entity retrieval component 616 obtains the related entity data from related entity sources as described above in regard to Figure 2. An expanded query generator 618
generates an expanded search query from the search query received from a computer user according to the related entity data obtained by the related entity retrieval component 616.
[0042] A search results retrieval component is configured to obtain search results from a content store 626 according to the expanded search query generated by the expanded query generator 618. A search model component 624 is configured to select a search model (as described above) and apply the search model to the obtained search results. The search results presentation generator 620 generates a search results presentation, typically including one or more search results pages, for presentation to the requesting computer user in response to the search query.
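The cooperation among these logical components may be sketched as a simple pipeline. In this illustration each component is passed in as a callable; the function name, the parameter names, and the stub behaviors are assumptions made for the sketch and do not describe an actual implementation of the search engine 110.

```python
def respond_to_query(query, identify_person, retrieve_related, expand_query,
                     retrieve_results, apply_model, generate_presentation):
    """Chain the logical components in the order described above."""
    person = identify_person(query)            # query topic identification component 614
    related = retrieve_related(person)         # related entity retrieval component 616
    expanded = expand_query(query, related)    # expanded query generator 618
    results = retrieve_results(expanded)       # search results retrieval component
    ranked = apply_model(results)              # search model component 624
    return generate_presentation(ranked)       # search results presentation generator 620


# Stub components with placeholder behavior, for illustration only.
presentation = respond_to_query(
    "Bruce Wayne",
    identify_person=lambda q: q,
    retrieve_related=lambda person: {"related_entities": [], "categories": {}},
    expand_query=lambda q, related: {"required": [q],
                                     "optional": related["related_entities"]},
    retrieve_results=lambda expanded: [{"title": "result B", "score": 1},
                                       {"title": "result A", "score": 2}],
    apply_model=lambda results: sorted(results, key=lambda r: r["score"], reverse=True),
    generate_presentation=lambda ranked: [r["title"] for r in ranked],
)
```

Passing the components as callables reflects that they are logical components: each may be replaced by a software module, a hardware module, or a separate cooperating process without changing the overall flow.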
[0043] Those skilled in the art will appreciate that the various components of the search engine 110 of Figure 6 described above may be implemented as executable software modules within the computer systems, as hardware modules (including SoCs - system on a chip), or a combination of the two. Moreover, each of the various components may be implemented as an independent, cooperative process or device, operating in conjunction with one or more computer systems. It should be further appreciated, of course, that the various components described above in regard to the search engine 110 should be viewed as logical components for carrying out the various described functions. As those skilled in the art appreciate, logical components (or subsystems) may or may not correspond directly, in a one-to-one manner, to actual, discrete components. In an actual embodiment, the various components of each computer system may be combined together or broken up across multiple actual components and/or implemented as cooperative processes on a computer network 108.
[0044] In addition to operating on a search engine 110, aspects of the disclosed subject matter may be implemented on other computing devices and/or distributed on multiple computing devices, including a computer user's device. For example, according to various embodiments at least some highly relevant content to a search request may be hosted on a site that is access-protected, i.e., the content is available to the computer user when he/she is authenticated and/or maintains an open log-in status with the site, but the content is otherwise restricted to others. In response to a search request from the computer user, a search engine (or other service) may indirectly obtain related entity data from this access-restricted site by way of the computer user's device; the computer user's device (e.g., upon which the computer user maintains a current logged in status with the site) accesses related entity data on behalf of the search service. Indeed, in various embodiments, one or more
components on the computer user's device obtain data corresponding to others from the access-restricted sites in anticipation of a search request.
[0045] While much of the disclosed subject matter has been made in regard to a computer user taking an active role in obtaining content relating to a particular person, aspects of the disclosed subject matter may be suitably and advantageously applied to auto-generation of content relating to people. For example, various search queries regarding one or more persons (expanded search queries) may be made such that the "latest" content on the Internet regarding that person (or persons) may already be available when requested. Yet another example would be to set up an environment such that a user may be notified when a new image/video/news story of that user appears on the Internet. Of course, aspects of the disclosed subject matter may be applied to topics or entities other than people. For example, an auto-generation page may be set up to display the latest regarding rock climbing, the Supreme Court, and the like.
[0046] While various novel aspects of the disclosed subject matter have been described, it should be appreciated that these aspects are exemplary and should not be construed as limiting. Variations and alterations to the various aspects may be made without departing from the scope of the disclosed subject matter.