US20230113420A1 - Predicting accuracy of submitted data - Google Patents

Predicting accuracy of submitted data

Info

Publication number
US20230113420A1
Authority
US
United States
Prior art keywords
user
knowledge base
subsystems
search
search system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/064,657
Inventor
Krzysztof Czuba
Evgeniy Gabrilovich
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Priority to US18/064,657
Assigned to GOOGLE LLC (entity conversion). Assignors: GOOGLE INC.
Assigned to GOOGLE INC. (assignment of assignors interest; see document for details). Assignors: CZUBA, KRZYSZTOF; GABRILOVICH, EVGENIY
Publication of US20230113420A1
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/02: Knowledge representation; Symbolic representation
    • G06N5/022: Knowledge engineering; Knowledge acquisition
    • G06N20/00: Machine learning
    • G06N20/10: Machine learning using kernel methods, e.g. support vector machines [SVM]

Definitions

  • This specification relates to determining whether a submission of data by a user is accurate.
  • a search system can provide one or more knowledge panels in response to a received search query.
  • a knowledge panel is a user interface element that provides a collection of information or other content related to a particular entity referenced by the search query.
  • the entity may be a person, place, country, landmark, animal, historical event, organization, business, sports team, sporting event, movie, song, album, game, work of art, or any other entity.
  • a knowledge panel provides a summary of information about the entity.
  • a knowledge panel for a famous singer may include the name of the singer, an image of the singer, a description of the singer, one or more facts about the singer, content that identifies songs and albums recorded by the singer, and/or links to searches related to the singer.
  • Other types of information and content can also be presented in the knowledge panel.
  • Information presented in a knowledge panel can include content obtained from multiple disparate sources, e.g., multiple different web pages accessible over the Internet.
  • a search system can maintain a knowledge base that stores information about various entities.
  • the system can assign a unique entity identifier to each entity.
  • the system can also assign one or more text string aliases to a particular entity.
  • the Statue of Liberty can be associated with aliases “the Statue of Liberty” and “Lady Liberty.” Aliases need not be unique among entities.
  • “jaguar” can be an alias both for an animal and for a car manufacturer.
  • the system can also store information about an entity's relationship to other entities.
  • the system can define a “located in:” relationship between two entities to reflect, for example, that the Statue of Liberty is located in New York City.
  • the system stores relationships between entities in a representation of a graph in which nodes represent distinct entities and links between nodes represent relationships between the entities.
  • the system could maintain a node representing the Statue of Liberty, a node representing New York City, and a link between the nodes to represent that the Statue of Liberty is located in New York City.
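  • The entity, alias, and relationship storage described above can be sketched as a small in-memory graph. This is a minimal illustration only: the class name `KnowledgeGraph`, its methods, and the entity identifiers such as `e/statue_of_liberty` are hypothetical and do not come from the specification.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy knowledge base: entities are nodes, relationships are labeled links."""

    # alias text (lowercased) -> set of entity identifiers
    aliases: dict = field(default_factory=lambda: defaultdict(set))
    # (source entity id, relation name) -> set of target entity identifiers
    links: dict = field(default_factory=lambda: defaultdict(set))

    def add_alias(self, entity_id: str, alias: str) -> None:
        # Aliases need not be unique, so each alias maps to a set of entities.
        self.aliases[alias.lower()].add(entity_id)

    def add_link(self, source_id: str, relation: str, target_id: str) -> None:
        self.links[(source_id, relation)].add(target_id)

    def entities_for(self, alias: str) -> set:
        return set(self.aliases.get(alias.lower(), set()))

    def related(self, source_id: str, relation: str) -> set:
        return set(self.links.get((source_id, relation), set()))


# Hypothetical identifiers, chosen only for this example.
kg = KnowledgeGraph()
kg.add_alias("e/statue_of_liberty", "the Statue of Liberty")
kg.add_alias("e/statue_of_liberty", "Lady Liberty")
kg.add_alias("e/jaguar_animal", "jaguar")
kg.add_alias("e/jaguar_cars", "jaguar")
kg.add_link("e/statue_of_liberty", "located in", "e/new_york_city")
```

  • Mapping each alias to a set of identifiers keeps the "jaguar" ambiguity explicit, so a later step can choose among the candidate entities.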
  • This specification describes how a system can compute a likelihood that a user will provide accurate updates to a knowledge base based on information in the user's profile.
  • the system can train a model using previous knowledge base submissions by users and use the model to predict whether a particular user will provide accurate updates.
  • one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving, from a user, an update to an attribute of an entity related to a topic; obtaining user profile data of the user; determining from the user profile data that the user is reliable relative to the topic; and in response to determining that the user is reliable relative to the topic, updating a knowledge base with the update to the attribute of the entity.
  • Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
  • a system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions.
  • One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.
  • Determining that the user is reliable relative to the topic comprises computing, using the user profile data as input to a user model, a likelihood that an update from the user to an entity related to the topic is accurate; and determining that the computed likelihood satisfies a threshold.
  • the user model is trained using training examples that represent previously submitted updates to the knowledge base by users and whether the previously submitted updates were accurate.
  • Each training example includes information from a user profile of a user that submitted the corresponding update.
  • the information from the user profile includes one or more statistics describing the accuracy of knowledge base submissions by the user or a topic of interest and a level of expertise for the topic of interest.
  • the information from the user profile includes information about subsystems accessed by the user.
  • the update to the attribute of the entity includes an update to a value of an existing attribute of the entity stored in the knowledge base.
  • the update to the attribute of the entity includes a new attribute of the entity that was previously not stored in the knowledge base.
  • the threshold is different for an existing attribute of the entity than for a new attribute for the entity.
  • the updated entity attribute in the knowledge base is provided in response to search requests by users.
  • another innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving, from a user, a search request related to a topic; obtaining user profile data of the user; determining from the user profile data that the user is reliable relative to the topic; in response to determining that the user is reliable relative to the topic, providing to the user a request for an update to an attribute of an entity related to the topic; receiving, from the user, an update to the attribute of the entity related to the topic; and updating a knowledge base with the update to the attribute of the entity.
  • Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
  • Determining that the user is reliable relative to the topic comprises computing, using the user profile data as input to a user model, a likelihood that an update from the user to an entity related to the topic is accurate; and determining that the computed likelihood satisfies a threshold.
  • the user model is trained using training examples that represent previously submitted updates to the knowledge base by users and whether the previously submitted updates were accurate.
  • Each training example includes information from a user profile of a user that submitted the corresponding update.
  • the information from the user profile includes one or more statistics describing the accuracy of knowledge base submissions by the user or a topic of interest and a level of expertise for the topic of interest.
  • the information from the user profile includes information about subsystems accessed by the user.
  • Providing to the user a request for an update to an attribute of an entity related to the topic comprises providing a knowledge panel that presents one or more items of information about the entity and requests the update to the attribute of the entity.
  • Receiving, from a user, an update to an attribute of an entity related to a topic comprises receiving the update to the attribute of the entity through a user interface control of the knowledge panel.
  • a search system can automatically determine whether a submission from a user is likely to be accurate based on the accuracy of previous submissions received from the user or other indications of user trustworthiness. This can reduce the amount of erroneous or spam inputs to the knowledge base.
  • a search system is more likely to receive accurate data updates for a particular topic by asking users who are interested in the particular topic to provide updates on the topic. This can reduce the likelihood that a user will be annoyed by being asked to provide an update and can increase the likelihood of receiving a response from a user.
  • FIG. 1 illustrates an example search results page that includes a knowledge panel.
  • FIG. 2 is a diagram of an example system.
  • FIG. 3 is a flow chart of an example process for training a user model.
  • FIG. 4 is a flow chart of an example process for computing the likelihood that a user will provide an accurate update.
  • FIG. 5 is a flow chart of an example process for asking particular users to update knowledge base information.
  • Some search systems allow users to update data stored in a knowledge base.
  • an update to the knowledge base updates an [attribute, value] pair associated with an entity. For example, a user can submit an update to the knowledge base for an existing attribute “date of birth,” where the updated value is “Feb. 12, 1809.”
  • a user may also submit an update to the knowledge base for a new attribute associated with a person entity, e.g., “Favorite food,” and a corresponding new value, e.g., “pizza,” providing both the new attribute and the new value.
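  • The two kinds of update described above (a new value for an existing attribute, and a new attribute with its value) can be sketched as follows. The `AttributeUpdate` type, the `apply_update` helper, the dictionary layout, and the entity identifier are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeUpdate:
    """One [attribute, value] pair submitted for an entity."""
    entity_id: str
    attribute: str
    value: str


def apply_update(knowledge_base: dict, update: AttributeUpdate) -> bool:
    """Store the update; return True if the attribute was new for the entity."""
    attributes = knowledge_base.setdefault(update.entity_id, {})
    is_new_attribute = update.attribute not in attributes
    attributes[update.attribute] = update.value
    return is_new_attribute


# Hypothetical entity id and starting value.
kb = {"e/some_person": {"date of birth": "unknown"}}
was_new_1 = apply_update(kb, AttributeUpdate("e/some_person", "date of birth", "Feb. 12, 1809"))
was_new_2 = apply_update(kb, AttributeUpdate("e/some_person", "Favorite food", "pizza"))
```

  • Returning whether the attribute was new lets a caller treat the two cases differently, for example by applying a different reliability threshold.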
  • a search system can create a user model to determine whether a particular user is likely to provide accurate updates to the knowledge base.
  • the search system can use machine learning to generate the user model based on the accuracy of updates to the knowledge base previously entered by the user and profile information associated with the user.
  • the search system can use the user model to predict the accuracy of knowledge base updates submitted by the user, determine whether to update the knowledge base with the submitted data, and determine whether or not to ask the user for specific data or verification of data in the knowledge base.
  • FIG. 1 illustrates an example search results page 100 that includes a knowledge panel 130 .
  • a user can submit the query 102 to a search system through a graphical user interface of a software application, e.g., a web browser, or through a user interface of some other software application installed on a user device, e.g., a spoken query issued through a speech recognition application installed on a mobile user device.
  • the search system can provide a search results page 100 in a form that can be presented on the user device.
  • the search results page 100 can be provided as a markup language document, e.g., a HyperText Markup Language document, and the user device can render the document, e.g., using a web browser, in order to present the search results page 100 on a display of the user device.
  • the search results page 100 includes three search results 122 a - c that the search system has obtained in response to the query 102 .
  • Each of the search results 122 a - c includes a title, a display link, and a text snippet.
  • Each of the search results 122 a - c is also linked to a respective resource, e.g., a web page at a location indicated by the display link. User selection of a search result will cause the application to navigate to the linked resource.
  • the search results page 100 also includes an indicator 110 that the user is currently logged in.
  • the search results page 100 also includes a knowledge panel 130 corresponding to an entity with an alias corresponding to the search query 102 .
  • the entity is Abraham Lincoln.
  • the knowledge panel 130 includes various items of information about Abraham Lincoln.
  • the knowledge panel 130 includes an entity name 132 , a picture of the entity 133 , and items of information 134 , including an occupation, a date of birth, a date of death, and a spouse's name.
  • the search system can provide the knowledge panel 130 as an interface for the user to update one or more items of information maintained by the search system in the knowledge base.
  • the search system can invite the user to correct a specific one of the items of information 134 , or the search system can, upon user selection of any of the items of information 134 , provide an editable text-input field 136 for editing the item of information.
  • the search system can provide an editable text-input field 136 through which the user can edit that particular item of information.
  • the user can submit the information, e.g., by selecting a “Submit” user interface control 138 .
  • the system can then evaluate the submitted information based on one or more criteria, e.g., the user's reliability or data submitted by other users. If the system determines that the update is likely to be accurate, the system can update the knowledge base with the submitted information. In this way, the system can use the knowledge panel 130 as an efficient way to ask for updates to information maintained by the search system from one place and in-line, e.g., without having to navigate away from the search results page 100 .
  • FIG. 2 is a diagram of an example system 200 .
  • the system includes a user device 210 coupled to a search system 230 over a network 220 .
  • the search system 230 is an example of an information retrieval system in which the systems, components, and techniques described below can be implemented.
  • the user device 210 transmits a query 212 to the search system 230 , e.g., over the network 220 .
  • the query 212 includes one or more terms and can include other information, for example, a location and a type of the user device 210 .
  • the search system 230 generates a response, generally in the form of a search results page 216 .
  • the search results page 216 can include search results 213 that the search system 230 has identified as being responsive to the query 212 .
  • the search system 230 can provide a data request 214 that requests an update to a particular item of information about the entity in the knowledge base 262 .
  • the data request 214 can be included in a knowledge panel for the entity, which can be used as an interface for the user to update the requested items of information.
  • the search system 230 transmits the search results page 216 over the network 220 back to the user device 210 for presentation to a user.
  • the search system 230 can receive updated information 218 that is either initiated by the user or initiated by a data request 214 .
  • the updated information can be received, for example, through a knowledge panel provided on the search results page 216 .
  • the search system 230 can then use the updated information 218 to update the knowledge base 262 .
  • the user device 210 can be any appropriate type of computing device, e.g., mobile phone, tablet computer, notebook computer, music player, e-book reader, laptop or desktop computer, PDA (personal digital assistant), smart phone, a server, or other stationary or portable device, that includes one or more processors 208 for executing program instructions and memory 206 , e.g., random access memory (RAM).
  • the user device 210 can include non-volatile computer readable media that store software applications, e.g., a browser or layout engine, an input device, e.g., a keyboard or mouse, a communication interface, and a display device.
  • the network 220 can be, for example, a wireless cellular network, a wireless local area network (WLAN) or Wi-Fi network, a mobile telephone network or other telecommunications network, a wired Ethernet network, a private network such as an intranet, a public network such as the Internet, or any appropriate combination of such networks.
  • the search system 230 can be implemented as computer programs installed on one or more computers in one or more locations that are coupled to each other through a network.
  • the search system 230 includes a search system front end 240 , a search engine 250 , a data request module 260 , and a machine learning module 270 .
  • the computing device or devices that implement the search system front end 240 , the search engine 250 , the data request module 260 , and the machine learning module 270 may include similar components.
  • the search system 230 includes a user database 272 that stores information about users who access the search system 230 .
  • the user database 272 may include a user profile for each of the users who access the search system.
  • a user profile can include previous submissions by the user to the knowledge base and whether such submissions were accurate or not.
  • a user profile for registered or unregistered users may include user interactions with subsystems of the search system, e.g., a web search system, an image search system, a map system, an email system, a social network system, a blogging system, a shopping system, just to name a few, topics of interest, and an indication of a level of expertise of the user for each of the topics of interest, e.g., novice or expert.
  • the topics of interest and levels of expertise may include user-provided data or system-generated data based on a user's interaction with the search system.
  • the search system may determine that a specific user is interested in French restaurants based on a search history of the specific user or search results selected by the specific user. The search system may then add “restaurants” to the user's topics of interest.
  • users are distinguished by the IP addresses of the user devices used in performing the activities.
  • activities are recorded by the interactive system involved in the activity.
  • activity information is also, or alternatively, collected with the consent of the user by an application, e.g., a web browser toolbar, running on the user's device.
  • users may be given an opportunity to control whether the personal information about the users is collected.
  • certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed.
  • the search system front end 240 receives the query 212 from the user device 210 and routes the query 212 to the search engine 250 and the data request module 260 .
  • the search system front end 240 also provides the resulting search results page 216 that includes the search results 213 and the knowledge panel 214 to the user device 210 .
  • the search system front end 240 acts as a gateway, or interface, between user devices and the search system 230 .
  • the search engine 250 receives the query 212 and generates search results 213 that are responsive to the query.
  • the search engine 250 will generally include an indexing engine for indexing resources in a collection of resources.
  • the search engine 250 can index web pages found in a collection of web pages, e.g., web pages on the Internet.
  • a collection of resources indexed by the indexing engine may, but need not, be stored within search system 230 , e.g., in index database 252 .
  • the search engine 250 can rank the search results 213 using conventional methods and route the ranked search results 213 back to search system front end 240 for inclusion in the search results page 216 .
  • the data request module 260 receives the query 212 and determines whether the search system 230 should provide a knowledge panel in a response to the query as well as whether to ask the user through a data request 214 to update information about a particular entity. In some implementations, the data request 214 is presented through a knowledge panel on the search results page 216 .
  • the data request module 260 can determine whether the search system 230 should provide a knowledge panel using a data structure of the knowledge base 262 that maps an alias to one or more entities, e.g. an entity alias index.
  • For example, the alias “Bush” can be mapped to a set of entities having that alias, e.g., the entity “George W. Bush,” the entity “George H. W. Bush,” the entity for the rock band “Bush,” and the entity for a category of plants having that alias.
  • the entity alias index may also include a score for each entity that represents a likelihood that the alias refers to each particular entity.
  • the data request module 260 can use some or all of the query 212 as input to the entity alias index.
  • the data request module 260 can use a returned entity for the query to present an entity in a knowledge panel.
  • the data request module 260 can also use a returned entity to identify a topic of the query, which can be used to compute a likelihood that the user will provide accurate updates for the topic.
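  • A scored entity alias index of the kind described above might look like the sketch below. The score values are made up for illustration, and the names `ALIAS_INDEX`, `lookup_entities`, and `best_entity` are hypothetical; the specification does not say how scores are computed or stored.

```python
# alias -> list of (entity, score); a score is the assumed likelihood that the
# alias refers to that entity. All numbers here are invented for the example.
ALIAS_INDEX = {
    "bush": [
        ("George W. Bush", 0.50),
        ("George H. W. Bush", 0.30),
        ("Bush (rock band)", 0.15),
        ("Bush (category of plants)", 0.05),
    ],
}


def lookup_entities(query: str, min_score: float = 0.0) -> list:
    """Return (entity, score) pairs for the query, highest score first."""
    candidates = ALIAS_INDEX.get(query.strip().lower(), [])
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [(entity, score) for entity, score in ranked if score >= min_score]


def best_entity(query: str):
    """Return the most likely entity for the query, or None if no alias matches."""
    ranked = lookup_entities(query)
    return ranked[0][0] if ranked else None
```

  • A caller could use `best_entity` both to pick the entity shown in a knowledge panel and to derive the topic used when scoring the user's reliability.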
  • the data request module 260 evaluates the accuracy of the updated information 218 to determine whether to update the knowledge base 262 with the updated information 218 .
  • the data request module 260 can determine both whether to provide a data request 214 as well as the accuracy of the updated information 218 by using a user model 217 generated by the machine learning module 270 .
  • the machine learning module 270 receives the user profile data from user database 272 , and generates a user model that predicts the likelihood that a particular user will submit accurate information for the knowledge base.
  • the user model can be trained using one or more items of information in the user profiles, e.g., previous knowledge base submissions on various topics, the accuracy of such submissions, topics of interest of the users, and a level of expertise of the users for each topic of interest.
  • FIG. 3 is a flow chart of an example process for training a user model.
  • the system receives previous knowledge base submissions by users and user profile data of the submitting users.
  • the system trains a user model that can be used to compute a likelihood that a user having particular user profile data will provide an accurate update for a topic.
  • the process can be implemented by one or more computer programs installed on one or more computers.
  • the process will be described as being performed by a system of one or more computers, e.g. the machine learning module 270 of FIG. 2 .
  • the system obtains previous knowledge base submissions ( 310 ).
  • the system can train the user model using training data that includes previous knowledge base submissions on various topics along with information from profiles of the users who provided the submissions.
  • the training data includes training examples that each represent a previous user submission and one or more features of that particular submission.
  • the features can include a topic of the previous user submission and one or more items of information from the user's profile, e.g., a measure of the accuracy of the user's other knowledge base submissions, topics of interest in the user's profile, a level of expertise for each topic of interest, and other subsystems used by the user, for example.
  • Each training example can be labeled to indicate whether the previous submission by a user was accurate, e.g., with a score ranging from 0 to 1 or with a binary classification as “good”/“bad,” or “reliable”/“unreliable.”
  • the training data is hand-labeled by administrators of the knowledge base.
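  • One possible shape for a labeled training example, and a way to turn it into a numeric feature vector, is sketched below. The field names, the "expert"/"novice" encoding, and the choice of four features are assumptions; the specification lists the kinds of features but not a concrete encoding.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingExample:
    topic: str                     # topic of the previous submission
    prior_accuracy: float          # share of the user's other submissions judged accurate
    expertise_by_topic: dict = field(default_factory=dict)   # e.g. {"history": "expert"}
    subsystems_used: frozenset = frozenset()                 # e.g. {"web search", "maps"}
    label: float = 0.0             # 1.0 = accurate ("good"), 0.0 = inaccurate ("bad")


def to_feature_vector(example: TrainingExample) -> list:
    """Encode the example as [accuracy, topic-of-interest flag, expert flag, subsystem count]."""
    level = example.expertise_by_topic.get(example.topic)
    return [
        example.prior_accuracy,
        1.0 if example.topic in example.expertise_by_topic else 0.0,
        1.0 if level == "expert" else 0.0,
        float(len(example.subsystems_used)),
    ]


example = TrainingExample(
    topic="history",
    prior_accuracy=0.8,
    expertise_by_topic={"history": "expert", "restaurants": "novice"},
    subsystems_used=frozenset({"web search", "maps"}),
    label=1.0,
)
vector = to_feature_vector(example)
```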
  • the system obtains statistics of a previous knowledge base submission ( 320 ).
  • One example feature of the training examples is a measure of the accuracy of previous knowledge base submissions of the user.
  • the system can select the previous knowledge base submissions associated with the user's profile and determine the accuracy of those submissions, e.g., based on the submissions being added to the knowledge base or updates being later changed back, e.g. by knowledge base curators, to a previous version.
  • the accuracy of a particular submission can be determined according to whether the submission was added to the knowledge base, for example, by considering a revision history of the knowledge base after the user's submission.
  • the accuracy of a particular submission can also be determined by verification by other knowledge base users, by an expert, or by an administrator of the search system.
  • the system can compute statistics to indicate the accuracy of the previous knowledge base submissions that the user has made, for example, a ratio of correct to incorrect submissions. For example, the system can consider a first user with a high ratio of correct to incorrect submissions to be more reliable than a second user with a lower ratio of correct to incorrect submissions.
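  • The ratio of correct to incorrect submissions can be computed as in the sketch below. The function name and the handling of users with no incorrect submissions (an infinite ratio) or no submissions at all (`None`) are assumptions; the specification does not address those edge cases.

```python
import math


def accuracy_stats(outcomes: list) -> dict:
    """Summarize a user's past submissions; each outcome is True if it was accurate."""
    correct = sum(1 for accurate in outcomes if accurate)
    incorrect = len(outcomes) - correct
    if incorrect:
        ratio = correct / incorrect
    else:
        # Assumed convention: infinite ratio if all submissions were correct,
        # None if the user has no submissions at all.
        ratio = math.inf if correct else None
    share_correct = correct / len(outcomes) if outcomes else None
    return {
        "correct": correct,
        "incorrect": incorrect,
        "ratio": ratio,
        "share_correct": share_correct,
    }


first_user = accuracy_stats([True] * 9 + [False])
second_user = accuracy_stats([True, False, False])
```

  • Here the first user (9 correct, 1 incorrect) has a higher ratio than the second (1 correct, 2 incorrect) and would be treated as more reliable. The bounded `share_correct` value may be easier to feed to a model than the unbounded ratio.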
  • the system obtains topics of interest and levels of expertise ( 330 ).
  • Another example feature for the training examples includes a level of expertise for each of the topics of interest in the user's profile.
  • the system generates the levels of expertise automatically. For example, the system can determine a level of expertise based on input received from a user and based on the types of documents that the user subsequently accesses. For example, the search system can determine that a user who views highly technical documents may be an expert in a particular field, e.g., medicine or technology. Conversely, the search system can determine that a user who views only general documents associated with the same field is a novice.
  • the search system can use any appropriate algorithm to determine a level of expertise for a specific user with respect to a specific topic of interest.
  • the search system may use machine learning to create an expertise model to determine a level of expertise for topics associated with profiles of users who access the knowledge base.
  • the expertise model can be trained by using a measure of the language sophistication of resources accessed by users as input in order to classify resources as those that would be visited by experts or novices on a particular topic. The system can then use resources visited by a user to determine whether the user is an expert or a novice for the topic.
  • the system obtains information about other subsystems accessed by the user ( 340 ).
  • Another example feature for the training examples includes information about other subsystems of the search system accessed by a user.
  • a higher number of subsystems accessed by the same user is a signal of legitimacy for the associated user.
  • a user who has accessed only one subsystem is more likely to be suspect.
  • a user profile that is associated with the search engine and a social networking website will generally be more likely to have a high predicted accuracy than a user profile that is only associated with the search engine, assuming all other scoring factors are the same.
  • the system trains the user model ( 350 ).
  • the machine learning module 270 uses the labeled training examples to train the user model.
  • the module can be implemented with any appropriate supervised learning algorithm that uses labeled training data, e.g. a support vector machine, logistic regression, or nearest-neighbor classifiers.
  • the machine learning module performs active learning and updates the user model as the knowledge base receives additional data from users. For example, the machine learning module updates the user model or creates a new user model according to a schedule, e.g., monthly or yearly, or at another predetermined time, e.g., one specified by an administrator.
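  • As one concrete instance of the supervised learning step, the sketch below trains a logistic regression model with plain batch gradient descent. Logistic regression is one of the algorithms named above, but the feature layout, the tiny data set, and the hyperparameters are invented for illustration and are not taken from the specification.

```python
import math


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_user_model(features, labels, epochs=2000, learning_rate=0.1):
    """Fit logistic regression weights and bias by batch gradient descent."""
    n_features = len(features[0])
    weights, bias, n = [0.0] * n_features, 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_likelihood(model, x) -> float:
    """Likelihood that a submission with feature vector x is accurate."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Invented examples: [prior accuracy, expert-on-topic flag, subsystems used].
X = [[0.9, 1.0, 3.0], [0.8, 1.0, 2.0], [0.7, 0.0, 3.0],
     [0.2, 0.0, 1.0], [0.1, 0.0, 1.0], [0.3, 0.0, 1.0]]
y = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]   # 1.0 = submission was accurate
model = train_user_model(X, y)
```

  • Retraining on a schedule, as described above, would amount to calling `train_user_model` again on the enlarged set of labeled examples.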
  • FIG. 4 is a flow chart of an example process for computing the likelihood that a user will provide an accurate update.
  • the system receives a data update from a user on a particular topic.
  • the system can then determine a likelihood that the user will provide accurate data on the particular topic using information in the user's profile.
  • the process can be implemented by one or more computer programs installed on one or more computers.
  • the process will be described as being performed by a system of one or more computers, e.g. the data request module 260 of FIG. 2 .
  • the system receives an update from a user for a topic ( 410 ).
  • the system can receive an update from a user through a knowledge panel provided as part of a search results page, as illustrated in FIG. 1 .
  • the system can determine the topic, for example, by determining one or more entities for which a query submitted by the user is an alias.
  • the system can also receive an update from a user who is browsing and submitting updates to a knowledge base through a direct interface to the knowledge base, in which case the topic can be determined from an entity associated with the update.
  • the system obtains user profile data of the user ( 420 ).
  • the system determines that the user is reliable relative to the topic ( 430 ).
  • a user can be considered reliable relative to a topic if the system determines that the user is likely to provide updates to the knowledge base that are accurate.
  • the system can use the obtained user profile data of the user and the topic of the update as input to a user model to determine the likelihood that an update from the user on the topic is accurate. Generally, if the determined likelihood satisfies a threshold, the system can determine that the user is reliable relative to the topic. The system can then update the knowledge base accordingly without further intervention or inspection by knowledge base administrators.
  • the system can compute features from the user's profile data and use the features as input to the user model, including the user's previous knowledge base submissions, topics of interest, etc., as described above.
  • the system can then use the features as input to the user model to compute a likelihood that the user's update for the topic is accurate.
  • if the system determines that there is no information available about the user other than the current submission, the system assigns the user a default likelihood of providing an accurate update for the topic.
  • the system can seek to verify the submission using input from one or more other users before updating the knowledge base. For example, the system can wait for additional submissions by other users and compute an aggregate likelihood that a particular update to an attribute is reliable. Once a cumulative likelihood of the submissions satisfies a threshold, the system can then determine that the knowledge base should be updated with a value provided by the user submissions.
  • the system can weight each of the responses to determine a response that has the highest probability of being accurate. For example, when the system receives an update to a phone number of a restaurant from five different users, the system can determine weights for the responses based on the computed likelihood associated with each of the users. Thus, updates from users with a higher likelihood of accuracy can outweigh updates from users with a lower likelihood of accuracy.
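  • The weighting of competing submissions described above can be sketched as follows. This is a minimal illustration, assuming each submission already carries a likelihood computed by the user model. The function name `pick_weighted_value`, the choice of summing likelihoods as weights, and the sample phone numbers are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def pick_weighted_value(submissions):
    """Pick the submitted value with the largest total weight.

    `submissions` is a list of (value, likelihood) pairs, where likelihood
    is the model's estimate that the submitting user is accurate.
    Returns (best_value, share), where share is the winning value's
    fraction of the total weight across all submissions.
    """
    if not submissions:
        raise ValueError("at least one submission is required")

    # Assumption: a value's weight is the sum of its submitters' likelihoods.
    totals = defaultdict(float)
    for value, likelihood in submissions:
        totals[value] += likelihood

    best_value = max(totals, key=totals.get)
    total_weight = sum(totals.values())
    share = totals[best_value] / total_weight if total_weight else 0.0
    return best_value, share


# Five users submit a phone number for the same restaurant.
phone_updates = [
    ("555-0100", 0.9),
    ("555-0199", 0.3),
    ("555-0199", 0.3),
    ("555-0100", 0.8),
    ("555-0199", 0.2),
]
winner, winner_share = pick_weighted_value(phone_updates)
```

  • In this sketch two high-likelihood users outweigh three low-likelihood users. A caller could compare the returned share against a threshold before updating the knowledge base.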
  • if the search system receives a submission from a user who has a low computed likelihood of providing a reliable update, the search system discards the submission and does not update the knowledge base.
  • the search system may also maintain records of such low-likelihood submissions for aggregation with previous and future submissions by other users.
  • the system can also use different thresholds for updates to existing attributes and new attributes. For example, if the user submits a new attribute, the system can require a higher likelihood that the user will provide accurate updates for the topic than it would if the attribute were an existing attribute for the entity.
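  • A minimal sketch of the two-threshold check described above follows. The threshold values, the constant names, and the function `should_apply_update` are hypothetical; the specification states only that a new attribute can require a higher likelihood than an existing one.

```python
# Hypothetical thresholds: a new attribute demands more trust than an edit
# to an attribute the entity already has.
EXISTING_ATTRIBUTE_THRESHOLD = 0.7
NEW_ATTRIBUTE_THRESHOLD = 0.9


def should_apply_update(likelihood, entity_attributes, attribute):
    """Return True if the user's likelihood satisfies the applicable threshold.

    `entity_attributes` maps attribute names already stored for the entity
    to their current values; `attribute` is the name being updated.
    """
    if attribute in entity_attributes:
        threshold = EXISTING_ATTRIBUTE_THRESHOLD
    else:
        threshold = NEW_ATTRIBUTE_THRESHOLD
    return likelihood >= threshold


person = {"date of birth": "Feb. 12, 1809"}
edit_existing = should_apply_update(0.8, person, "date of birth")
add_new = should_apply_update(0.8, person, "favorite food")
```

  • With these assumed values, a user with a likelihood of 0.8 may correct the existing date of birth but may not introduce the new "favorite food" attribute.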
  • the system updates a knowledge base with the received data update ( 440 ). After determining that the knowledge base should be updated, the system can change the value of the attribute as provided by the user. Generally, updating the attribute requires no confirmation by knowledge base administrators and will cause other users that subsequently access knowledge base information, e.g., through information presented in a knowledge panel, to be provided with the updated information.
  • FIG. 5 is a flow chart of an example process for asking particular users to update knowledge base information.
  • the system receives a search request from a user and determines whether to ask the user to provide an update to an attribute of an entity in a knowledge base.
  • the process can be implemented by one or more computer programs installed on one or more computers. The process will be described as being performed by a system of one or more computers, e.g., the search system 230 of FIG. 2 .
  • the system receives a search request from a user on a particular topic ( 510 ).
  • the system can receive a search query from a user who is logged into the system.
  • the system can then determine a topic from the search query, for example, by determining an entity for which the search query is an alias.
  • the system can also receive other types of search requests and determine topics from the other types of search requests.
  • the system can receive, from a user, a request for news stories, map data, social networking data, or other requests for other types of data from one or more subsystems of the system.
  • the system obtains user profile data of the user ( 520 ).
  • the system determines that the user is reliable relative to the topic ( 530 ).
  • the system can, for example, use information in the user profile data to compute features that can be used as input to the user model, as described in more detail above with reference to FIG. 3 .
  • the system can use the user model to compute a likelihood that the user will provide accurate data for the particular topic.
  • the system can also use the user model to determine which users to provide questions to and when to provide the questions. For example, the system can identify multiple users whose search request is relevant to a particular topic or whose recent search history is relevant to a particular topic. The system can then rank the users according to their respective predicted likelihoods of providing accurate updates to entities related to the particular topic. The system can then choose one or more highest-ranking users to ask for updates.
  • the system may also consider a time of day of the received request. For example, the system may ask users for updates only during each user's non-working hours, according to the user's local time. Thus, the system can highly rank those users whose request was received during non-working hours for the geographic region from which the request was received.
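  • The ranking of candidate users described above can be sketched as follows. This is an illustration only: the record fields (`user_id`, `likelihood`, `local_hour`), the 9-to-17 working-hours window, and the function `rank_candidates` are assumptions made for the example.

```python
def rank_candidates(candidates, top_k=1, work_start=9, work_end=17):
    """Return the top_k users to ask for an update.

    Each candidate is a dict with:
      user_id     - identifier of the user
      likelihood  - predicted likelihood of an accurate update on the topic
      local_hour  - hour (0-23) of the request in the user's local time
    Users whose request arrived outside working hours rank first; ties
    within each group are broken by likelihood, highest first.
    """
    def sort_key(candidate):
        off_hours = not (work_start <= candidate["local_hour"] < work_end)
        return (off_hours, candidate["likelihood"])

    return sorted(candidates, key=sort_key, reverse=True)[:top_k]


candidates = [
    {"user_id": "a", "likelihood": 0.95, "local_hour": 11},
    {"user_id": "b", "likelihood": 0.80, "local_hour": 20},
    {"user_id": "c", "likelihood": 0.60, "local_hour": 7},
]
chosen = rank_candidates(candidates, top_k=2)
```

  • Here user "a" has the highest likelihood but searched during assumed working hours, so users "b" and "c" are chosen first. A stricter variant could exclude working-hours requests entirely rather than rank them lower.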
  • the system provides a request for an update to the user ( 540 ).
  • the system provides a knowledge panel, e.g., as illustrated in FIG. 1 , that asks a user if an element of information about a particular entity is correct or incorrect, or invites the user to provide such information in the first instance.
  • the system receives an update from the user on the topic ( 550 ).
  • the system can receive an update submitted by the user through a knowledge panel interface.
  • the system updates a knowledge base with the received update ( 560 ). Because the system previously evaluated the likelihood that the user would provide accurate data for the topic, the system need not again evaluate information in the user's profile to determine a likelihood that the update is accurate. However, the system may still compare the updated data to other sources of data, e.g., updates provided by one or more other users as described above with reference to FIG. 3 .
  • Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
  • Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, data processing apparatus.
  • the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus.
  • the computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
  • data processing apparatus refers to data processing hardware and encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
  • the apparatus can also be or further include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
  • the apparatus can optionally include, in addition to hardware, code that creates an execution environment for computer programs, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
  • a computer program which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code, can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program may, but need not, correspond to a file in a file system.
  • a program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, sub-programs, or portions of code.
  • a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • the processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
  • Computers suitable for the execution of a computer program include, by way of example, general or special purpose microprocessors or both, or any other kind of central processing unit.
  • a central processing unit will receive instructions and data from a read-only memory or a random access memory or both.
  • the essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks.
  • a computer need not have such devices.
  • a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device, e.g., a universal serial bus (USB) flash drive, to name just a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
  • the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
  • a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer.
  • Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
  • a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to
  • Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components.
  • the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (LAN) and a wide area network (WAN), e.g., the Internet.
  • the computing system can include clients and servers.
  • a client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
  • a server transmits data, e.g., an HTML page, to a user device, e.g., for purposes of displaying data to and receiving user input from a user interacting with the user device, which acts as a client.
  • Data generated at the user device, e.g., a result of the user interaction, can be received from the user device at the server.

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicting the accuracy of user submissions. One of the methods includes receiving, from a user, an update to an attribute of an entity. If the user is determined to be reliable based on user profile data of the user, the knowledge base is updated with the update to the attribute of the entity.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This is a continuation of U.S. application Ser. No. 16/291,589, filed on Mar. 4, 2019, which is a continuation of U.S. application Ser. No. 13/906,220, filed on May 30, 2013 (now U.S. Pat. No. 10,223,637), the disclosures of which are considered part of and are incorporated by reference in the disclosure of this application in their entireties.
  • BACKGROUND
  • This specification relates to determining whether a submission of data by a user is accurate.
  • A search system can provide one or more knowledge panels in response to a received search query. A knowledge panel is a user interface element that provides a collection of information or other content related to a particular entity referenced by the search query. For example, the entity may be a person, place, country, landmark, animal, historical event, organization, business, sports team, sporting event, movie, song, album, game, work of art, or any other entity.
  • In general, a knowledge panel provides a summary of information about the entity. For example, a knowledge panel for a famous singer may include the name of the singer, an image of the singer, a description of the singer, one or more facts about the singer, content that identifies songs and albums recorded by the singer, and/or links to searches related to the singer. Other types of information and content can also be presented in the knowledge panel. Information presented in a knowledge panel can include content obtained from multiple disparate sources, e.g., multiple different web pages accessible over the Internet.
  • A search system can maintain a knowledge base that stores information about various entities. The system can assign a unique entity identifier to each entity. The system can also assign one or more text string aliases to a particular entity. For example, the Statue of Liberty can be associated with aliases “the Statue of Liberty” and “Lady Liberty.” Aliases need not be unique among entities. For example, “jaguar” can be an alias both for an animal and for a car manufacturer.
  • The system can also store information about an entity's relationship to other entities. For example, the system can define a “located in:” relationship between two entities to reflect, for example, that the Statue of Liberty is located in New York City. In some implementations, the system stores relationships between entities in a representation of a graph in which nodes represent distinct entities and links between nodes represent relationships between the entities. In this example, the system could maintain a node representing the Statue of Liberty, a node representing New York City, and a link between the nodes to represent that the Statue of Liberty is located in New York City.
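  • The entity, alias, and relationship structure described above can be sketched as a small in-memory graph. The class `KnowledgeBase`, its method names, and the entity identifiers such as "e1" are hypothetical; the specification does not prescribe a storage format.

```python
from collections import defaultdict


class KnowledgeBase:
    """Toy knowledge base: entities, non-unique aliases, and labeled links."""

    def __init__(self):
        self.names = {}                   # entity_id -> canonical name
        self.aliases = defaultdict(set)   # lowercased alias -> entity ids
        self.edges = defaultdict(set)     # entity_id -> {(relation, entity_id)}

    def add_entity(self, entity_id, name, aliases=()):
        self.names[entity_id] = name
        # The canonical name is treated as an alias as well.
        for alias in (name, *aliases):
            self.aliases[alias.lower()].add(entity_id)

    def relate(self, subject_id, relation, object_id):
        # A link between two nodes represents a relationship.
        self.edges[subject_id].add((relation, object_id))

    def entities_for(self, alias):
        # Aliases need not be unique, so a set of entity ids is returned.
        return set(self.aliases.get(alias.lower(), set()))


kb = KnowledgeBase()
kb.add_entity("e1", "Statue of Liberty", ["the Statue of Liberty", "Lady Liberty"])
kb.add_entity("e2", "New York City")
kb.add_entity("e3", "Jaguar (animal)", ["jaguar"])
kb.add_entity("e4", "Jaguar (car manufacturer)", ["jaguar"])
kb.relate("e1", "located in", "e2")
```

  • Looking up "jaguar" in this sketch returns both "e3" and "e4", which mirrors the point that one alias can refer to several entities.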
  • SUMMARY
  • This specification describes how a system can compute a likelihood that a user will provide accurate updates to a knowledge base based on information in the user's profile. In general, the system can train a model using previous knowledge base submissions by users and use the model to predict whether a particular user will provide accurate updates.
  • In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving, from a user, an update to an attribute of an entity related to a topic; obtaining user profile data of the user; determining from the user profile data that the user is reliable relative to the topic; and in response to determining that the user is reliable relative to the topic, updating a knowledge base with the update to the attribute of the entity. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods. A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.
  • The foregoing and other embodiments can each optionally include one or more of the following features, alone or in combination. Determining that the user is reliable relative to the topic comprises computing, using the user profile data as input to a user model, a likelihood that an update from the user to an entity related to the topic is accurate; and determining that the computed likelihood satisfies a threshold. The user model is trained using training examples that represent previously submitted updates to the knowledge base by users and whether the previously submitted updates were accurate. Each training example includes information from a user profile of a user that submitted the corresponding update. The information from the user profile includes one or more statistics describing the accuracy of knowledge base submissions by the user or a topic of interest and a level of expertise for the topic of interest. The information from the user profile includes information about subsystems accessed by the user. The update to the attribute of the entity includes an update to a value of an existing attribute of the entity stored in the knowledge base. The update to the attribute of the entity includes a new attribute of the entity that was previously not stored in the knowledge base. The threshold is different for an existing attribute of the entity than for a new attribute for the entity. The updated entity attribute in the knowledge base is provided in response to search requests by users.
  • In general, another innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving, from a user, a search request related to a topic; obtaining user profile data of the user; determining from the user profile data that the user is reliable relative to the topic; in response to determining that the user is reliable relative to the topic, providing to the user a request for an update to an attribute of an entity related to the topic; receiving, from the user, an update to the attribute of the entity related to the topic; and updating a knowledge base with the update to the attribute of the entity. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
  • The foregoing and other embodiments can each optionally include one or more of the following features, alone or in combination. Determining that the user is reliable relative to the topic comprises computing, using the user profile data as input to a user model, a likelihood that an update from the user to an entity related to the topic is accurate; and determining that the computed likelihood satisfies a threshold. The user model is trained using training examples that represent previously submitted updates to the knowledge base by users and whether the previously submitted updates were accurate. Each training example includes information from a user profile of a user that submitted the corresponding update. The information from the user profile includes one or more statistics describing the accuracy of knowledge base submissions by the user or a topic of interest and a level of expertise for the topic of interest. The information from the user profile includes information about subsystems accessed by the user. Providing to the user a request for an update to an attribute of an entity related to the topic comprises providing a knowledge panel that presents one or more items of information about the entity and requests the update to the attribute of the entity. Receiving, from a user, an update to an attribute of an entity related to a topic comprises receiving the update to the attribute of the entity through a user interface control of the knowledge panel.
  • The subject matter described in this specification can be implemented in particular embodiments so as to realize one or more of the following advantages. A search system can automatically determine whether a submission from a user is likely to be accurate based on the accuracy of previous submissions received from the user or other indications of user trustworthiness. This can reduce the amount of erroneous or spam inputs to the knowledge base. A search system is more likely to receive accurate data updates for a particular topic by asking users who are interested in the particular topic to provide updates on the topic. This can reduce the likelihood that a user will be annoyed by being asked to provide an update and can increase the likelihood of receiving a response from a user.
  • The details of one or more embodiments of the subject matter of this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates an example search results page that includes a knowledge panel.
  • FIG. 2 is a diagram of an example system.
  • FIG. 3 is a flow chart of an example process for training a user model.
  • FIG. 4 is a flow chart of an example process for computing the likelihood that a user will provide an accurate update.
  • FIG. 5 is a flow chart of an example process for asking particular users to update knowledge base information.
  • Like reference numbers and designations in the various drawings indicate like elements.
  • DETAILED DESCRIPTION
  • Some search systems allow users to update data stored in a knowledge base. In general, an update to the knowledge base updates an [attribute, value] pair associated with an entity. For example, a user can submit an update to the knowledge base for an existing attribute “date of birth,” where the updated value is “Feb. 12, 1809.” A user may also submit an update to the knowledge base for a new attribute associated with a person entity, e.g., “Favorite food,” and a corresponding new value, e.g., “pizza,” providing both the new attribute and the new value.
  • However, in some cases it may be difficult for the search system to determine whether data entered by a user is correct, as the user may intentionally or unintentionally enter incorrect data.
  • A search system can create a user model to determine whether a particular user is likely to provide accurate updates to the knowledge base. The search system can use machine learning to generate the user model based on the accuracy of updates to the knowledge base previously entered by the user and profile information associated with the user. Once the user model is developed, the search system can use the user model to predict the accuracy of knowledge base updates submitted by the user, determine whether to update the knowledge base with the submitted data, and determine whether or not to ask the user for specific data or verification of data in the knowledge base.
  • FIG. 1 illustrates an example search results page 100 that includes a knowledge panel 130. A user can submit the query 102 to a search system through a graphical user interface of a software application, e.g., a web browser, or through a user interface of some other software application installed on a user device, e.g., a spoken query issued through a speech recognition application installed on a mobile user device. In response to receiving the query 102, the search system can provide a search results page 100 in a form that can be presented on the user device. For example, the search results page 100 can be provided as a markup language document, e.g., a HyperText Markup Language document, and the user device can render the document, e.g., using a web browser, in order to present the search results page 100 on a display of the user device.
  • The search results page 100 includes three search results 122 a-c that the search system has obtained in response to the query 102. Each of the search results 122 a-c includes a title, a display link, and a text snippet. Each of the search results 122 a-c is also linked to a respective resource, e.g., a web page at a location indicated by the display link. User selection of a search result will cause the application to navigate to the linked resource. The search results page 100 also includes an indicator 110 that the user is currently logged in.
  • The search results page 100 also includes a knowledge panel 130 corresponding to an entity with an alias corresponding to the search query 102. In this example, the entity is Abraham Lincoln.
  • The knowledge panel 130 includes various items of information about Abraham Lincoln. The knowledge panel 130 includes an entity name 132, a picture of the entity 133, items of information 134, including an occupation, a date of birth, a date of death, and a spouse's name.
  • The search system can provide the knowledge panel 130 as an interface for the user to update one or more items of information maintained by the search system in the knowledge base. For example, the search system can invite the user to correct a specific one of the items of information 134, or the search system can, upon user selection of any of the items of information 134, provide an editable text-input field 136 for editing the item of information. For example, upon user selection of the “Spouse” field, the search system can provide editable text-input field 136 through which the user can edit that particular item of information.
  • After making changes to the information in the knowledge panel 130, the user can submit the information, e.g., by selecting a “Submit” user interface control 138. The system can then evaluate the submitted information based on one or more criteria, e.g., the user's reliability or data submitted by other users. If the system determines that the update is likely to be accurate, the system can update the knowledge base with the submitted information. In this way, the system can use the knowledge panel 130 as an efficient way to ask for updates to information maintained by the search system from one place and in-line, e.g., without having to navigate away from the search results page 100.
  • FIG. 2 is a diagram of an example system 200. In general, the system includes a user device 210 coupled to a search system 230 over a network 220. The search system 230 is an example of an information retrieval system in which the systems, components, and techniques described below can be implemented.
  • In operation, the user device 210 transmits a query 212 to the search system 230, e.g., over the network 220. The query 212 includes one or more terms and can include other information, for example, a location and a type of the user device 210. The search system 230 generates a response, generally in the form of a search results page 216. The search results page 216 can include search results 213 that the search system 230 has identified as being responsive to the query 212.
  • If the search system 230 determines that the user is likely to know and provide accurate information about a particular entity, e.g., an entity relevant to a user's field of expertise, the search system 230 can provide a data request 214 that requests an update to a particular item of information about the entity in the knowledge base 262. In some implementations, the data request 214 can be included in a knowledge panel for the entity, which can be used as an interface for the user to update the requested items of information. The search system 230 transmits the search results page 216 over the network 220 back to the user device 210 for presentation to a user.
  • The search system 230 can receive updated information 218 that is either initiated by the user or initiated by a data request 214. The updated information can be received, for example, through a knowledge panel provided on the search results page 216. The search system 230 can then use the updated information 218 to update the knowledge base 262.
  • The user device 210 can be any appropriate type of computing device, e.g., mobile phone, tablet computer, notebook computer, music player, e-book reader, laptop or desktop computer, PDA (personal digital assistant), smart phone, a server, or other stationary or portable device, that includes one or more processors 208 for executing program instructions and memory 206, e.g., random access memory (RAM). The user device 210 can include non-volatile computer readable media that store software applications, e.g., a browser or layout engine, an input device, e.g., a keyboard or mouse, a communication interface, and a display device.
  • The network 220 can be, for example, a wireless cellular network, a wireless local area network (WLAN) or Wi-Fi network, a mobile telephone network or other telecommunications network, a wired Ethernet network, a private network such as an intranet, a public network such as the Internet, or any appropriate combination of such networks.
  • The search system 230 can be implemented as computer programs installed on one or more computers in one or more locations that are coupled to each other through a network. The search system 230 includes a search system front end 240, a search engine 250, a data request module 260, and a machine learning module 270. The computing device or devices that implement the search system front end 240, the search engine 250, the data request module 260, and the machine learning module 270 may include similar components.
  • The search system 230 includes a user database 272 that stores information about users who access the search system 230. For example, the user database 272 may include a user profile for each of the users who access the search system. For users who are registered users of the search system 230, a user profile can include previous submissions by the user to the knowledge base and whether such submissions were accurate or not. A user profile for registered or unregistered users may include: user interactions with subsystems of the search system, e.g., a web search system, an image search system, a map system, an email system, a social network system, a blogging system, or a shopping system, to name just a few; topics of interest; and an indication of a level of expertise of the user for each of the topics of interest, e.g., novice or expert. The topics of interest and levels of expertise may include user-provided data or system-generated data based on a user's interaction with the search system. For example, the search system may determine that a specific user is interested in French restaurants based on a search history of the specific user or search results selected by the specific user. The search system may then add “restaurants” to the user's topics of interest.
  • In some implementations, users are distinguished by the IP addresses of the user devices used in performing the activities. In some implementations, activities are recorded by the interactive system involved in the activity. In some implementations, activity information is also, or alternatively, collected with the consent of the user by an application, e.g., a web browser toolbar, running on the user's device.
  • Where personal information about users may be collected or used, users may be given an opportunity to control whether the personal information about the users is collected. In addition, certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed.
  • In general, the search system front end 240 receives the query 212 from the user device 210 and routes the query 212 to the search engine 250 and the data request module 260. The search system front end 240 also provides the resulting search results page 216 that includes the search results 213 and the knowledge panel 214 to the user device 210. In doing so, the search system front end 240 acts as a gateway, or interface, between user devices and the search system 230.
  • The search engine 250 receives the query 212 and generates search results 213 that are responsive to the query. The search engine 250 will generally include an indexing engine for indexing resources in a collection of resources. For example, the search engine 250 can index web pages found in a collection of web pages, e.g., web pages on the Internet. A collection of resources indexed by the indexing engine may, but need not, be stored within search system 230, e.g., in index database 252. The search engine 250 can rank the search results 213 using conventional methods and route the ranked search results 213 back to search system front end 240 for inclusion in the search results page 216.
  • The data request module 260 receives the query 212 and determines whether the search system 230 should provide a knowledge panel in a response to the query as well as whether to ask the user through a data request 214 to update information about a particular entity. In some implementations, the data request 214 is presented through a knowledge panel on the search results page 216.
  • The data request module 260 can determine whether the search system 230 should provide a knowledge panel using a data structure of the knowledge base 262 that maps an alias to one or more entities, e.g., an entity alias index. For example, the alias “Bush” can be mapped to a set of entities having that alias, e.g., the entity “George W. Bush,” the entity “George H. W. Bush,” the entity for the rock band “Bush,” and the entity for a category of plants having that alias. The entity alias index may also include a score for each entity that represents a likelihood that the alias refers to that particular entity. The data request module 260 can use some or all of the query 212 as input to the entity alias index. The data request module 260 can use a returned entity for the query to present an entity in a knowledge panel. The data request module 260 can also use a returned entity to identify a topic of the query, which can be used to compute a likelihood that the user will provide accurate updates for the topic.
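  • As a minimal sketch of such an entity alias index, the Python fragment below maps a lower-cased alias to scored candidate entities and returns the highest-scoring one. The names `ENTITY_ALIAS_INDEX`, `lookup_entities`, and `best_entity`, the entity labels, and all score values are hypothetical and chosen only for illustration; the specification does not prescribe a particular data structure or scoring scheme.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical entity alias index: alias -> list of (entity, score) pairs.
# The scores are made-up illustrative values, not data from any knowledge base.
ENTITY_ALIAS_INDEX: Dict[str, List[Tuple[str, float]]] = {
    "bush": [
        ("George W. Bush", 0.45),
        ("George H. W. Bush", 0.30),
        ("Bush (rock band)", 0.15),
        ("Bush (plant)", 0.10),
    ],
}


def lookup_entities(query: str) -> List[Tuple[str, float]]:
    """Return candidate entities for a query, highest score first."""
    candidates = ENTITY_ALIAS_INDEX.get(query.strip().lower(), [])
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


def best_entity(query: str) -> Optional[str]:
    """Return the most likely entity for the query, or None if no alias matches."""
    candidates = lookup_entities(query)
    return candidates[0][0] if candidates else None
```

  • The entity returned by `best_entity` could then serve both as the subject of a knowledge panel and as the topic passed to the user model.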
  • The data request module 260 evaluates the accuracy of the updated information 218 to determine whether to update the knowledge base 262 with the updated information 218. The data request module 260 can determine both whether to provide a data request 214 as well as the accuracy of the updated information 218 by using a user model 217 generated by the machine learning module 270.
  • The machine learning module 270 receives the user profile data from user database 272, and generates a user model that predicts the likelihood that a particular user will submit accurate information for the knowledge base. The user model can be trained using one or more items of information in the user profiles, e.g., previous knowledge base submissions on various topics, the accuracy of such submissions, topics of interest of the users, and a level of expertise of the users for each topic of interest.
  • FIG. 3 is a flow chart of an example process for training a user model. The system receives previous knowledge base submissions by users and user profile data of the submitting users. The system then trains a user model that can be used to compute a likelihood that a user having particular user profile data will provide an accurate update for a topic. The process can be implemented by one or more computer programs installed on one or more computers. The process will be described as being performed by a system of one or more computers, e.g., the machine learning module 270 of FIG. 2.
  • The system obtains previous knowledge base submissions (310). The system can train the user model using training data that includes previous knowledge base submissions on various topics along with information from profiles of the users who provided the submissions. The training data includes training examples that each represents a previous user submission and one or more features of that particular submission. The features can include a topic of the previous user submission and one or more items of information from the user's profile, e.g., a measure of the accuracy of the user's other knowledge base submissions, topics of interest in the user's profile, a level of expertise for each topic of interest, and other subsystems used by the user, for example.
  • Each training example can be labeled to indicate whether the previous submission by a user was accurate, e.g., with a score ranging from 0 to 1 or with a binary classification as “good”/“bad,” or “reliable”/“unreliable.” In some implementations, the training data is hand-labeled by administrators of the knowledge base.
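  • One possible in-memory representation of a labeled training example is sketched below in Python. The class `TrainingExample`, its field names, and the particular feature encoding in `to_feature_vector` are assumptions made for illustration; the specification lists the kinds of features but not how they are encoded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrainingExample:
    """One previous knowledge base submission plus profile features (hypothetical layout)."""
    topic: str                          # topic of the previous submission
    prior_accuracy: float               # accuracy measure of the user's other submissions, 0..1
    expertise_by_topic: Dict[str, str]  # e.g. {"restaurants": "expert"}
    subsystems_used: List[str]          # other subsystems the user has accessed
    label: float                        # 1.0 = accurate, 0.0 = inaccurate (or a score in between)


def to_feature_vector(example: TrainingExample) -> List[float]:
    """Encode an example as numbers; this encoding is an assumption, not the patent's."""
    expertise = example.expertise_by_topic.get(example.topic)
    return [
        example.prior_accuracy,
        1.0 if example.topic in example.expertise_by_topic else 0.0,  # topic is of interest
        1.0 if expertise == "expert" else 0.0,                        # user is an expert on it
        float(len(set(example.subsystems_used))),                     # distinct subsystems used
    ]
```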
  • The system obtains statistics of a previous knowledge base submission (320). One example feature of the training examples is a measure of the accuracy of previous knowledge base submissions of the user. For example, the system can select the previous knowledge base submissions associated with the user's profile and determine the accuracy of those submissions, e.g., based on the submissions being added to the knowledge base or updates being later changed back, e.g., by knowledge base curators, to a previous version. The accuracy of a particular submission can be determined according to whether the submission was added to the knowledge base, for example, by considering a revision history of the knowledge base after the user's submission. The accuracy of a particular submission can also be determined by verification by other knowledge base users, by an expert, or by an administrator of the search system.
  • The system can compute statistics to indicate the accuracy of the previous knowledge base submissions that the user has made, for example, a ratio of correct to incorrect submissions. For example, the system can consider a first user with a high ratio of correct to incorrect submissions to be more reliable than a second user with a lower ratio of correct to incorrect submissions.
  • Other values may be used to represent the accuracy of previous submissions that the user has made to the knowledge base.
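  • The statistics described above can be sketched as follows. The function names and the default value returned for users without history are assumptions; `fraction_correct` is shown as one example of an alternative value that stays defined when a user has no incorrect submissions.

```python
from typing import List, Optional


def correct_to_incorrect_ratio(outcomes: List[bool]) -> Optional[float]:
    """Ratio of correct to incorrect submissions; None when there are no incorrect ones."""
    correct = sum(1 for ok in outcomes if ok)
    incorrect = len(outcomes) - correct
    if incorrect == 0:
        return None  # ratio is undefined (division by zero)
    return correct / incorrect


def fraction_correct(outcomes: List[bool], default: float = 0.5) -> float:
    """Alternative measure in [0, 1]; `default` (an assumed value) covers users with no history."""
    if not outcomes:
        return default
    return sum(1 for ok in outcomes if ok) / len(outcomes)
```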
  • The system obtains topics of interest and levels of expertise (330). Another example feature for the training examples includes a level of expertise for each of the topics of interest in the user's profile. In some implementations, the system generates the levels of expertise automatically. For example, the system can determine a level of expertise based on input received from a user and based on the types of documents that the user subsequently accesses. For example, the search system can determine that a user who views highly technical documents may be an expert in a particular field, e.g., medicine or technology. Conversely, the search system can determine that a user who views only general documents associated with the same field is a novice.
  • The search system can use any appropriate algorithm to determine a level of expertise for a specific user with respect to a specific topic of interest. For example, the search system may use machine learning to create an expertise model to determine a level of expertise for topics associated with profiles of users who access the knowledge base. The expertise model can be trained by using the measure of language sophistication on resources accessed by users as input in order to classify resources as those that would be visited by experts or novices on a particular topic. The system can then use resources visited by a user to determine whether the user is an expert or a novice for the topic.
  • The system obtains information about other subsystems accessed by the user (340). Another example feature for the training examples includes information about other subsystems of the search system accessed by a user. In general, a higher number of subsystems accessed by the same user is a signal of legitimacy for the associated user. In contrast, a user who has accessed only one subsystem is more likely to be suspect. Thus, a user profile that is associated with the search engine and a social networking website will generally be more likely to have a high predicted accuracy than a user profile that is only associated with the search engine, assuming all other scoring factors are the same.
  • The system trains the user model (350). The machine learning module 270 uses the labeled training examples to train the user model. The module can be implemented with any appropriate supervised learning algorithm that uses labeled training data, e.g., a support vector machine, logistic regression, or a nearest-neighbor classifier.
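  • As one illustration of the logistic regression option, the sketch below trains a small model with plain batch gradient descent using only the Python standard library. The function names, the learning rate, and the number of epochs are assumptions chosen for a toy data set; a production system would more likely rely on an established machine learning library.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_likelihood(weights: Sequence[float], x: Sequence[float]) -> float:
    """Likelihood that a submission with features x is accurate; weights[-1] is the bias."""
    z = weights[-1] + sum(w * xj for w, xj in zip(weights, x))
    return sigmoid(z)


def train_user_model(
    features: Sequence[Sequence[float]],
    labels: Sequence[float],
    learning_rate: float = 0.5,  # assumed value, suited to small feature magnitudes
    epochs: int = 5000,          # assumed value
) -> List[float]:
    """Fit logistic regression weights by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * (n_features + 1)  # one weight per feature plus a bias term
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(features, labels):
            error = predict_likelihood(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += error * xj
            grad[-1] += error
        for j in range(len(weights)):
            weights[j] -= learning_rate * grad[j] / len(features)
    return weights
```

  • Here each feature vector could hold, for example, a prior accuracy measure and a count of subsystems accessed, with labels of 1.0 for accurate and 0.0 for inaccurate submissions.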
  • In some implementations, the machine learning module performs active learning and updates the user model as the knowledge base receives additional data from users. For example, the machine learning module updates the user model or creates a new user model according to a schedule, e.g., monthly or yearly, or at another predetermined time, e.g., one specified by an administrator.
  • FIG. 4 is a flow chart of an example process for computing the likelihood that a user will provide an accurate update. In general, the system receives a data update from a user on a particular topic. The system can then determine a likelihood that the user will provide accurate data on the particular topic using information in the user's profile. The process can be implemented by one or more computer programs installed on one or more computers. The process will be described as being performed by a system of one or more computers, e.g., the data request module 260 of FIG. 2.
  • The system receives an update from a user for a topic (410). For example, the system can receive an update from a user through a knowledge panel provided as part of a search results page, as illustrated in FIG. 1. The system can determine the topic, for example, by determining one or more entities for which a query submitted by the user is an alias. The system can also receive an update from a user who is browsing and submitting updates to a knowledge base through a direct interface to the knowledge base, in which case the topic can be determined from an entity associated with the update.
  • The system obtains user profile data of the user (420).
  • The system determines that the user is reliable relative to the topic (430). A user can be considered reliable relative to a topic if the system determines that the user is likely to provide updates to the knowledge base that are accurate. The system can use the obtained user profile data of the user and the topic of the update as input to a user model to determine the likelihood that an update from the user on the topic is accurate. Generally, if the determined likelihood satisfies a threshold, the system can determine that the user is reliable relative to the topic. The system can then update the knowledge base accordingly without further intervention or inspection by knowledge base administrators.
  • For example, the system can compute features from the user's profile data, including the user's previous knowledge base submissions, topics of interest, etc., as described above. The system can then use the features as input to the user model to compute a likelihood that the user's update for the topic is accurate.
  • Some users may not have any information associated with their profiles. Thus, in some implementations, if the system determines that there is no information available about the user other than the current submission, the system assigns the user a default likelihood of providing an accurate update for the topic.
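  • The reliability check and the default for empty profiles might look like the following sketch. The constants `DEFAULT_LIKELIHOOD` and `RELIABILITY_THRESHOLD`, the profile representation as a mapping, and the reading of "satisfies a threshold" as "greater than or equal to" are all assumptions made for illustration.

```python
from typing import Callable, Mapping, Optional

DEFAULT_LIKELIHOOD = 0.5     # assumed default for users with no profile information
RELIABILITY_THRESHOLD = 0.8  # assumed threshold; the specification gives no value

# A user model is treated here as any callable (profile, topic) -> likelihood in [0, 1].
UserModel = Callable[[Mapping[str, object], str], float]


def likelihood_of_accurate_update(
    profile: Optional[Mapping[str, object]],
    topic: str,
    user_model: UserModel,
) -> float:
    """Use the user model when profile data exists, else fall back to the default."""
    if not profile:
        return DEFAULT_LIKELIHOOD
    return user_model(profile, topic)


def is_reliable(likelihood: float, threshold: float = RELIABILITY_THRESHOLD) -> bool:
    """Interpret 'satisfies a threshold' as likelihood >= threshold (an assumption)."""
    return likelihood >= threshold
```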
  • Alternatively, if the system determines that the likelihood does not satisfy the threshold, the system can seek to verify the submission using input from one or more other users before updating the knowledge base. For example, the system can wait for additional submissions by other users and compute an aggregate likelihood that a particular update to an attribute is reliable. Once a cumulative likelihood of the submissions satisfies a threshold, the system can then determine that the knowledge base should be updated with a value provided by the user submissions.
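  • The specification does not say how the aggregate likelihood is computed. One simple possibility, shown below purely as an assumption, is a noisy-OR combination that treats submissions of the same value by different users as independent: the value is considered wrong only if every submitter is wrong. The function names and the threshold value are hypothetical.

```python
from typing import Iterable


def aggregate_likelihood(likelihoods: Iterable[float]) -> float:
    """Noisy-OR combination (assumed): 1 minus the probability that all submitters are wrong."""
    p_all_wrong = 1.0
    for p in likelihoods:
        p_all_wrong *= 1.0 - p
    return 1.0 - p_all_wrong


def should_update(likelihoods: Iterable[float], threshold: float = 0.9) -> bool:
    """True once the cumulative likelihood of agreeing submissions reaches the threshold."""
    return aggregate_likelihood(likelihoods) >= threshold
```

  • Under this assumption, three users who each have a likelihood of 0.6 would jointly clear a 0.9 threshold that none of them clears alone. The independence assumption is optimistic if submitters copy one another.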
  • If the system receives conflicting updates from two or more different users, the system can weight each of the responses to determine a response that has the highest probability of being accurate. For example, when the system receives an update to a phone number of a restaurant from five different users, the system can determine weights for the responses based on the computed likelihood associated with each of the users. Thus, updates from users with a higher likelihood of accuracy can outweigh updates from users with a lower likelihood of accuracy.
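  • A weighted vote of this kind can be sketched as below, where each submitted value accumulates the likelihoods of the users who submitted it. Summing likelihoods as weights is one assumed weighting scheme among many, and the function name and phone numbers in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def resolve_conflict(submissions: Iterable[Tuple[str, float]]) -> Tuple[str, float]:
    """Pick the value with the largest total weight.

    Each submission is (value, likelihood of the submitting user). Returns the
    winning value and its share of the total weight.
    """
    totals: Dict[str, float] = defaultdict(float)
    for value, likelihood in submissions:
        totals[value] += likelihood
    if not totals:
        raise ValueError("no submissions to resolve")
    best = max(totals, key=totals.get)
    total_weight = sum(totals.values())
    share = totals[best] / total_weight if total_weight > 0 else 0.0
    return best, share
```

  • For instance, if two users with likelihoods 0.9 and 0.6 submit "555-0100" and three users with likelihoods 0.3, 0.2, and 0.4 submit "555-0199", the first value wins with a total weight of 1.5 against 0.9, even though fewer users submitted it.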
  • In some implementations, if the search system receives a submission from a user who has a low computed likelihood of providing a reliable update, the search system discards the submission and does not update the knowledge base. The search system may also maintain records of such low-likelihood submissions for aggregation with previous and future submissions by other users.
  • The system can also use different thresholds for updates to existing attributes and new attributes. For example, if the user submits a new attribute, the system can require a higher likelihood that the user will provide accurate updates for the topic than it would if the attribute were an existing attribute for the entity.
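  • The use of separate thresholds can be illustrated with the short sketch below. Both threshold values and the representation of an entity as a dictionary of attribute-value pairs are assumptions.

```python
from typing import Dict

EXISTING_ATTRIBUTE_THRESHOLD = 0.80  # assumed value
NEW_ATTRIBUTE_THRESHOLD = 0.95       # assumed value; stricter for new attributes


def threshold_for(entity_attributes: Dict[str, str], attribute: str) -> float:
    """Return the stricter threshold when the attribute does not yet exist for the entity."""
    if attribute in entity_attributes:
        return EXISTING_ATTRIBUTE_THRESHOLD
    return NEW_ATTRIBUTE_THRESHOLD


def accept_update(
    entity_attributes: Dict[str, str], attribute: str, value: str, likelihood: float
) -> bool:
    """Apply the update in place if the likelihood meets the applicable threshold."""
    if likelihood >= threshold_for(entity_attributes, attribute):
        entity_attributes[attribute] = value
        return True
    return False
```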
  • The system updates a knowledge base with the received data update (440). After determining that the knowledge base should be updated, the system can change the value of the attribute as provided by the user. Generally, updating the attribute requires no confirmation by knowledge base administrators and will cause other users that subsequently access knowledge base information, e.g., by information presented in a knowledge panel, to be provided with the updated information.
  • FIG. 5 is a flow chart of an example process for asking particular users to update knowledge base information. In general, the system receives a search request from a user and determines whether to ask the user to provide an update to an attribute of an entity in a knowledge base. The process can be implemented by one or more computer programs installed on one or more computers. The process will be described as being performed by a system of one or more computers, e.g., the search system 230 of FIG. 2.
  • The system receives a search request from a user on a particular topic (510). For example, the system can receive a search query from a user who is logged into the system. The system can then determine a topic from the search query, for example, by determining an entity for which the search query is an alias. The system can also receive other types of search requests and determine topics from the other types of search requests. For example, the system can receive, from a user, a request for news stories, map data, social networking data, or other requests for other types of data from one or more subsystems of the system.
  • The system obtains user profile data of the user (520).
  • The system determines that the user is reliable relative to the topic (530). The system can, for example, use information in the user profile data to compute features that can be used as input to the user model, as described in more detail above with reference to FIG. 3. The system can use the user model to compute a likelihood that the user will provide accurate data for the particular topic.
  • The system can also use the user model to determine which users to provide questions to and when to provide the questions. For example, the system can identify multiple users whose search request is relevant to a particular topic or whose recent search history is relevant to a particular topic. The system can then rank the users according to their respective predicted likelihoods of providing accurate updates to entities related to the particular topic. The system can then choose one or more highest-ranking users to ask for updates.
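  • Ranking candidate users by predicted likelihood can be sketched as follows. The function name and the `max_users` and `min_likelihood` parameters are hypothetical; the input is assumed to be a mapping from a user identifier to the likelihood already computed by the user model for the topic.

```python
from typing import Dict, List


def choose_users_to_ask(
    likelihood_by_user: Dict[str, float],
    max_users: int = 1,
    min_likelihood: float = 0.0,
) -> List[str]:
    """Return up to max_users user ids, highest predicted likelihood first."""
    ranked = sorted(likelihood_by_user.items(), key=lambda item: item[1], reverse=True)
    return [user for user, p in ranked if p >= min_likelihood][:max_users]
```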
  • The system may also consider a time of day of the received request. For example, the system may ask users for updates only during each user's non-working hours, according to the user's local time. Thus, the system can highly rank those users whose request was received during non-working hours for the geographic region from which the request was received.
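  • A time-of-day check of this kind might be sketched as below. The 9:00 to 17:00 weekday working window, the treatment of weekends as non-working time, and the use of a fixed UTC offset to stand in for the user's geographic region are all assumptions; a real system would need region-specific calendars and time zone rules.

```python
from datetime import datetime, time, timedelta, timezone

WORK_START = time(9, 0)   # assumed start of working hours
WORK_END = time(17, 0)    # assumed end of working hours


def is_non_working_hours(request_time: datetime, utc_offset_hours: float) -> bool:
    """True if the request falls outside assumed working hours in the user's local time.

    request_time must be timezone-aware; utc_offset_hours approximates the
    user's region with a fixed offset from UTC.
    """
    local = request_time.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    if local.weekday() >= 5:  # Saturday or Sunday
        return True
    return not (WORK_START <= local.time() < WORK_END)
```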
  • The system provides a request for an update to the user (540). In some implementations, the system provides a knowledge panel, e.g., as illustrated in FIG. 1, that asks a user if an element of information about a particular entity is correct or incorrect, or invites the user to provide such information in the first instance.
  • The system receives an update from the user on the topic (550). For example, the system can receive an update submitted by the user through a knowledge panel interface.
  • The system updates a knowledge base with the received update (560). Because the system previously evaluated the likelihood that the user would provide accurate data for the topic, the system need not again evaluate information in the user's profile to determine a likelihood that the update is accurate. However, the system may still compare the updated data to other sources of data, e.g., updates provided by one or more other users as described above with reference to FIG. 3.
  • Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, data processing apparatus. Alternatively or in addition, the program instructions can be encoded on an artificially-generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus. The computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
  • The term “data processing apparatus” refers to data processing hardware and encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can also be or further include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit). The apparatus can optionally include, in addition to hardware, code that creates an execution environment for computer programs, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
  • A computer program, which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code, can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, sub-programs, or portions of code. A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • The processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
  • Computers suitable for the execution of a computer program include, by way of example, general or special purpose microprocessors or both, or any other kind of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device, e.g., a universal serial bus (USB) flash drive, to name just a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
  • To provide for interaction with a user, embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's device in response to requests received from the web browser.
  • Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (LAN) and a wide area network (WAN), e.g., the Internet.
  • The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits data, e.g., an HTML page, to a user device, e.g., for purposes of displaying data to and receiving user input from a user interacting with the user device, which acts as a client. Data generated at the user device, e.g., a result of the user interaction, can be received from the user device at the server.
  • While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any invention or on the scope of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
  • Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
  • Particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. For example, the actions recited in the claims can be performed in a different order and still achieve desirable results. As one example, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous.

Claims (20)

1. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
maintaining, by the system, a knowledge base accessible by multiple users, wherein the knowledge base comprises one or more attribute-value pairs for a plurality of entities;
obtaining, from a user having user profile data stored in the system, a value for an attribute of an entity of the plurality of entities, wherein the user profile data is not stored in the knowledge base;
obtaining the user profile data for the user, the user profile data including information representing other subsystems of the system accessed by the user;
computing a likelihood that the value is accurate using the information representing the other subsystems; and
updating the knowledge base with the value in response to determining that the likelihood satisfies a threshold.
2. The system of claim 1, wherein the likelihood that the value is accurate is output from a user reliability model, in response to an input of the information representing the other subsystems.
3. The system of claim 2, wherein the user reliability model is configured to consider users who access more subsystems of the system to be more reliable than users who access fewer subsystems of the system.
4. The system of claim 2, wherein the user reliability model is trained using training examples comprising respective subsystems of the system accessed by each user of a plurality of users and a measure of accuracy of previously submitted updates to the knowledge base by the each user.
5. The system of claim 1, wherein the other subsystems include an image search system, a map system, an email system, a social network system, a blogging system, or a shopping system.
6. The system of claim 1, wherein the other subsystems of the system do not include a search engine or the knowledge base.
7. The system of claim 1, wherein the updating of the knowledge base with the value comprises automatically updating the knowledge base without intervention or inspection by a knowledge base administrator.
8. A method performed by a search system comprising one or more computers, the method comprising:
maintaining, by the search system, a knowledge base accessible by multiple users, wherein the knowledge base comprises information about entities, the information about each entity of the entities being represented as one or more attribute-value pairs;
receiving, by the search system from a user having user profile data stored in the search system, an updated value of an attribute of an entity in the knowledge base, wherein the user profile data is not stored in the knowledge base;
obtaining the user profile data for the user, the user profile data including information representing other subsystems of the search system accessed by the user;
computing a likelihood that the updated value is accurate using the information representing the other subsystems of the search system; and
updating the knowledge base with the updated value in response to determining that the likelihood satisfies a threshold.
9. The method of claim 8, wherein the likelihood that the updated value is accurate is output from a user reliability model, in response to an input of the information representing the other subsystems.
10. The method of claim 9, wherein the user reliability model is configured to consider users who access more subsystems of the search system to be more reliable than users who access fewer subsystems of the search system.
11. The method of claim 9, wherein the user reliability model is trained using training examples comprising respective subsystems of the search system accessed by each user of a plurality of users and a measure of accuracy of previously submitted updates to the knowledge base by each user.
12. The method of claim 8, wherein the other subsystems include an image search system, a map system, an email system, a social network system, a blogging system, or a shopping system.
13. The method of claim 8, wherein the other subsystems of the search system do not include a search engine or the knowledge base.
14. The method of claim 8, wherein the updating of the knowledge base with the updated value comprises automatically updating the knowledge base without intervention or inspection by a knowledge base administrator.
15. The method of claim 8, wherein the receiving of the updated value in the knowledge base comprises receiving the updated value by a web search engine of the search system.
16. The method of claim 8, wherein the receiving of the updated value comprises receiving the updated value through a knowledge panel user interface provided by the web search engine in response to the user submitting a search query.
17. The method of claim 8, wherein the knowledge panel user interface presents one or more items of information about the entity in the knowledge base.
18. One or more non-transitory computer storage media encoded with computer program instructions that when executed by one or more computers of a search system cause the one or more computers to perform operations comprising:
maintaining, by the search system, a knowledge base accessible by multiple users, wherein the knowledge base comprises information about entities, the information about each entity of the entities being represented as one or more attribute-value pairs;
receiving, by the search system from a user having user profile data stored in the search system, a value of an attribute of an entity in the knowledge base, wherein the user profile data is not stored in the knowledge base;
obtaining the user profile data for the user, the user profile data including information representing other subsystems of the search system accessed by the user, wherein the other subsystems include an image search system, a map system, an email system, a social network system, a blogging system, or a shopping system;
computing a likelihood that the value is accurate using the information representing the other subsystems; and
updating the knowledge base with the updated value received from the user in response to determining that the likelihood satisfies a threshold.
19. The one or more non-transitory computer storage media of claim 18, wherein the likelihood that the value is accurate is output from a trained user reliability model, in response to an input of the information representing the other subsystems.
20. The one or more non-transitory computer storage media of claim 19, wherein the trained user reliability model is configured to consider users who access more subsystems of the search system to be more reliable than users who access fewer subsystems of the search system.
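
The operations recited in claims 1 through 4 can be illustrated with a minimal Python sketch. It is not the applicants' implementation. The claims do not name a model type, so the logistic model here is an assumption, as are the subsystem names (borrowed from the examples in claim 5), the 0.7 threshold, the training data, the example entity, and every identifier (`UserReliabilityModel`, `submit_value`, `profiles`, and so on).

```python
import math

# Hypothetical subsystem names, borrowed from the examples listed in claim 5.
SUBSYSTEMS = ["image_search", "maps", "email", "social", "blog", "shopping"]


def featurize(accessed):
    """Encode the subsystems a user accessed as a 0/1 feature vector."""
    return [1.0 if name in accessed else 0.0 for name in SUBSYSTEMS]


class UserReliabilityModel:
    """Toy logistic model (an assumption; the claims do not name a model type)."""

    def __init__(self):
        self.weights = [0.0] * len(SUBSYSTEMS)
        self.bias = 0.0

    def predict(self, accessed):
        """Return the estimated likelihood that a value from this user is accurate."""
        x = featurize(accessed)
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, examples, epochs=500, lr=0.5):
        """Train on (accessed_subsystems, past_accuracy) pairs, in the spirit of claim 4."""
        for _ in range(epochs):
            for accessed, accuracy in examples:
                x = featurize(accessed)
                error = self.predict(accessed) - accuracy  # gradient of the log loss
                self.bias -= lr * error
                self.weights = [w - lr * error * xi for w, xi in zip(self.weights, x)]


def submit_value(knowledge_base, model, profile, entity, attribute, value, threshold=0.7):
    """Update the knowledge base only if the likelihood satisfies the threshold."""
    likelihood = model.predict(profile["accessed_subsystems"])
    if likelihood >= threshold:
        knowledge_base.setdefault(entity, {})[attribute] = value
        return True
    return False


# Made-up training data: subsystems accessed and accuracy of earlier submissions.
training_examples = [
    ({"maps", "email", "social", "shopping"}, 0.9),
    ({"email"}, 0.3),
    (set(), 0.1),
]
model = UserReliabilityModel()
model.fit(training_examples)

# User profile data is stored apart from the knowledge base, as claim 1 requires.
profiles = {
    "alice": {"accessed_subsystems": {"maps", "email", "social", "shopping"}},
    "bob": {"accessed_subsystems": set()},
}
# The knowledge base holds attribute-value pairs per entity (illustrative data).
knowledge_base = {"Example Cafe": {"opening_hour": "09:00"}}

accepted = submit_value(knowledge_base, model, profiles["alice"],
                        "Example Cafe", "opening_hour", "08:00")
rejected = submit_value(knowledge_base, model, profiles["bob"],
                        "Example Cafe", "opening_hour", "23:00")
```

With this made-up training data, the submission from the user who accessed several subsystems is applied and the one from the user who accessed none is discarded, which mirrors the behavior described in claim 3. A real model would be trained on many users and could weight subsystems very differently.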
US18/064,657 2013-05-30 2022-12-12 Predicting accuracy of submitted data Pending US20230113420A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/064,657 US20230113420A1 (en) 2013-05-30 2022-12-12 Predicting accuracy of submitted data

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/906,220 US10223637B1 (en) 2013-05-30 2013-05-30 Predicting accuracy of submitted data
US16/291,589 US11526773B1 (en) 2013-05-30 2019-03-04 Predicting accuracy of submitted data
US18/064,657 US20230113420A1 (en) 2013-05-30 2022-12-12 Predicting accuracy of submitted data

Related Parent Applications (1)

Application Number Priority Date Filing Date Title
US16/291,589 Continuation US11526773B1 (en) 2013-05-30 2019-03-04 Predicting accuracy of submitted data

Publications (1)

Publication Number Publication Date
US20230113420A1 (en) 2023-04-13

Family

ID=65495739

Family Applications (3)

Application Number Priority Date Filing Date Title
US13/906,220 Active 2034-02-06 US10223637B1 (en) 2013-05-30 2013-05-30 Predicting accuracy of submitted data
US16/291,589 Active 2035-10-10 US11526773B1 (en) 2013-05-30 2019-03-04 Predicting accuracy of submitted data
US18/064,657 Pending US20230113420A1 (en) 2013-05-30 2022-12-12 Predicting accuracy of submitted data

Country Status (1)

Country Link
US (3) US10223637B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220353304A1 (en) * 2021-04-30 2022-11-03 Microsoft Technology Licensing, Llc Intelligent Agent For Auto-Summoning to Meetings

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160098667A1 (en) * 2014-10-07 2016-04-07 Salesforce.Com, Inc. Customizable skills database
US20170249388A1 (en) * 2016-02-26 2017-08-31 Microsoft Technology Licensing, Llc Expert Detection in Social Networks
US11321286B1 (en) * 2018-01-26 2022-05-03 Wells Fargo Bank, N.A. Systems and methods for data quality certification
US11514334B2 (en) * 2020-02-07 2022-11-29 International Business Machines Corporation Maintaining a knowledge database based on user interactions with a user interface
US20210326716A1 (en) * 2020-04-15 2021-10-21 Elsevier Inc. Targeted probing of memory networks for knowledge base construction
US11748439B2 (en) * 2020-05-04 2023-09-05 Big Idea Lab, Inc. Computer-aided methods and systems for distributed cognition of digital content comprised of knowledge objects

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220353304A1 (en) * 2021-04-30 2022-11-03 Microsoft Technology Licensing, Llc Intelligent Agent For Auto-Summoning to Meetings
US20220353306A1 (en) * 2021-04-30 2022-11-03 Microsoft Technology Licensing, Llc Intelligent agent for auto-summoning to meetings

Also Published As

Publication number Publication date
US11526773B1 (en) 2022-12-13
US10223637B1 (en) 2019-03-05

Legal Events

Date Code Title Description
AS Assignment

Owner name: GOOGLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CZUBA, KRZYSZTOF;GABRILOVICH, EVGENIY;REEL/FRAME:062133/0057

Effective date: 20130522

Owner name: GOOGLE LLC, CALIFORNIA

Free format text: ENTITY CONVERSION;ASSIGNOR:GOOGLE INC.;REEL/FRAME:062151/0935

Effective date: 20170929

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION