US20090248690A1 - System and method for determining preferences from information mashups - Google Patents



Publication number
US20090248690A1
Authority
US
United States
Prior art keywords
vote
preferences
information
swf
computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/195,126
Inventor
Varun Bhagwan
Tyrone Wilberforce Andre Grandison
Daniel Frederick Gruhl
Jan Hendrik Pieper
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US12/195,126 priority Critical patent/US20090248690A1/en
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BHAGWAN, VARUN, GRUHL, DANIEL FREDERICK, PIEPER, JAN HENDRIK, GRANDISON, TYRONE WILBERFORCE ANDRE
Publication of US20090248690A1 publication Critical patent/US20090248690A1/en
Abandoned legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques

Definitions

  • the present invention relates to information mashups, and in particular to a system and method for determining preferences from cross-modality information mashups.
  • a computer-implemented method for determining preferences from cross-modality information mashups includes receiving a social welfare function (SWF) and identifying two or more vote computing methods. For each of the two or more vote computing methods, the method uses the vote computing method to combine information on preferences into a combined list ranking the preferences.
  • the information is from a set of two or more sources. The set is heterogeneous in modality.
  • the method inputs the combined list into the SWF to compute a score.
  • the method outputs the combined list of the vote computing method associated with the highest score.
  • the set of two or more sources may include data from websites indicating preferences within a certain domain of interest.
  • the information from the set of two or more sources may include structured data from a first source and unstructured data from a second source.
  • the number of preferences being ranked may be at least an order of magnitude greater than the number of sources.
  • a computer program product for determining preferences from cross-modality information includes a computer readable medium and program instructions.
  • the program instructions include first program instructions to identify two or more vote computing methods and second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences.
  • the information is from a set of two or more sources.
  • the set is heterogeneous in modality.
  • the program instructions further include third program instructions to compute a score, and fourth program instructions to output the vote computing method associated with the highest score.
  • the program instructions may also include fifth program instructions to output the combined list of the vote computing method associated with the highest score.
  • the two or more sources may include an online blog, an online forum, and/or an online social networking website.
  • the social welfare function may be selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
  • a system for determining preferences from cross-modality information includes a communications interface, memory storing computer usable program code; and a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory.
  • the computer usable program code includes computer usable program code configured to identify two or more vote computing methods; computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality; computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and computer usable program code configured to identify the vote computing method associated with the highest score.
  • the computer usable program code may further include computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
  • FIG. 1 shows a method of combining data on preferences according to one embodiment of the present invention.
  • FIGS. 2A-2D show sample tables of the top-10 artists that result from merging preference information from various sources using different vote computing methods.
  • FIG. 3 shows a table of four top-10 lists.
  • FIG. 4 shows a flow diagram of a method for combining data on preferences in accordance with an embodiment of the invention.
  • FIG. 5 shows a flow diagram of another method for combining data on preferences in accordance with an embodiment of the invention.
  • FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention.
  • FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented.
  • the present invention provides a system and method for determining preferences from information mashups.
  • An information mashup combines or mixes information or data from a multitude of often-conflicting sources into a single representation. For example, for any given domain of interest, opinions can be expressed in many places and collected by many sources. Online sources for people's opinion on a wide range of topics include, for example, blogs, discussion forums and social networking sites.
  • Embodiments of the invention combine information gathered from across different sources, including in one application various online sources, to form a unified, focused view of a community's interests regarding that domain.
  • An exemplary embodiment of the invention determines preferences from cross-modality information mashups.
  • systems compare things with identical modalities, such as number of sales from different sources.
  • domains of interests e.g., patient preferences, drugs for certain medical conditions, cars, wine, financial products (stocks, bonds, etc.), consumer goods, cameras, computers, books, etc.
  • information is available from many different modalities (e.g., comments, passive listens, sales, hits on a website, creation of new website, views on television, etc.).
  • An exemplary embodiment of the invention determines preferences from information mashups constructed from information and data from a set of sources heterogeneous in modality. For example, say we want to combine different on-line data to generate a list of wines.
  • One source of preferences may be generated from sales numbers of wines.
  • Another source may be a list generated from wine tasters.
  • Yet another may be generated by professionals at a wine magazine.
  • Yet another may be generated from counts of comments users post on a wine aficionado site. There are many more sales of wines than posts on a website.
  • Plurality type voting systems are those that add together the number of votes from each source and simply award the win to whichever candidate has the most votes.
  • Plurality type voting systems include systems in which votes are weighted.
  • plurality type voting systems have deficiencies when combining information gathered from multiple sources with differing modalities. This can occur, for example, when there are large differences in the numbers returned by sources or when the values measured to derive those numbers indicate very different things.
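A minimal sketch of that deficiency, using made-up numbers (the item names and counts below are hypothetical and are not taken from the figures): when raw counts from sources with very different scales are simply summed, the larger source decides the outcome.

```python
from collections import Counter

# Hypothetical raw counts per source; the scales differ by orders of magnitude.
sales = {"wine_a": 120_000, "wine_b": 95_000, "wine_c": 10_000}
forum_posts = {"wine_c": 40, "wine_b": 25, "wine_a": 5}


def plurality(*sources):
    """Sum raw votes across sources and rank candidates by the total."""
    totals = Counter()
    for source in sources:
        totals.update(source)  # adds each source's counts to the running totals
    return [name for name, _ in totals.most_common()]


combined = plurality(sales, forum_posts)
sales_only = sorted(sales, key=sales.get, reverse=True)
print(combined)
print(combined == sales_only)  # the forum's ordering leaves the result unchanged
```

Here the forum ranks wine_c first, yet the combined order is identical to the sales order because the forum's counts are too small to matter.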
  • SWF social welfare function
  • a SWF might describe, for example, the preferences of an individual over social states, or might describe, as another example, outcomes of an allocation process, whether or not individuals had preferences over those outcomes.
  • Examples of SWFs are the Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule SWFs.
  • a method is supplied for embodying subjectiveness, such as those described above, into one function.
  • embodiments of the invention can capture, for example, business goals in a semi-heuristic way, objectively evaluate various preference combination techniques, and identify which of the combination techniques to use in a specific instance.
  • the combination techniques include techniques that originate from vote computing or vote counting systems, such as a Borda count method or the Nauru method.
  • Embodiments of the invention may supplement or modify a vote computing or vote counting technique depending on whether the original information expressing the preferences is, for example, structured or unstructured, numerical or textual, etc.
  • the combination technique used is as described in co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2), and filed on ______. Accordingly, the system and method for determining preferences from information mashups described in detail herein complements the system and method described in the co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
  • the system or method described in detail herein may be used in conjunction with the system or method described in detail in the co-pending application.
  • the first system and method may be used separately from the latter
  • the SWF takes as input a “final” ranked list generated from each of the various vote counting/computing methods and/or systems, and the preferences of each source.
  • the “final” ranked list may be generated using, for example, weighted voting systems, semi-proportional methods, delegates, Borda Count, inverted rank, run off, round robin, and/or a ranking method described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
  • the SWF outputs a number that indicates how happy or satisfied the “society” of sources is with the results.
  • multiple methods of combining are examined and evaluated, and the combining method that returns the highest SWF value is considered the “best” method. That combining method is then established as the combining method that application will use when determining preferences from future mashups combining information for those sources for those business purposes, for example. As discussed below, reevaluation of the combining method may be done periodically to optimize the quality of the results.
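The selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: `select_best_method`, `first_source`, `most_listed`, and `overlap_swf` are hypothetical names, and the two toy combining methods and the toy SWF stand in for real vote computing methods and a real SWF.

```python
from typing import Callable, Dict, List, Sequence, Tuple

RankedList = List[str]
VoteMethod = Callable[[Sequence[RankedList]], RankedList]
Swf = Callable[[RankedList, Sequence[RankedList]], float]


def select_best_method(
    methods: Dict[str, VoteMethod], swf: Swf, sources: Sequence[RankedList]
) -> Tuple[str, RankedList, float]:
    """Run every combining method, score each combined list, keep the best."""
    best = None
    for name, method in methods.items():
        combined = method(sources)
        score = swf(combined, sources)
        if best is None or score > best[2]:
            best = (name, combined, score)
    if best is None:
        raise ValueError("at least one vote computing method is required")
    return best


def first_source(sources):
    """Toy method (hypothetical): copy the first source's list."""
    return list(sources[0])


def most_listed(sources):
    """Toy method (hypothetical): rank items by how many sources list them."""
    counts = {}
    for ranked in sources:
        for item in ranked:
            counts[item] = counts.get(item, 0) + 1
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return ordered[: len(sources[0])]


def overlap_swf(combined, sources):
    """Toy SWF (hypothetical): total overlap between result and each source."""
    return float(sum(len(set(combined) & set(ranked)) for ranked in sources))


sources = [["a", "b", "c"], ["b", "c", "d"], ["c", "d", "e"]]
methods = {"first_source": first_source, "most_listed": most_listed}
best_name, best_list, best_score = select_best_method(methods, overlap_swf, sources)
print(best_name, best_list, best_score)
```

Keeping the SWF as a plain callable means a different objective can be swapped in without touching the combining methods.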
  • the present disclosure differs from traditional work in the field in several ways.
  • the disclosure addresses situations where, as noted, people are providing preferences in non-uniform ways (complaints, purchase, opinion posted, time, etc.).
  • ad hoc weights don't work well because ad hoc weights can only adjust for the deficiencies that exist at a simple point in time.
  • Amazon.com ranks should be weighted lower (having less weight in the overall scheme of the analysis) than Barnes & Noble because Amazon.com opened its online store in July 1995. If the top-10 lists were compared today, the weights would differ.
  • Although ad hoc weights are useful when combining lists of preferences at one point in time, they need adjusting each time new data from the sources are recombined to account for, e.g., changes in the market, business cycles, seasons, time of day, new product releases (which could, for example, skew statistics for a few days), blitz marketing campaigns, events (e.g., Olympics® or Super Bowl®), etc. These real world changes have the potential of causing dramatic shifts in the rankings being reported.
  • the ad hoc weights adjustments are time-dependent. If we calculate the rankings at a different point in time, the weights would be reconsidered and changed, tuned each time we calculate the rankings. This can be particularly onerous depending on how often the combined rankings are calculated (in real time, daily, weekly, monthly, quarterly, annually, etc.) particularly if the tuning is done without the assistance of any computer-implemented algorithms.
  • embodiments of the invention identify the most appropriate method to combine preferences from sources of different modalities by using a SWF appropriate for predefined objectives (e.g., business requirements).
  • exemplary embodiments take into account, for example, business requirements to a level of granularity that ad hoc weights cannot.
  • embodiments of the invention examine domains with orders of magnitude more “candidates” than “voters”, the reverse of most elections.
  • Conventional voting techniques do not examine scenarios in which the number of “candidates” is orders of magnitude more than the number of “voters.”
  • the Borda function is intended for use in situations when there are a large number of voters and a small number of candidates, such as in a presidential election.
  • embodiments of the invention examine vote computing techniques that are intended for use in scenarios in which the number of items being ranked (or “candidates”) is orders of magnitude more than the number of sources ranking the items (“voters”), the opposite of conventional elections.
  • a vote computing technique may combine the information on preferences into a combined list ranking the preferences in an application in which the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
  • FIG. 1 illustrates a method of combining data on preferences according to one embodiment of the invention.
  • FIG. 1 shows information 1010 from multiple sources (e.g., Source 1, Source 2, Source 3, etc.), a set of vote computing methods 1020, a social welfare function (SWF) 1030, and a set of social welfare function (SWF) scores 1040.
  • the information 1010 includes information from multiple sources (e.g., Source 1, Source 2, Source 3, etc.).
  • the multiple sources are of varying modalities. Modalities may be expressed as having two major dimensions: intentional versus unintentional, and consuming (passive) versus producing (creative).
  • Intentional activities are those where a user, for example, has had to take steps to “make their mark.” Examples in the online arena would be navigating to a particular page or typing in a name into a search bar. Intentional activities are stronger indicators of interest than unintentional activities.
  • Creative, producing activities are, for example, those where the user takes the time to author a post or compose a response. Passive, consuming activities may involve watching or reading something created by someone else. Creative activities, taking more time and attention, indicate more interest than passive activities.
  • each source is either a list of preferences (e.g. a ranked list) or provides data which is converted into a list of preferences.
  • the converting may, for example, process user posts from a social networking site, such as by employing a series of unstructured information management architecture (UIMA) annotators driven off of entity spotting, using Information Extraction (IE) techniques, and/or using natural language mining (NLM) techniques.
  • an embodiment of the invention may additionally or alternatively request that a source convert the source's data into a list of preferences, instead of converting the data itself.
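As a rough illustration of the last conversion step only, assume a source's data has already been reduced to per-item counts (for example, sales figures or the number of posts mentioning each item). The text-analysis stage (UIMA annotators, IE, NLM) is not shown, and `to_ranked_list` and the sample counts are hypothetical.

```python
def to_ranked_list(counts, top_n=None):
    """Turn per-item counts into a ranked list of item names.

    Items are ordered by descending count; ties are broken alphabetically
    so the result is deterministic (an assumption made for this sketch).
    """
    ordered = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    names = [name for name, _ in ordered]
    return names if top_n is None else names[:top_n]


# Hypothetical comment counts from a wine forum.
comment_counts = {"merlot": 12, "syrah": 30, "malbec": 12, "pinot": 3}
ranked = to_ranked_list(comment_counts)
top_two = to_ranked_list(comment_counts, top_n=2)
print(ranked)
print(top_two)
```

Dropping the counts after sorting puts every source on the same footing, a plain ordering, regardless of how large its raw numbers were.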
  • Source 1 may be sales numbers of wines
  • Source 2 may be an online list generated by wine tasters
  • Source 3 may be a ranking based analysis of various blogs on wines.
  • the analysis may have employed a series of UIMA annotators. Each source expresses or reflects opinions on the same underlying subject matter, phenomenon, or domain of interest.
  • each vote computing method is a different combining technique.
  • Vote computing method 1 may be a Borda count technique
  • Vote computing method 2 may be an inverted ranking technique
  • Vote computing method 3 may be a round robin technique
  • Vote computing method 4 may be the technique described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
  • the output of each vote computing method is a “final” ranked list, sometimes referred to as a single, merged list.
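To make one of these combining techniques concrete, here is a minimal Borda-style count over ranked lists. Two points are assumptions of this sketch rather than statements from the text: the top item of an n-item list earns n points (some variants use n-1 down to 0), and an item a source does not list earns nothing from that source. The function name `borda_count` and the sample lists are hypothetical.

```python
def borda_count(source_lists, top_n=None):
    """Merge ranked lists with a Borda-style point scheme.

    In a list of n items, position 1 earns n points, position 2 earns
    n - 1, and so on. Items missing from a list earn 0 from that list.
    """
    points = {}
    for ranked in source_lists:
        n = len(ranked)
        for position, item in enumerate(ranked):
            points[item] = points.get(item, 0) + (n - position)
    # Highest total first; alphabetical order breaks ties deterministically.
    merged = sorted(points, key=lambda item: (-points[item], item))
    return merged if top_n is None else merged[:top_n]


source_lists = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
final_list = borda_count(source_lists)
final_top3 = borda_count(source_lists, top_n=3)
print(final_list)
print(final_top3)
```

Because only positions are used, a source with very large raw numbers has no more influence than any other source.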
  • the final list is provided to the SWF, along with the sources' preferences.
  • the SWF is a mathematical criterion for the success of a voting system based on some desired characteristics. Accordingly, in certain embodiments, the SWF is constructed to capture characteristics which are considered valuable. The value may be determined from a business standpoint if business objectives are driving the undertaking. For example, in determining preferences for wine, it may be considered valuable for each source to see at least 1/2 of its top 10 list in the overall top 10 list. How well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function affects the appropriateness of the actual combined list outputted by embodiments of the invention.
  • the SWF may be selected or custom constructed to fit the situation.
  • the SWF is selected from among a set of SWFs, e.g., a set including the Precision Optimal Aggregation SWF (P swf ) and the Spearman Footrule SWF (S swf ).
  • the P swf measures how many items from each source's top-n list are in the “final” ranked list (the single list which merges ranked items from each source). For example, in one application, the P swf measures how many artists from each source's top-10 list are in an overall top-10 list created using Borda count technique.
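The counting described above can be sketched in a few lines. This follows only the prose description (how many of each source's top-n items appear in the final top-n list); it is not the patent's formula, and `precision_score` and the sample lists are hypothetical.

```python
def precision_score(final_list, source_lists, n=10):
    """Count, per source, how many of its top-n items made the final top-n.

    Each source can contribute at most n points, so with four sources and
    n = 10 the maximum is 40.
    """
    final_top = set(final_list[:n])
    return sum(len(final_top & set(ranked[:n])) for ranked in source_lists)


final_list = ["b", "a", "c"]
source_lists = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
score = precision_score(final_list, source_lists, n=3)
best_possible = 3 * len(source_lists)
print(score, "out of", best_possible)
```

The score ignores where an item sits inside the top n; only membership counts.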
  • a Precision Optimal Aggregation SWF defined as:
  • the Spearman Footrule SWF emphasizes preservation of position in the rankings.
  • the S swf is an approximation of a related SWF, the Kendall tau distance.
  • the S swf is less computationally intensive (minutes versus days) relative to the Kendall tau distance.
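For orientation, a standard Spearman footrule distance between two ranked lists can be sketched as below. Several things here are assumptions of the sketch: an item missing from a list is treated as sitting one place past that list's end, the result is a distance (lower is better) rather than a score, and no attempt is made to reproduce the point scale used in the figures. The names `footrule_distance` and `total_footrule` are hypothetical.

```python
def footrule_distance(list_a, list_b):
    """Sum of absolute position differences over all items in either list."""
    pos_a = {item: i for i, item in enumerate(list_a, start=1)}
    pos_b = {item: i for i, item in enumerate(list_b, start=1)}
    missing_a = len(list_a) + 1  # assumed position for items absent from list_a
    missing_b = len(list_b) + 1  # assumed position for items absent from list_b
    items = set(pos_a) | set(pos_b)
    return sum(
        abs(pos_a.get(item, missing_a) - pos_b.get(item, missing_b))
        for item in items
    )


def total_footrule(final_list, source_lists):
    """Add up the distance between the final list and every source list."""
    return sum(footrule_distance(final_list, ranked) for ranked in source_lists)


final_list = ["b", "a", "c"]
source_lists = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
per_source = [footrule_distance(final_list, ranked) for ranked in source_lists]
total = total_footrule(final_list, source_lists)
print(per_source, total)
```

Unlike a membership count, this measure penalizes every place an item moves, which is what "preservation of position" refers to.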
  • One exemplary embodiment uses a Spearman Footrule SWF defined as:
  • the SWF takes as input a “final” ranked list and the preferences of each source.
  • the outcome is a score where points are awarded for increased social welfare of a ranking system.
  • embodiments quantitatively measure the “happiness” of each contributing source with the overall “final” ranking.
  • an SWF score is calculated for each vote computing method (e.g., SWF score 1, SWF score 2, SWF score 3, etc.).
  • FIGS. 2A-2D show sample tables of the top-10 artists that result from merging preference information from various sources using different vote computing methods.
  • the results of four combination techniques are shown to illustrate how each technique merges the four top-10 lists shown in FIG. 3.
  • a “final” top-10 computed using the corresponding vote counting technique is shown in the first column.
  • in FIG. 2A, a “final” top-10 ranking computed using Total Votes is shown;
  • in FIG. 2B, a “final” top-10 ranking computed using Weighted Votes is shown;
  • in FIG. 2C, a “final” top-10 ranking computed using Semi-Proportional methods is shown; and in FIG. 2D, a “final” top-10 ranking computed using Delegates is shown.
  • SWF scores computed using two different SWF are shown.
  • the P swf column shows the contribution of each artist to the overall Precision Optimal Aggregation SWF for that source.
  • the S swf column shows the contribution of each artist to the overall Spearman Footrule SWF for that source.
  • the bars correspond to the sources in FIG. 3 in this order: Bebo, LastFM, MySpace, and YouTube. The bars represent how “happy” each source is with the artist being ranked at this position.
  • the graphs in the C S column express the contribution to the combined ranking for the artist from each source.
  • the bars from left to right in each of those cells correspond to the sources in FIG. 3 in the order: Bebo, LastFM, MySpace, and YouTube.
  • the rankings shown in the first column were produced by merging the rankings from Bebo, LastFM, MySpace, and YouTube shown in FIG. 3 through simple summation of the votes for each artist.
  • the last bar, corresponding to YouTube, is the longest because YouTube had more data points than the other sources. In this example, the “number of votes” is dominated by YouTube.
  • each table in FIGS. 2A-2D shows the total SWF for P swf and S swf expressed as a raw score.
  • For P swf, each source contributes up to 10 points, for a maximum score of 40 (best).
  • For S swf, each source contributes up to 100 points, for a maximum of 400.
  • the total influence each source had on the top-10 list is also seen at the bottom of each table, in the last row of the C S column.
  • FIGS. 2A-2D also illustrate a scenario when the SWF is a Precision Optimal Aggregation SWF (second column of each table).
  • Table 1 below shows the results converted to percentage based on a maximum score of 40 for the Precision Optimal Aggregation SWF.
  • the Semi-Proportional vote counting method is identified among the four as the combining technique that produces a combined list most congruent with the subjective values embodied by the Precision Optimal Aggregation SWF.
  • FIGS. 2A-2D also illustrate a scenario when the SWF is Spearman Footrule SWF (third column of each table).
  • Table 2 below shows the results converted to percentage based on a maximum score of 400 for the Spearman Footrule SWF.
  • the Weighted Votes vote counting method is identified as the combining technique among the four that produces a combined list most congruent with the subjective values embodied by the Spearman Footrule SWF.
  • embodiments of the invention include a method that includes identifying a vote computing method that produces the highest SWF score.
  • the evaluation of which vote computing method is most appropriate for a given set of objectives is performed by the SWF.
  • the SWF takes the lists of the voters' preferences (the lists from the various sources), along with the outcome of the vote (the combined/consensus list), and produces for each vote computing method a “score” indicating the “satisfaction” in the outcome.
  • the highest score indicates the highest satisfaction. That is, the vote computing method that elevates/accounts for/values those characteristics that are valued by the business objectives (as modeled using the SWF) in an optimal fashion is the vote computing method that gets the highest score from the SWF. Since an embodiment of the invention will output a final combined list based on the SWF, the quality of the output is affected by how well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function.
  • a large collection of voting methods or combination techniques are enumerated and examined, over multiple time periods of sample data, to identify which voting method or combination technique produces the highest SWF score.
  • An exemplary embodiment examines the results of various “voting” methods using several weeks' or months' worth of data.
  • parameter(s) may also be optimized to improve the quality of the system. For example, in use, embodiments may determine (e.g., by searching for or computing) a parameter value that is most congruent with enabling the parameterized voting method to output a combined list that reflects the values, e.g., the business objectives.
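One way to picture evaluation over multiple time periods is to score every method on each period's data and compare averages. This is a sketch under assumptions: averaging is one plausible way to aggregate across periods (the text does not say how the periods are combined), and `average_scores`, the toy methods, the toy SWF, and the sample snapshots are all hypothetical.

```python
from statistics import mean


def average_scores(methods, swf, snapshots):
    """Average each method's SWF score over several periods of source data."""
    return {
        name: mean(swf(method(sources), sources) for sources in snapshots)
        for name, method in methods.items()
    }


def first_source(sources):
    """Toy method (hypothetical): copy the first source's list."""
    return list(sources[0])


def most_listed(sources):
    """Toy method (hypothetical): rank items by how many sources list them."""
    counts = {}
    for ranked in sources:
        for item in ranked:
            counts[item] = counts.get(item, 0) + 1
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return ordered[: len(sources[0])]


def overlap_swf(combined, sources):
    """Toy SWF (hypothetical): total overlap between result and each source."""
    return sum(len(set(combined) & set(ranked)) for ranked in sources)


# Two hypothetical periods (e.g., two weeks) of ranked lists from three sources.
snapshots = [
    [["a", "b"], ["c", "d"], ["c", "d"]],
    [["a", "b"], ["a", "c"], ["c", "d"]],
]
methods = {"first_source": first_source, "most_listed": most_listed}
averages = average_scores(methods, overlap_swf, snapshots)
best_method = max(averages, key=averages.get)
print(averages, best_method)
```

A parameterized voting method could be evaluated the same way by registering one entry per candidate parameter value, for example with `functools.partial`.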
  • the characteristics of many sources change over time.
  • the congruency of the method to the business objectives is revisited periodically (e.g., quarterly) to make sure that changes in the underlying data sources have not reduced the quality of the results.
  • the additional optimization techniques described are also repeated periodically.
  • an exemplary embodiment of the invention applies voting theory to cross-modality information mashups to construct a combined list ranking preferences.
  • An SWF is used to select from various voting methods based on data from various cross-modality sources.
  • the sources and associated data are dependent on the domain. For example, in the domain of interest of wine, the source or associated data may be results of wine tasting parties, professional reviews (e.g., scores from 1-10 in different categories), sales, change in sales, comments posted by average users, and mentions in mass media.
  • FIG. 4 shows a flow diagram of a method 4000 for combining data on preferences in accordance with an embodiment of the invention.
  • a social welfare function (SWF) is received (e.g., by communications interface 66 or from memory or storage, as described below).
  • the SWF embodies a business objective.
  • sources that provide perspective on opinions on the subject area are identified. The sources may be, for example, sales, comments, etc.
  • data from these sources are gathered and normalized to create ranked lists of preferences for each source.
  • two or more vote computing methods are identified.
  • the vote computing method is used to combine data on preferences into a combined list ranking the preferences.
  • the data being combined is from a set of two or more sources.
  • the set is heterogeneous in modality.
  • in a delegate allocation vote computing method, a preliminary distribution of delegates among the sources is determined. For example, in the example shown in FIG. 2D, the following delegate numbers based roughly on population were distributed to the sources: 300 to Bebo, 500 to LastFM, 1000 to MySpace, and 500 to YouTube.
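The text gives the delegate totals but not how each source's delegates are assigned to items. The sketch below assumes a proportional split: each source divides its delegates among its items in proportion to its raw counts, and items are ranked by total delegates. That allocation rule, the function name `delegate_ranking`, and the raw counts are assumptions for illustration; only the delegate numbers come from the example above.

```python
# Delegate totals from the FIG. 2D example.
DELEGATES = {"Bebo": 300, "LastFM": 500, "MySpace": 1000, "YouTube": 500}


def delegate_ranking(raw_counts, delegates):
    """Rank items by delegates won, splitting each source's delegates
    in proportion to that source's raw counts (an assumed rule)."""
    totals = {}
    for source, counts in raw_counts.items():
        source_total = sum(counts.values())
        for item, count in counts.items():
            share = delegates[source] * count / source_total
            totals[item] = totals.get(item, 0.0) + share
    return sorted(totals, key=lambda item: (-totals[item], item))


# Hypothetical raw counts; YouTube's numbers dwarf the other sources.
raw_counts = {
    "Bebo": {"artist_x": 30, "artist_y": 70},
    "LastFM": {"artist_x": 50, "artist_y": 50},
    "MySpace": {"artist_x": 800, "artist_y": 200},
    "YouTube": {"artist_x": 1_000_000, "artist_y": 9_000_000},
}
by_delegates = delegate_ranking(raw_counts, DELEGATES)

# For comparison: simple summation of raw counts across sources.
raw_totals = {}
for counts in raw_counts.values():
    for item, count in counts.items():
        raw_totals[item] = raw_totals.get(item, 0) + count
by_raw_sum = sorted(raw_totals, key=lambda item: (-raw_totals[item], item))
print(by_delegates, by_raw_sum)
```

With these made-up counts, raw summation follows YouTube, while the delegate cap limits YouTube to 500 of the 2,300 delegates and the order flips.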
  • the combined list is inputted into the SWF to compute a score.
  • the score indicates congruency between the combined list and value(s) embodied by the SWF, e.g., a business objective.
  • the combined list of the vote computing method associated with the highest score is outputted. In one embodiment, additionally or alternatively, the vote computing method associated with the highest score is outputted.
  • FIG. 5 shows a flow diagram of a method 5000 for combining data on preferences in accordance with an embodiment of the invention.
  • a social welfare function (SWF) is created.
  • the SWF defines business objectives.
  • two or more vote computing methods are identified.
  • the vote computing method is used to combine data on preferences into a combined list ranking the preferences.
  • the data is from a set of two or more sources.
  • the set is heterogeneous in modality.
  • the combined list is inputted into the SWF to compute a score. The score indicates congruency between the combined list and, for example, the business objective.
  • the combined list of the vote computing method associated with the highest score is outputted.
  • embodiments of this invention may execute the method 4000 and/or the method 5000 in a non-sequential order as appropriate and still remain in accordance with the invention.
  • embodiments of the present invention may receive the social welfare function before, during, or after 4020, 4030, 4040, and/or 4050.
  • embodiments of the present invention may create the social welfare function before, during, or after 5020 and/or 5030 .
  • the sources provide additional information on preference (such as total numbers) for input into voting systems that can make use of such additional information.
  • Embodiments of the invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements.
  • the invention is implemented in software, which includes but is not limited to firmware, resident software, and microcode.
  • the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
  • a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the medium can be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device).
  • Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, or an optical disk.
  • Current examples of optical disks include compact disk-read-only memory (CD-ROM), compact disk-read/write (CD-R/W), and DVD.
  • a data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus.
  • the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
  • Input/output (I/O) devices (including but not limited to keyboards, displays, and pointing devices) can be coupled to the system either directly or through intervening I/O controllers.
  • Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks.
  • Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
  • FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention.
  • the computer system includes one or more processors, such as processor 44.
  • the processor 44 is connected to a communication infrastructure 46 (e.g., a communications bus, cross-over bar, or network).
  • the computer system can include a display interface 48 that forwards graphics, text, and other data from the communication infrastructure 46 (or from a frame buffer not shown) for display on a display unit 50.
  • the computer system also includes a main memory 52, preferably random access memory (RAM), and may also include a secondary memory 54.
  • the secondary memory 54 may include, for example, a hard disk drive 56 and/or a removable storage drive 58, representing, for example, a floppy disk drive, a magnetic tape drive, or an optical disk drive.
  • the removable storage drive 58 reads from and/or writes to a removable storage unit 60 in a manner well known to those having ordinary skill in the art.
  • Removable storage unit 60 represents, for example, a floppy disk, a compact disc, a magnetic tape, or an optical disk, etc., which is read by and written to by removable storage drive 58.
  • the removable storage unit 60 includes a computer readable medium having stored therein computer software and/or data.
  • the secondary memory 54 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system.
  • Such means may include, for example, a removable storage unit 62 and an interface 64.
  • Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 62 and interfaces 64 which allow software and data to be transferred from the removable storage unit 62 to the computer system.
  • the computer system may also include a communications interface 66.
  • Communications interface 66 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 66 may include a modem, a network interface (such as an Ethernet card), a communications port, or a PCMCIA slot and card, etc.
  • Software and data transferred via communications interface 66 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 66 . These signals are provided to communications interface 66 via a communications path (i.e., channel) 68 .
  • This channel 68 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
  • The terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 52 and secondary memory 54 , removable storage drive 58 , and a hard disk installed in hard disk drive 56 .
  • Computer programs are stored in main memory 52 and/or secondary memory 54 . Computer programs may also be received via communications interface 66 . Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 44 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.
  • FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented.
  • Distributed data processing system 100 may include a network of computers in which aspects of the illustrative embodiments may be implemented.
  • the distributed data processing system 100 contains at least one network 102 , which is the medium used to provide communication links between various devices and computers connected together within distributed data processing system 100 .
  • the network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
  • server 104 and server 106 are connected to network 102 along with storage unit 108 .
  • clients 110 , 112 , and 114 are also connected to network 102 .
  • These clients 110 , 112 , and 114 may be, for example, personal computers, network computers, or the like.
  • server 104 provides data, such as boot files, operating system images, and applications to clients 110 , 112 , and 114 .
  • Clients 110 , 112 , and 114 are clients to server 104 in the depicted example.
  • Distributed data processing system 100 may include additional servers, clients, and other devices not shown.
  • distributed data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another.
  • the distributed data processing system 100 may also be implemented to include a number of different types of networks, such as for example, an intranet, a local area network (LAN), a wide area network (WAN), or the like.
  • FIG. 7 is intended as an example, not as an architectural limitation for different embodiments of the present invention, and therefore, the particular elements shown in FIG. 7 should not be considered limiting with regard to the environments in which the illustrative embodiments of the present invention may be implemented.
  • clients 110 and 112 collect information (e.g., from user input) and provide it to server 104 .
  • Server 104 stores the information in storage 108 .
  • Server 106 contains hardware devices and software tools to combine the information (e.g., into information mashups and/or combined/consensus lists) according to the present invention.
  • Server 106 transmits the combined information to server 104 and/or clients 110 , 112 , and/or 114 , for example.
  • client 114 may provide the server with business requirements embodied in a SWF.
  • the server determines the best vote counting method to use for that particular application based on the SWF and the information stored, e.g., in storage 108 .
  • the server may transmit an identification of the vote counting method to the client.

Abstract

A system and method for determining preferences from information mashups and, in particular, for determining preferences from cross-modality information based on a social welfare function is disclosed. An exemplary embodiment of the invention uses a social welfare function (SWF) to identify a vote computing method from among a group of vote computing methods. The SWF embodies subjective values, e.g. business objectives. The embodiment uses the SWF to identify the vote computing method that combines cross-modality information into a single information mashup in a manner that is most congruent with the subjective values relative to the other vote computing methods. The information mashup may be in the form of a single, merged ranked list.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of and incorporates by reference in its entirety U.S. provisional application No. 61/041,128, which was filed on Mar. 31, 2008.
  • FIELD OF INVENTION
  • The present invention relates to information mashups, and in particular to a system and method for determining preferences from cross-modality information mashups.
  • BACKGROUND
  • Through the advances of technology, today's world has become inundated with information. One continuing technological and societal challenge is finding methods and systems to extract and combine useful data, knowledge, and understanding from a pool of information that is constantly growing in quantity and increasing in granularity.
  • Even when we narrow our analysis to one domain of interest, e.g. ranking wines, how do we combine all the information indicating preferences within the domain when the information is available from multiple sources and the sources differ in modality? For example, how do we combine multiple lists of preferences, e.g., from different online communities, sales numbers from different stores, etc.? How do we combine the information in a manner that will reveal the aspects of that information that are important, valuable, or significant to an entity (e.g., a machine, business, customer, end-user, etc.) requesting the results? And how do we enable tuning of the outcome, e.g., at the touch of a button, to target certain characteristics and elevate those characteristics to the forefront?
  • SUMMARY OF THE INVENTION
  • A computer-implemented method for determining preferences from cross-modality information mashups is provided. The method includes receiving a social welfare function (SWF) and identifying two or more vote computing methods. For each of the two or more vote computing methods, the method uses the vote computing method to combine information on preferences into a combined list ranking the preferences. The information is from a set of two or more sources. The set is heterogeneous in modality. For each combined list, the method inputs the combined list into the SWF to compute a score. The method outputs the combined list of the vote computing method associated with the highest score. The set of two or more sources may include data from websites indicating preferences within a certain domain of interest. The information from the set of two or more sources may include structured data from a first source and unstructured data from a second source. The number of preferences being ranked may be at least an order of magnitude more in number than the number of sources.
  • A computer program product for determining preferences from cross-modality information is also provided. The computer program product includes a computer readable medium and program instructions. The program instructions include first program instructions to identify two or more vote computing methods and second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences. The information is from a set of two or more sources. The set is heterogeneous in modality. The program instructions further include third program instructions to compute a score, and fourth program instructions to output the vote computing method associated with the highest score. The program instructions may also include fifth program instructions to output the combined list of the vote computing method associated with the highest score. The two or more sources may include an online blog, an online forum, and/or an online social networking website. The social welfare function may be selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
  • A system for determining preferences from cross-modality information is further provided. The system includes a communications interface, memory storing computer usable program code; and a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory. The computer usable program code includes computer usable program code configured to identify two or more vote computing methods; computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality; computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and computer usable program code configured to identify the vote computing method associated with the highest score. The computer usable program code may further include computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
  • BRIEF DESCRIPTION OF THE FIGURES
  • FIG. 1 shows a method of combining data on preferences according to one embodiment of the present invention.
  • FIGS. 2A-2D show sample tables of the top-10 artists resulting from merging preference information from various sources using different vote computing methods.
  • FIG. 3 shows a table of four top-10 lists.
  • FIG. 4 shows a flow diagram of a method for combining data on preferences in accordance with an embodiment of the invention.
  • FIG. 5 shows a flow diagram of another method for combining data on preferences in accordance with an embodiment of the invention.
  • FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention.
  • FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented.
  • DESCRIPTION
  • Overview
  • The present invention provides a system and method for determining preferences from information mashups. An information mashup combines or mixes information or data from a multitude of often-conflicting sources into a single representation. For example, for any given domain of interest, opinions can be expressed in many places and collected by many sources. Online sources for people's opinion on a wide range of topics include, for example, blogs, discussion forums and social networking sites. Embodiments of the invention combine information gathered from across different sources, including in one application various online sources, to form a unified, focused view of a community's interests regarding that domain.
  • An exemplary embodiment of the invention determines preferences from cross-modality information mashups. In more traditional information integration scenarios, systems compare things with identical modalities, such as number of sales from different sources. However, there are many domains of interest (e.g., patient preferences, drugs for certain medical conditions, cars, wine, financial products (stocks, bonds, etc.), consumer goods, cameras, computers, books, etc.) where information is available from many different modalities (e.g., comments, passive listens, sales, hits on a website, creation of a new website, views on television, etc.). In the domain of books, the following information may be available for determining book preferences: book sales and returns, lists of books read, library checkouts, comments on books read (e.g., online, in newspapers, in magazines, on television or radio), etc. An exemplary embodiment of the invention determines preferences from information mashups constructed from information and data from a set of sources heterogeneous in modality. For example, say we want to combine different on-line data to generate a list of wines. One source of preferences may be generated from sales numbers of wines. Another source may be a list generated from wine tasters. Yet another may be generated by professionals at a wine magazine. Yet another may be generated from counts of comments users post on a wine aficionado site. There are many more sales of wines than posts on a website. Many people buy wines whereas composing a review takes more time and may indicate more interest in a particular vintage. Ultimately, a good cross-modality mashup combines these multiple sources, which all indicate interest in the same underlying subject matter, without allowing one source to unduly influence the combined/consensus list.
  • Yet, how can one combine the data from the various sources when they are heterogeneous in modality? Comparing different modalities is akin to comparing apples and oranges. How does one determine overall rankings for certain wines, for example, based on the combination of data on sales, written reviews, returns, website polls, etc.? How do you combine data indicating that the reviewer loves a certain wine (glowing reviews), but the public hates it (e.g., by ranking it low on wine.com or low sales)? Do we decide that ten times as many posts on a website reflect ten times as much interest in an event or item? There is a fair amount of subjectivity in how these combinations occur and it is not typically clear how to combine all these sources.
  • In systems that compare things with identical modalities, using a plurality type voting system makes sense. Plurality type voting systems are those that add together the number of votes from each source and simply adjudicate the winner based on whomever or whichever candidate has the most votes. Plurality type voting systems include systems in which votes are weighted. However, plurality type voting systems have deficiencies when combining information gathered from multiple sources with differing modalities. This can occur, for example, when there are large differences in the numbers returned by sources or when the values measured to derive those numbers indicate very different things.
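  • The deficiency described above can be illustrated with a minimal sketch. The source names, item names, and counts below are hypothetical and chosen only to show how a simple summation of votes lets a high-volume source decide the outcome; the function name total_votes is likewise an assumption and not part of the disclosure.

```python
from collections import Counter

def total_votes(sources):
    """Sum raw counts per item across all sources and rank by the total."""
    totals = Counter()
    for counts in sources.values():
        totals.update(counts)  # adds each source's counts to the running totals
    # Sort by descending total, then by name for a deterministic order.
    return sorted(totals, key=lambda item: (-totals[item], item))

# Hypothetical counts: sales run into the thousands, reviews into the tens.
sources = {
    "sales":   {"merlot": 9000, "syrah": 4000, "riesling": 1000},
    "reviews": {"riesling": 40, "syrah": 25, "merlot": 5},
}
ranking = total_votes(sources)
print(ranking)
```

  • In this sketch the reviews rank the wines in exactly the opposite order from the sales, yet the combined ranking simply reproduces the sales order because the review counts are too small to matter.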
  • To identify which of a multitude of combination techniques (including plurality type voting techniques) is optimal for combining data from various sources in a certain instance, embodiments of the invention use a construct known as a social welfare function (SWF). A SWF is a mapping from allocations of goods or rights among people to real numbers. The SWF construct was a tool introduced by Abram Bergson in 1938. The SWF construct allows for the determination of a society's taste for different economic states. There are two features to the SWF construct: first, it imposes a structure and second, it devises a single constitutional/voting system that changes the rankings of the individual into a single society ranking. A SWF might describe, for example, the preferences of an individual over social states, or might describe, as another example, outcomes of an allocation process, whether or not individuals had preferences over those outcomes. Examples of SWFs are the Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule SWFs. Thus, an SWF supplies a method for embodying subjective judgments, such as those described above, in one function. Using a custom constructed or selected SWF, embodiments of the invention can capture, for example, business goals in a semi-heuristic way, objectively evaluate various preference combination techniques, and identify which of the combination techniques to use in a specific instance.
  • In one exemplary application, the combination techniques include techniques that originate from vote computing or vote counting systems, such as a Borda count method or the Nauru method. Embodiments of the invention may supplement or modify a vote computing or vote counting technique depending on whether the original information expressing the preferences is, for example, structured or unstructured, numerical or textual, etc.
  • In one exemplary embodiment, the combination technique used is as described in co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2), and filed on ______. Accordingly, the system and method for determining preferences from information mashups described in detail herein complements the system and method described in the co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2). In one use, the system or method described in detail herein may be used in conjunction with the system or method described in detail in the co-pending application. In another use, the first system and method may be used separately from the latter.
  • In embodiments of the invention, the SWF takes as input a “final” ranked list generated from each of the various vote counting/computing methods and/or systems, and the preferences of each source. The “final” ranked list may be generated using, for example, weighted voting systems, semi-proportional methods, delegates, Borda Count, inverted rank, run off, round robin, and/or a ranking method described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2). The SWF outputs a number that indicates how happy or satisfied the “society” of sources is with the results. Thus, in one application, multiple methods of combining are examined and evaluated, and the combining method that returns the highest SWF value is considered the “best” method. That combining method is then established as the combining method that application will use when determining preferences from future mashups combining information for those sources for those business purposes, for example. As discussed below, reevaluation of the combining method may be done periodically to optimize the quality of the results.
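  • The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the two combining methods (first_source, round_robin), the toy_swf scoring function, and the sample lists are all hypothetical stand-ins, and only the select_best_method loop mirrors the described procedure of scoring each method's “final” list and keeping the highest score.

```python
def first_source(lists):
    """Toy method (assumption): adopt the first source's ranking unchanged."""
    return list(lists[0])

def round_robin(lists):
    """Toy method (assumption): take one item from each source in turn, skipping repeats."""
    merged = []
    for position in range(max(len(lst) for lst in lists)):
        for lst in lists:
            if position < len(lst) and lst[position] not in merged:
                merged.append(lst[position])
    return merged

def toy_swf(combined, lists, n=2):
    """Toy SWF (assumption): one point per source that sees any of its top-n in the combined top-n."""
    top = set(combined[:n])
    return sum(min(len(top & set(lst[:n])), 1) for lst in lists)

def select_best_method(methods, swf, lists):
    """Score every method's combined list with the SWF and return the top scorer."""
    scores = {name: swf(combine(lists), lists) for name, combine in methods.items()}
    best = max(scores, key=scores.get)
    return best, scores

source_lists = [["a", "b", "c", "d"], ["c", "d", "a", "b"]]
methods = {"first_source": first_source, "round_robin": round_robin}
best, scores = select_best_method(methods, toy_swf, source_lists)
print(best, scores)
```

  • Swapping in a different SWF changes which method wins without touching the combining methods themselves, which is the separation the passage above relies on.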
  • The present disclosure differs from traditional work in the field in several ways. For example, the disclosure addresses situations where, as noted, people are providing preferences in non-uniform ways (complaints, purchase, opinion posted, time, etc.). In such situations, ad hoc weights don't work well because ad hoc weights can only adjust for the deficiencies that exist at a single point in time. Consider the use of ad hoc weights in combining top-10 lists from Amazon.com and Barnes & Noble in 1995. In that year, Amazon.com ranks would be weighted lower (having less weight in the overall scheme of the analysis) than Barnes & Noble because Amazon.com opened its online store in July 1995. If the top-10 lists were compared today, the weights would differ. Thus, although ad hoc weights are useful when combining lists of preferences at one point in time, they need adjusting each time new data from the sources are recombined to account for, e.g., changes in the market, business cycles, seasons, time of day, new product releases (which could, for example, skew statistics for a few days), blitz marketing campaigns, events (e.g., Olympics® or Super Bowl®), etc. These real world changes have the potential of causing dramatic shifts in the rankings being reported. The ad hoc weight adjustments are time-dependent. If we calculate the rankings at a different point in time, the weights would be reconsidered, changed, and tuned each time we calculate the rankings. This can be particularly onerous depending on how often the combined rankings are calculated (in real time, daily, weekly, monthly, quarterly, annually, etc.), particularly if the tuning is done without the assistance of any computer-implemented algorithms.
  • In contrast, embodiments of the invention identify the most appropriate method to combine preferences from sources of different modalities by using a SWF appropriate for predefined objectives (e.g., business requirements). Thus, in analyzing and combining information on preferences, exemplary embodiments take into account, for example, business requirements to a level of granularity that ad hoc weights cannot.
  • Additionally, embodiments of the invention examine domains with orders of magnitude more “candidates” than “voters”, the reverse of most elections. Conventional voting techniques do not examine scenarios in which the number of “candidates” is orders of magnitude more than the number of “voters.” For example, the Borda function is intended for use in situations when there are a large number of voters and a small number of candidates, such as in a presidential election. Accordingly, embodiments of the invention examine vote computing techniques that are intended for use in scenarios in which the number of items being ranked (or “candidates”) is orders of magnitude more than the number of sources ranking the items (“voters”), the opposite of conventional elections. Thus, such a vote computing technique may combine the information on preferences into a combined list ranking the preferences in an application in which the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
  • FEATURES OF EXEMPLARY EMBODIMENT
  • Exemplary embodiments of the invention determine preferences from cross-modality information mashups based on a constructed or selected social welfare function (SWF). FIG. 1 illustrates a method of combining data on preferences according to one embodiment of the invention. FIG. 1 shows information 1010 from multiple sources (e.g., Source1, Source2, Source3, etc.), a set of vote computing methods 1020, a social welfare function (SWF) 1030, and a set of social welfare function (SWF) scores 1040.
  • The information 1010 includes information from multiple sources (e.g., Source1, Source2, Source3, etc.). The multiple sources are of varying modalities. Modalities may be expressed as having two major dimensions: intentional versus unintentional, and consuming (passive) versus producing (creative). Intentional activities are those where a user, for example, has had to take steps to “make their mark.” Examples in the online arena would be navigating to a particular page or typing in a name into a search bar. Intentional activities are stronger indicators of interest than unintentional activities. Creative, producing activities are, for example, those where the user takes the time to author a post or compose a response. Passive, consuming activities may involve watching or reading something created by someone else. Creative activities, taking more time and attention, indicate more interest than passive activities.
  • In FIG. 1, each source is either a list of preferences (e.g. a ranked list) or provides data which is converted into a list of preferences. The converting may, for example, process user posts from a social networking site, such as by employing a series of unstructured information management architecture (UIMA) annotators driven off of entity spotting, using Information Extraction (IE) techniques, and/or using natural language mining (NLM) techniques. Depending on the application, an embodiment of the invention may additionally or alternatively request that a source convert the source's data into a list of preferences, instead of converting the data itself. In an example application, Source1 may be sales numbers of wines, Source2 may be an online list generated by wine tasters, and Source3 may be a ranking based on an analysis of various blogs on wines. The analysis may have employed a series of UIMA annotators. Each source expresses or reflects opinions on the same underlying subject matter, phenomenon, or domain of interest.
  • The information 1010 is communicated to each of the vote computing methods (e.g., Vote computing method1, Vote computing method2, etc.). In FIG. 1, each vote computing method is a different combining technique. For example, Vote computing method1 may be a Borda count technique, Vote computing method2 may be an inverted ranking technique, Vote computing method3 may be a round robin technique, and Vote computing method4 may be the technique described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
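  • As an illustration of one such combining technique, the sketch below merges ranked lists with a Borda count. It assumes one common variant of the Borda count (with n distinct items overall, the item at zero-based position p in a list earns n − p points, and an item missing from a list earns nothing from it); the disclosure does not specify which variant is used, and the function name and sample lists are hypothetical.

```python
def borda_count(ranked_lists):
    """Merge ranked lists by summing positional points (one Borda variant)."""
    candidates = {item for lst in ranked_lists for item in lst}
    n = len(candidates)
    points = dict.fromkeys(candidates, 0)
    for lst in ranked_lists:
        for position, item in enumerate(lst):
            points[item] += n - position  # first place earns n points
    # Highest point total first; ties are broken alphabetically.
    return sorted(points, key=lambda item: (-points[item], item))

# Hypothetical rankings from three sources.
merged = borda_count([["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]])
print(merged)
```

  • Here "b" collects 8 points, "a" 6, and "c" 4, so the merged list is b, a, c.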
  • In FIG. 1, the output of each vote computing method is a “final” ranked list, sometimes referred to as a single, merged list. The final list is provided to the SWF, along with the sources' preferences. From one perspective, the SWF is a mathematical criterion for the success of a voting system based on some desired characteristics. Accordingly, in certain embodiments, the SWF is constructed to capture characteristics which are considered valuable. The value may be determined from a business standpoint if business objectives are driving the undertaking. For example, in determining preferences for wine, it may be considered valuable for each source to see at least ½ of its top 10 list in the overall top 10 list. How well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function affects the appropriateness of the actual combined list outputted by embodiments of the invention.
  • Accordingly, the SWF may be selected or custom constructed to fit the situation. In one embodiment, the SWF is selected from among a set of SWFs, e.g., a set including the Precision Optimal Aggregation SWF (Pswf) and the Spearman Footrule SWF (Sswf). The Pswf measures how many items from each source's top-n list are in the “final” ranked list (the single list which merges ranked items from each source). For example, in one application, the Pswf measures how many artists from each source's top-10 list are in an overall top-10 list created using a Borda count technique. One exemplary embodiment uses a Precision Optimal Aggregation SWF defined as:

  • $P_{swf} = \sum_{S} \min(2\,|T_S \cap T|,\ 10),$
      • for top-10 lists $T_S$ for each source $S$ and top-10 list $T$ overall.
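  • The Precision Optimal Aggregation SWF defined above can be computed with a few lines of code. The sketch below is a direct transcription of the formula; the function name, the cap parameter, and the integer item identifiers in the example are assumptions made for illustration.

```python
def precision_optimal_swf(source_top_lists, combined_top, cap=10):
    """Sum over sources of min(2 * |T_S ∩ T|, cap)."""
    combined = set(combined_top)
    return sum(min(2 * len(set(top) & combined), cap) for top in source_top_lists)

# Hypothetical top-10 lists, with items identified by integers.
combined_top = list(range(10))      # overall top-10: items 0..9
source_tops = [
    list(range(10)),                # shares 10 items -> min(20, 10) = 10
    list(range(5, 15)),             # shares 5 items  -> min(10, 10) = 10
    list(range(8, 18)),             # shares 2 items  -> min(4, 10)  = 4
    list(range(20, 30)),            # shares 0 items  -> 0
]
p_score = precision_optimal_swf(source_tops, combined_top)
print(p_score)
```

  • Because of the cap, a source is fully satisfied once half of its top-10 list appears in the overall top-10 list, so additional overlap for one source cannot compensate for another source being left out.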
  • The Spearman Footrule SWF (Sswf) emphasizes preservation of position in the rankings. The Sswf is an approximation of a related SWF Kendall tau distance. The Sswf is less computationally intensive (minutes versus days) relative to the Kendall tau distance. One exemplary embodiment uses a Spearman Footrule SWF defined as:

  • $S_{swf} = \sum_{S} \sum_{a=1}^{10} \max(10 - |r_a - r_{as}|,\ 0).$
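  • A minimal sketch of the Spearman Footrule SWF follows. The formula does not define its symbols in the surrounding text, so the sketch assumes that r_a is the rank of item a in the overall top-10 list and r_as is the rank of the same item in source S's list; it further assumes that an item absent from a source's list contributes nothing for that source. The function name and sample lists are hypothetical.

```python
def spearman_footrule_swf(source_rankings, combined_top, n=10):
    """Sum over sources and combined top-n items of max(n - |r_a - r_as|, 0)."""
    total = 0
    for ranking in source_rankings:
        # 1-based rank of every item in this source's list.
        rank_in_source = {item: r for r, item in enumerate(ranking, start=1)}
        for r_a, item in enumerate(combined_top[:n], start=1):
            r_as = rank_in_source.get(item)
            if r_as is not None:  # assumption: unranked items score zero
                total += max(n - abs(r_a - r_as), 0)
    return total

combined_top = list("abcdefghij")
same_order = list("abcdefghij")          # identical ranking -> 100 points
reverse_order = list("jihgfedcba")       # fully reversed    -> 50 points
s_score = spearman_footrule_swf([same_order, reverse_order], combined_top)
print(s_score)
```

  • Unlike the precision-based score, this score drops when an item appears in the overall list but at a different position, which is why it rewards preservation of rank order.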
  • In use, the SWF takes as input a “final” ranked list and the preferences of each source. The outcome is a score where points are awarded for increased social welfare of a ranking system. In this way, embodiments quantitatively measure the “happiness” of each contributing source with the overall “final” ranking. As shown in FIG. 1, an SWF score is calculated for each vote computing method (e.g., SWF score1, SWF score2, SWF score3, etc.).
  • FIGS. 2A-2D show sample tables of the top-10 artists resulting from merging preference information from various sources using different vote computing methods. In FIGS. 2A-2D, the results of four combination techniques are shown to illustrate how each technique merges the four top-10 lists shown in FIG. 3. In each of FIGS. 2A-2D, a “final” top-10 computed using the corresponding vote counting technique is shown in the first column. Specifically, in FIG. 2A, a “final” top-10 ranking computed using Total Votes is shown; in FIG. 2B, a “final” top-10 ranking computed using Weighted Votes is shown; in FIG. 2C, a “final” top-10 ranking computed using Semi-Proportional methods is shown; and in FIG. 2D, a “final” top-10 ranking computed using Delegates is shown.
  • In each of FIGS. 2A-2D, SWF scores computed using two different SWF (Pswf and Sswf) are shown. The Pswf column shows the contribution of each artist to the overall Precision Optimal Aggregation SWF for that source. The Sswf column shows the contribution of each artist to the overall Spearman Footrule SWF for that source. In the columns labeled Pswf and Sswf, from top to bottom in each of those table cells, the bars correspond to the sources in FIG. 3 in this order: Bebo, LastFM, MySpace, and YouTube. The bars represent how “happy” each source is with the artist being ranked at this position.
  • The graphs in the CS column express the contribution to the combined ranking for the artist from each source. In the columns labeled CS, the bars from left to right in each of those cells correspond to the sources in FIG. 3 in the order: Bebo, LastFM, MySpace, and YouTube. The greater a source's contribution to the combined ranking for the artist, the longer the bar. For example, in FIG. 2A, the rankings shown in the first column was produced by merging the rankings from Bebo, LastFM, MySpace, and YouTube shown in FIG. 3 through simple summation of the votes for each artists. In FIG. 2A, the last bar, corresponding to YouTube, is the longest because YouTube had more data points than the other sources. In this example, the “number of votes” is dominated by YouTube.
  • The bottom of each table in FIGS. 2A-2D shows the total SWF for Pswf and Sswf expressed as a raw score. For Pswf, each source contributes up to 10 points, for a maximum score of 40 (best). For Sswf, each source contributes up to 100 points, for a maximum of 400. The total influence that each source had on the top-10 list is also shown at the bottom of each table, in the last row of the CS column.
  • The examples illustrated by FIGS. 2A-2D can be understood in the context of FIG. 1 as follows. FIGS. 2A-2D illustrate a scenario with four sources: Source1=ranking from Bebo, Source2=ranking from LastFM, Source3=ranking from MySpace, and Source4=ranking from YouTube. FIGS. 2A-2D also illustrate a scenario with four vote counting methods: Vote computing method1=Total Votes (see FIG. 2A), Vote computing method2=Weighted Votes (see FIG. 2B), Vote computing method3=Semi-Proportional (see FIG. 2C), and Vote computing method4=Delegates (see FIG. 2D).
  • FIGS. 2A-2D also illustrate a scenario in which the SWF is a Precision Optimal Aggregation SWF (second column of each table). For each vote counting method, the Precision Optimal Aggregation SWF score is shown at the bottom of the corresponding table: SWF score1=22 (see FIG. 2A), SWF score2=28 (see FIG. 2B), SWF score3=30 (see FIG. 2C), and SWF score4=26 (see FIG. 2D). Table 1 below shows the results converted to percentages based on a maximum score of 40 for the Precision Optimal Aggregation SWF.
  • TABLE 1
    Precision Optimal Aggregation SWF scores
    Vote counting method | Raw Precision Optimal Aggregation SWF score | Percentage
    Total Votes | 22 | 55%
    Weighted Votes | 28 | 70%
    Semi-Proportional | 30 | 75%
    Delegates | 26 | 65%
  • Accordingly, using the Precision Optimal Aggregation SWF, the Semi-Proportional vote counting method is identified among the four as the combining technique that produces a combined list most congruent with the subjective values embodied by the Precision Optimal Aggregation SWF.
  • FIGS. 2A-2D also illustrate a scenario in which the SWF is a Spearman Footrule SWF (third column of each table). For each vote counting method, the Spearman Footrule SWF score is shown at the bottom of each table: SWF score1=149 (see FIG. 2A), SWF score2=153 (see FIG. 2B), SWF score3=146 (see FIG. 2C), and SWF score4=151 (see FIG. 2D). Table 2 below shows the results converted to percentages based on a maximum score of 400 for the Spearman Footrule SWF.
  • TABLE 2
    Spearman Footrule SWF scores
    Vote counting method | Raw Spearman Footrule SWF score | Percentage
    Total Votes | 149 | 37.25%
    Weighted Votes | 153 | 38.25%
    Semi-Proportional | 146 | 36.50%
    Delegates | 151 | 37.75%
  • Accordingly, using the Spearman Footrule SWF, the Weighted Votes vote counting method is identified as the combining technique among the four that produces a combined list most congruent with the subjective values embodied by the Spearman Footrule SWF.
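  • The arithmetic behind Tables 1 and 2 can be reproduced with a short sketch. The raw scores and maxima are the ones stated above; the dictionary layout and the function names `as_percentages` and `best_method` are illustrative choices only.

```python
from typing import Dict

# Raw SWF scores and maximum possible scores, as reported in Tables 1 and 2.
RAW_SCORES = {
    "Precision Optimal Aggregation": {
        "max": 40,
        "scores": {
            "Total Votes": 22,
            "Weighted Votes": 28,
            "Semi-Proportional": 30,
            "Delegates": 26,
        },
    },
    "Spearman Footrule": {
        "max": 400,
        "scores": {
            "Total Votes": 149,
            "Weighted Votes": 153,
            "Semi-Proportional": 146,
            "Delegates": 151,
        },
    },
}


def as_percentages(scores: Dict[str, int], maximum: int) -> Dict[str, float]:
    """Convert raw SWF scores to percentages of the maximum score."""
    return {method: 100.0 * score / maximum for method, score in scores.items()}


def best_method(scores: Dict[str, int]) -> str:
    """Return the vote counting method with the highest SWF score."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for swf_name, table in RAW_SCORES.items():
        percentages = as_percentages(table["scores"], table["max"])
        print(swf_name, percentages, "best:", best_method(table["scores"]))
```

  • The sketch selects Semi-Proportional under the Precision Optimal Aggregation SWF and Weighted Votes under the Spearman Footrule SWF, which shows that the choice of SWF, not only the data, determines which combining technique is preferred.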
  • Accordingly, embodiments of the invention include a method that includes identifying a vote computing method that produces the highest SWF score. The evaluation of which voting computing method is most appropriate for a given set of objectives (e.g., business objectives) is performed by the SWF. The SWF takes the lists of the voters' preferences (the lists from the various sources), along with the outcome of the vote (the combined/consensus list), and produces for each vote computing method a “score” indicating the “satisfaction” in the outcome. The highest score indicates the highest satisfaction. That is, the vote computing method that elevates/accounts for/values those characteristics that are valued by the business objectives (as modeled using the SWF) in an optimal fashion is the vote computing method that gets the highest score from the SWF. Since an embodiment of the invention will output a final combined list based on the SWF, the quality of the output is affected by how well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function.
  • In an exemplary embodiment, to improve the quality of the system or method's output, a large collection of voting methods or combination techniques is enumerated and examined, over multiple time periods of sample data, to identify which voting method or combination technique produces the highest SWF score. An exemplary embodiment examines the results of various “voting” methods using several weeks' or months' worth of data.
  • For parameterized voting techniques (such as the technique described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2)), the parameter(s) may also be optimized to improve the quality of the system. For example, in use, embodiments may determine (e.g., by searching for or computing) a parameter value that is most congruent with enabling the parameterized voting method to output a combined list that reflects the values, e.g., the business objectives.
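  • Searching for a parameter value can be pictured as a plain scan over candidate values, keeping the one whose combined list scores highest under the SWF. The sketch below is only an illustration: `tune_parameter`, the weighted merge, and the toy objective are all hypothetical and are not the parameterized technique of the referenced application.

```python
from typing import Callable, Iterable, List, Tuple


def tune_parameter(
    candidates: Iterable[float],
    combine: Callable[[float], List[str]],
    swf: Callable[[List[str]], float],
) -> Tuple[float, float]:
    """Return (best_value, best_score) over the candidate parameter values."""
    best_value, best_score = None, float("-inf")
    for value in candidates:
        score = swf(combine(value))  # score the list this value produces
        if score > best_score:       # ties keep the earliest candidate
            best_value, best_score = value, score
    if best_value is None:
        raise ValueError("no candidate values supplied")
    return best_value, best_score


# Hypothetical per-source scores for three items.
SCORES_A = {"x": 3, "y": 2, "z": 1}
SCORES_B = {"x": 1, "y": 2, "z": 4}


def weighted_merge(w: float) -> List[str]:
    """Toy parameterized method: weight w on source A, 1 - w on source B."""
    merged = {item: w * SCORES_A[item] + (1 - w) * SCORES_B[item] for item in SCORES_A}
    return sorted(merged, key=lambda item: -merged[item])


def toy_swf(ranking: List[str]) -> int:
    """Toy objective: positions that agree with a hypothetical target order."""
    target = ["x", "y", "z"]
    return sum(1 for got, want in zip(ranking, target) if got == want)
```

  • Calling `tune_parameter([0.0, 0.25, 0.5, 0.75, 1.0], weighted_merge, toy_swf)` scans the five weights in order. A finer grid or a numerical optimizer could replace the scan where the parameter space is large.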
  • Moreover, the characteristics of many sources change over time. Thus, in an exemplary embodiment, even after a vote counting method is established, the congruency of the method to the business objectives (as embodied by the SWF) is revisited periodically (e.g., quarterly) to make sure that changes in the underlying data sources have not reduced the quality of the results. In certain applications, the additional optimization techniques described are also repeated periodically.
  • Thus, an exemplary embodiment of the invention applies voting theory to cross-modality information mashups to construct a combined list ranking preferences. An SWF is used to select from various voting methods based on data from various cross-modality sources. In use, the sources and associated data are dependent on the domain. For example, in the domain of interest of wine, the source or associated data may be results of wine tasting parties, professional reviews (e.g., scores from 1-10 in different categories), sales, change in sales, comments posted by average users, and mentions in mass media.
  • FIG. 4 shows a flow diagram of a method 4000 for combining data on preferences in accordance with an embodiment of the invention. At 4010, a social welfare function (SWF) is received (e.g., by communications interface 66 or from memory or storage, as described below). In an exemplary embodiment, the SWF embodies a business objective. At 4020, sources that provide perspective on opinions on the subject area are identified. The sources may be, for example, sales, comments, etc. At 4030, data from these sources are gathered and normalized to create ranked lists of preferences for each source. At 4040, two or more vote computing methods are identified. At 4050, for each of the two or more vote computing methods, the vote computing method is used to combine data on preferences into a combined list ranking the preferences. The data being combined is from a set of two or more sources. In an exemplary embodiment, the set is heterogeneous in modality. In embodiments in which a delegate allocation vote computing method is used, a preliminary distribution of delegates among the sources is determined. For example, in the example shown in FIG. 2D, the following delegate numbers based roughly on population were distributed to the sources: 300 to Bebo, 500 to LastFM, 1000 to MySpace, and 500 to YouTube.
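  • A delegate-style combination can be sketched with the delegate numbers given for FIG. 2D. How delegates are converted into a ranking is not spelled out in this passage, so the rule below is an assumption: each source awards its delegates to items in proportion to that source's own vote shares, and items are ranked by total delegates. The vote counts in the usage note are invented.

```python
from typing import Dict, List

# Delegate counts stated for the FIG. 2D example.
DELEGATES = {"Bebo": 300, "LastFM": 500, "MySpace": 1000, "YouTube": 500}


def delegate_scores(
    votes_by_source: Dict[str, Dict[str, int]],
    delegates: Dict[str, int],
) -> Dict[str, float]:
    """Assumed rule: split each source's delegates by its internal vote shares."""
    combined: Dict[str, float] = {}
    for source, votes in votes_by_source.items():
        total = sum(votes.values())
        if total == 0:
            continue  # a source with no votes awards no delegates
        for item, count in votes.items():
            awarded = delegates[source] * count / total
            combined[item] = combined.get(item, 0.0) + awarded
    return combined


def delegate_ranking(
    votes_by_source: Dict[str, Dict[str, int]],
    delegates: Dict[str, int],
    top_n: int = 10,
) -> List[str]:
    """Rank items by the total number of delegates they receive."""
    scores = delegate_scores(votes_by_source, delegates)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

  • Because each source's influence is capped by its delegate count, a source reporting millions of raw data points cannot outweigh the others simply by volume, in contrast to simple vote summation.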
  • At 4060, for each combined list, the combined list is inputted into the SWF to compute a score. The score indicates congruency between the combined list and value(s) embodied by the SWF, e.g., a business objective. At 4070, the combined list of the vote computing method associated with the highest score is outputted. In one embodiment, additionally or alternatively, the vote computing method associated with the highest score is outputted.
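  • Steps 4050 through 4070 amount to a loop over the candidate vote computing methods followed by a maximum over the resulting SWF scores. A minimal sketch, assuming each method is a function from the per-source ranked lists to a combined list and the SWF is a function of the source lists and the combined list; the type aliases and the name `select_best` are hypothetical:

```python
from typing import Callable, Dict, List, Tuple

Ranking = List[str]
Sources = Dict[str, Ranking]
VoteMethod = Callable[[Sources], Ranking]
SWF = Callable[[Sources, Ranking], float]


def overlap_swf(sources: Sources, combined: Ranking) -> float:
    """Toy SWF: total number of items each source shares with the combined list."""
    return float(sum(len(set(r) & set(combined)) for r in sources.values()))


def select_best(
    sources: Sources,
    methods: Dict[str, VoteMethod],
    swf: SWF,
) -> Tuple[str, Ranking, float]:
    """Return (method name, combined list, score) for the highest-scoring method."""
    if not methods:
        raise ValueError("at least one vote computing method is required")
    results = []
    for name, method in methods.items():
        combined = method(sources)      # step 4050: build the combined list
        score = swf(sources, combined)  # step 4060: score it with the SWF
        results.append((score, name, combined))
    best_score, best_name, best_list = max(results, key=lambda r: r[0])
    return best_name, best_list, best_score  # step 4070: output the winner
```

  • Returning both the method name and its combined list covers the two output variants described above. On ties, `max` keeps the first method encountered; an embodiment could apply a different tie-breaking rule.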
  • FIG. 5 shows a flow diagram of a method 5000 for combining data on preferences in accordance with an embodiment of the invention. At 5010, a social welfare function (SWF) is created. In an exemplary embodiment, the SWF defines business objectives. At 5020, two or more vote computing methods are identified. At 5030, for each of the two or more vote computing methods, the vote computing method is used to combine data on preferences into a combined list ranking the preferences. The data is from a set of two or more sources. In an exemplary embodiment, the set is heterogeneous in modality. At 5040, for each combined list, the combined list is inputted into the SWF to compute a score. The score indicates congruency between the combined list and, for example, the business objective. At 5050, the combined list of the vote computing method associated with the highest score is outputted.
  • Although labeled with the numbers above, it should be understood that embodiments of this invention may execute the method 4000 and/or the method 5000 in a non-sequential order as appropriate and still remain in accordance with the invention. For example, although numbered 4010, embodiments of the present invention may receive the social welfare function before, during, or after 4020, 4030, 4040, and/or 4050. Similarly, although numbered 5010, embodiments of the present invention may create the social welfare function before, during, or after 5020 and/or 5030.
  • Moreover, while ranked lists are described in detail herein, in other embodiments, the sources provide additional information on preference (such as total numbers) for input into voting systems that can make use of such additional information.
  • Embodiments of the invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, and microcode.
  • Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • The medium can be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device). Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, or an optical disk. Current examples of optical disks include compact disk-read-only memory (CD-ROM), compact disk-read/write (CD-R/W), and DVD.
  • A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
  • Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices) can be coupled to the system either directly or through intervening I/O controllers.
  • Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
  • FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention. The computer system includes one or more processors, such as processor 44. The processor 44 is connected to a communication infrastructure 46 (e.g., a communications bus, cross-over bar, or network). Various software embodiments are described in terms of this exemplary computer system. After reading this description, it will become apparent to a person of ordinary skill in the relevant art(s) how to implement the invention using other computer systems and/or computer architectures.
  • The computer system can include a display interface 48 that forwards graphics, text, and other data from the communication infrastructure 46 (or from a frame buffer not shown) for display on a display unit 50. The computer system also includes a main memory 52, preferably random access memory (RAM), and may also include a secondary memory 54. The secondary memory 54 may include, for example, a hard disk drive 56 and/or a removable storage drive 58, representing, for example, a floppy disk drive, a magnetic tape drive, or an optical disk drive. The removable storage drive 58 reads from and/or writes to a removable storage unit 60 in a manner well known to those having ordinary skill in the art. Removable storage unit 60 represents, for example, a floppy disk, a compact disc, a magnetic tape, or an optical disk, etc. which is read by and written to by removable storage drive 58. As will be appreciated, the removable storage unit 60 includes a computer readable medium having stored therein computer software and/or data.
  • In alternative embodiments, the secondary memory 54 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system. Such means may include, for example, a removable storage unit 62 and an interface 64. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 62 and interfaces 64 which allow software and data to be transferred from the removable storage unit 62 to the computer system.
  • The computer system may also include a communications interface 66. Communications interface 66 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 66 may include a modem, a network interface (such as an Ethernet card), a communications port, or a PCMCIA slot and card, etc. Software and data transferred via communications interface 66 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 66. These signals are provided to communications interface 66 via a communications path (i.e., channel) 68. This channel 68 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
  • In this document, the terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 52 and secondary memory 54, removable storage drive 58, and a hard disk installed in hard disk drive 56.
  • Computer programs (also called computer control logic) are stored in main memory 52 and/or secondary memory 54. Computer programs may also be received via communications interface 66. Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 44 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.
  • FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented. Distributed data processing system 100 may include a network of computers in which aspects of the illustrative embodiments may be implemented. The distributed data processing system 100 contains at least one network 102, which is the medium used to provide communication links between various devices and computers connected together within distributed data processing system 100. The network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
  • In the depicted example, server 104 and server 106 are connected to network 102 along with storage unit 108. In addition, clients 110, 112, and 114 are also connected to network 102. These clients 110, 112, and 114 may be, for example, personal computers, network computers, or the like. In the depicted example, server 104 provides data, such as boot files, operating system images, and applications to clients 110, 112, and 114. Clients 110, 112, and 114 are clients to server 104 in the depicted example. Distributed data processing system 100 may include additional servers, clients, and other devices not shown.
  • In the depicted example, distributed data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, governmental, educational and other computer systems that route data and messages. Of course, the distributed data processing system 100 may also be implemented to include a number of different types of networks, such as for example, an intranet, a local area network (LAN), a wide area network (WAN), or the like. As stated above, FIG. 7 is intended as an example, not as an architectural limitation for different embodiments of the present invention, and therefore, the particular elements shown in FIG. 7 should not be considered limiting with regard to the environments in which the illustrative embodiments of the present invention may be implemented.
  • In one use, as an example, clients 110 and 112 collect information (e.g., from user input) and provide it to server 104. Server 104 stores the information in storage 108. Server 106 contains hardware devices and software tools to combine the information (e.g., into information mashups and/or combined/consensus lists) according to the present invention. Server 106 transmits the combined information to server 104 and/or clients 110, 112, and/or 114, for example.
  • In use, client 114 may provide the server with business requirements embodied in an SWF. The server determines the best vote counting method to use for that particular application based on the SWF and the information stored, e.g., in storage 108. The server may transmit an identification of the vote counting method to the client.
  • References in the claims to an element in the singular are not intended to mean “one and only” unless explicitly so stated, but rather “one or more.” All structural and functional equivalents to the elements of the above-described exemplary embodiment that are currently known or later come to be known to those of ordinary skill in the art are intended to be encompassed by the present claims. No claim element herein is to be construed under the provisions of 35 U.S.C. section 112, sixth paragraph, unless the element is expressly recited using the phrase “means for” or “step for.”
  • Thus, a system and method for determining preferences from information mashups and, in particular, for determining preferences from cross-modality information mashups based on a social welfare function is disclosed. While the preferred embodiments of the present invention have been described, it will be understood that modifications and adaptations to the embodiments shown may occur to one of ordinary skill in the art without departing from the scope of the present invention as set forth in the claims. Thus, the scope of this invention is to be construed according to the claims and not limited by the specific details disclosed in the exemplary embodiments.

Claims (20)

1. A computer-implemented method for determining preferences from cross-modality information mashups, the method comprising:
receiving a social welfare function (SWF);
identifying two or more vote computing methods;
for each of the two or more vote computing methods, using the vote computing method to combine information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
for each combined list, inputting the combined list into the SWF to compute a score; and
outputting the combined list of the vote computing method associated with the highest score.
2. The method of claim 1, wherein the set of two or more sources includes data from websites indicating preferences within a certain domain of interest.
3. The method of claim 1, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
4. The method of claim 3, further comprising processing the unstructured data using natural language mining.
5. The method of claim 1, wherein receiving the SWF comprises receiving a custom constructed SWF based on a set of business objectives.
6. The method of claim 1, wherein the two or more vote computing methods include a parameterized vote computing method.
7. The method of claim 1, wherein the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
8. A computer program product for determining preferences from cross-modality information, said computer program product comprising:
a computer readable medium;
first program instructions stored on the computer readable medium, the first program instructions to identify two or more vote computing methods;
second program instructions stored on the computer readable medium, the second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
third program instructions stored on the computer readable medium, the third program instructions to, for each combined list, input the combined list into a social welfare function to compute a score; and
fourth program instructions stored on the computer readable medium, the fourth program instructions to output the vote computing method associated with the highest score.
9. The computer program product of claim 8, further comprising
fifth program instructions stored on the computer readable medium, the fifth program instructions to output the combined list of the vote computing method associated with the highest score.
10. The computer program product of claim 8, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
11. The computer program product of claim 10, wherein the second source is selected from the group consisting of: an online blog, an online forum, and an online social networking website.
12. The computer program product of claim 8, wherein the social welfare function is selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
13. The computer program product of claim 8, wherein the social welfare function is a custom constructed social welfare function.
14. The computer program product of claim 8, wherein the two or more vote computing methods include a parameterized vote computing method.
15. The computer program product of claim 8, wherein the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
16. A system for determining preferences from cross-modality information, the system comprising:
a communications interface;
memory storing computer usable program code; and
a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory; wherein the computer usable program code comprises:
computer usable program code configured to identify two or more vote computing methods;
computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and
computer usable program code configured to identify the vote computing method associated with the highest score.
17. The system of claim 16, wherein the computer usable program code further comprises:
computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
18. The system of claim 16, wherein the set of two or more sources includes data from websites indicating preferences within a certain domain of interest.
19. The system of claim 16, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
20. The system of claim 16, wherein the processor is coupled to the communications interface to receive the information on preferences from a server.
US12/195,126 2008-03-31 2008-08-20 System and method for determining preferences from information mashups Abandoned US20090248690A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/195,126 US20090248690A1 (en) 2008-03-31 2008-08-20 System and method for determining preferences from information mashups

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US4112808P 2008-03-31 2008-03-31
US12/195,126 US20090248690A1 (en) 2008-03-31 2008-08-20 System and method for determining preferences from information mashups

Publications (1)

Publication Number Publication Date
US20090248690A1 true US20090248690A1 (en) 2009-10-01

Family

ID=41118612

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/195,098 Active 2030-10-23 US8417694B2 (en) 2008-03-31 2008-08-20 System and method for constructing targeted ranking from multiple information sources
US12/195,126 Abandoned US20090248690A1 (en) 2008-03-31 2008-08-20 System and method for determining preferences from information mashups

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US12/195,098 Active 2030-10-23 US8417694B2 (en) 2008-03-31 2008-08-20 System and method for constructing targeted ranking from multiple information sources

Country Status (1)

Country Link
US (2) US8417694B2 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110060628A1 (en) * 2009-09-03 2011-03-10 Olaf STOERMER Method for assessing candidates by voting and a system intended for this purpose and a program product comprising a computer-readable medium
US20130035988A1 (en) * 2007-12-14 2013-02-07 John Nicholas And Kristin Gross Trust U/A/D April 13, 2010 Integrated Gourmet Item Data Collection, Recommender and Vending System and Method
US20160179849A1 (en) * 2014-12-22 2016-06-23 Verizon Patent And Licensing Inc. Machine to machine data aggregator
CN109828460A (en) * 2019-01-21 2019-05-31 南京理工大学 A kind of consistent control method of output for two-way heterogeneous multi-agent system
CN110390535A (en) * 2019-06-25 2019-10-29 阿里巴巴集团控股有限公司 Customer complaint object determines method, apparatus, electronic equipment and readable storage medium storing program for executing
CN113886723A (en) * 2021-09-09 2022-01-04 盐城金堤科技有限公司 Method and device for determining sequencing stability, storage medium and electronic equipment

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8234147B2 (en) * 2009-05-15 2012-07-31 Microsoft Corporation Multi-variable product rank
US20110219607A1 (en) * 2010-03-12 2011-09-15 Nanjundaswamy Kirakodu S Cathode active materials and method of making thereof
US8352406B2 (en) 2011-02-01 2013-01-08 Bullhorn, Inc. Methods and systems for predicting job seeking behavior
CN103377240B (en) * 2012-04-26 2017-03-01 阿里巴巴集团控股有限公司 Information providing method, processing server and merging server
US20140181085A1 (en) 2012-12-21 2014-06-26 Commvault Systems, Inc. Data storage system for analysis of data across heterogeneous information management systems
US9021452B2 (en) 2012-12-27 2015-04-28 Commvault Systems, Inc. Automatic identification of storage requirements, such as for use in selling data storage management solutions
KR20150128238A (en) * 2014-05-09 2015-11-18 삼성전자주식회사 server, control method thereof and system for producing ranking of serach terms whose popularity increases rapidly
US9760446B2 (en) * 2014-06-11 2017-09-12 Micron Technology, Inc. Conveying value of implementing an integrated data management and protection system
US10324914B2 (en) 2015-05-20 2019-06-18 Commvalut Systems, Inc. Handling user queries against production and archive storage systems, such as for enterprise customers having large and/or numerous files
US20160364426A1 (en) * 2015-06-11 2016-12-15 Sap Se Maintenance of tags assigned to artifacts
US10949308B2 (en) 2017-03-15 2021-03-16 Commvault Systems, Inc. Application aware backup of virtual machines
US11032350B2 (en) 2017-03-15 2021-06-08 Commvault Systems, Inc. Remote commands framework to control clients
US11010261B2 (en) 2017-03-31 2021-05-18 Commvault Systems, Inc. Dynamically allocating streams during restoration of data
US10909125B2 (en) * 2018-05-22 2021-02-02 Salesforce.Com, Inc. Asymmetric rank-biased overlap

Citations (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6026426A (en) * 1996-04-30 2000-02-15 International Business Machines Corporation Application programming interface unifying multiple mechanisms
US20020103695A1 (en) * 1998-04-16 2002-08-01 Arnold B. Urken Methods and apparatus for gauging group choices
US6718324B2 (en) * 2000-01-14 2004-04-06 International Business Machines Corporation Metadata search results ranking system
US6728704B2 (en) * 2001-08-27 2004-04-27 Verity, Inc. Method and apparatus for merging result lists from multiple search engines
US6763338B2 (en) * 2002-04-05 2004-07-13 Hewlett-Packard Development Company, L.P. Machine decisions based on preferential voting techniques
US6795820B2 (en) * 2001-06-20 2004-09-21 Nextpage, Inc. Metasearch technique that ranks documents obtained from multiple collections
US20040193698A1 (en) * 2003-03-24 2004-09-30 Sadasivuni Lakshminarayana Method for finding convergence of ranking of web page
US20040199419A1 (en) * 2001-11-13 2004-10-07 International Business Machines Corporation Promoting strategic documents by bias ranking of search results on a web browser
US20050097067A1 (en) * 2003-10-29 2005-05-05 Kirshenbaum Evan R. System and method for combining valuations of multiple evaluators
US6901409B2 (en) * 2001-01-17 2005-05-31 International Business Machines Corporation Mapping data from multiple data sources into a single software component
US20050210025A1 (en) * 2004-03-17 2005-09-22 Dalton Michael E System and method for predicting the ranking of items
US20050246322A1 (en) * 2004-04-30 2005-11-03 Shanmugasundaram Ravikumar On the role of market economics in ranking search results
US20050286772A1 (en) * 2004-06-24 2005-12-29 Lockheed Martin Corporation Multiple classifier system with voting arbitration
US20060074290A1 (en) * 2004-10-04 2006-04-06 Banner Health Methodologies linking patterns from multi-modality datasets
US20060184483A1 (en) * 2005-01-12 2006-08-17 Douglas Clark Predictive analytic method and apparatus
US20060190425A1 (en) * 2005-02-24 2006-08-24 Yuan-Chi Chang Method for merging multiple ranked lists with bounded memory
US7107263B2 (en) * 2000-12-08 2006-09-12 Netrics.Com, Inc. Multistage intelligent database search method
US20070016574A1 (en) * 2005-07-14 2007-01-18 International Business Machines Corporation Merging of results in distributed information retrieval
US20070038625A1 (en) * 1999-05-05 2007-02-15 West Publishing Company Document-classification system, method and software
US7188106B2 (en) * 2001-05-01 2007-03-06 International Business Machines Corporation System and method for aggregating ranking results from various sources to improve the results of web searching
US20070112720A1 (en) * 2005-11-14 2007-05-17 Microsoft Corporation Two stage search
US7242810B2 (en) * 2004-05-13 2007-07-10 Proximex Corporation Multimodal high-dimensional data fusion for classification and identification
US7257577B2 (en) * 2004-05-07 2007-08-14 International Business Machines Corporation System, method and service for ranking search results using a modular scoring system
US20070250784A1 (en) * 2006-03-14 2007-10-25 Workstone Llc Methods and apparatus to combine data from multiple computer systems for display in a computerized organizer
US20080027913A1 (en) * 2006-07-25 2008-01-31 Yahoo! Inc. System and method of information retrieval engine evaluation using human judgment input
US7689615B2 (en) * 2005-02-25 2010-03-30 Microsoft Corporation Ranking results using multiple nested ranking
US20100082639A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Processing maximum likelihood for listwise rankings
US7716202B2 (en) * 2003-06-27 2010-05-11 At&T Intellectual Property I, L.P. Determining a weighted relevance value for each search result based on the estimated relevance value when an actual relevance value was not received for the search result from one of the plurality of search engines
US20100281023A1 (en) * 2007-06-29 2010-11-04 Emc Corporation Relevancy scoring using query structure and data structure for federated search
US8260787B2 (en) * 2007-06-29 2012-09-04 Amazon Technologies, Inc. Recommendation system with multiple integrated recommenders

Patent Citations (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6026426A (en) * 1996-04-30 2000-02-15 International Business Machines Corporation Application programming interface unifying multiple mechanisms
US20020103695A1 (en) * 1998-04-16 2002-08-01 Arnold B. Urken Methods and apparatus for gauging group choices
US20070038625A1 (en) * 1999-05-05 2007-02-15 West Publishing Company Document-classification system, method and software
US6718324B2 (en) * 2000-01-14 2004-04-06 International Business Machines Corporation Metadata search results ranking system
US7107263B2 (en) * 2000-12-08 2006-09-12 Netrics.Com, Inc. Multistage intelligent database search method
US6901409B2 (en) * 2001-01-17 2005-05-31 International Business Machines Corporation Mapping data from multiple data sources into a single software component
US7188106B2 (en) * 2001-05-01 2007-03-06 International Business Machines Corporation System and method for aggregating ranking results from various sources to improve the results of web searching
US6795820B2 (en) * 2001-06-20 2004-09-21 Nextpage, Inc. Metasearch technique that ranks documents obtained from multiple collections
US6728704B2 (en) * 2001-08-27 2004-04-27 Verity, Inc. Method and apparatus for merging result lists from multiple search engines
US20040199419A1 (en) * 2001-11-13 2004-10-07 International Business Machines Corporation Promoting strategic documents by bias ranking of search results on a web browser
US6763338B2 (en) * 2002-04-05 2004-07-13 Hewlett-Packard Development Company, L.P. Machine decisions based on preferential voting techniques
US20040193698A1 (en) * 2003-03-24 2004-09-30 Sadasivuni Lakshminarayana Method for finding convergence of ranking of web page
US20100153357A1 (en) * 2003-06-27 2010-06-17 At&T Intellectual Property I, L.P. Rank-based estimate of relevance values
US7716202B2 (en) * 2003-06-27 2010-05-11 At&T Intellectual Property I, L.P. Determining a weighted relevance value for each search result based on the estimated relevance value when an actual relevance value was not received for the search result from one of the plurality of search engines
US20050097067A1 (en) * 2003-10-29 2005-05-05 Kirshenbaum Evan R. System and method for combining valuations of multiple evaluators
US20050210025A1 (en) * 2004-03-17 2005-09-22 Dalton Michael E System and method for predicting the ranking of items
US20050246322A1 (en) * 2004-04-30 2005-11-03 Shanmugasundaram Ravikumar On the role of market economics in ranking search results
US7257577B2 (en) * 2004-05-07 2007-08-14 International Business Machines Corporation System, method and service for ranking search results using a modular scoring system
US7242810B2 (en) * 2004-05-13 2007-07-10 Proximex Corporation Multimodal high-dimensional data fusion for classification and identification
US20050286772A1 (en) * 2004-06-24 2005-12-29 Lockheed Martin Corporation Multiple classifier system with voting arbitration
US20060074290A1 (en) * 2004-10-04 2006-04-06 Banner Health Methodologies linking patterns from multi-modality datasets
US20060184483A1 (en) * 2005-01-12 2006-08-17 Douglas Clark Predictive analytic method and apparatus
US20060190425A1 (en) * 2005-02-24 2006-08-24 Yuan-Chi Chang Method for merging multiple ranked lists with bounded memory
US7689615B2 (en) * 2005-02-25 2010-03-30 Microsoft Corporation Ranking results using multiple nested ranking
US20070016574A1 (en) * 2005-07-14 2007-01-18 International Business Machines Corporation Merging of results in distributed information retrieval
US20070112720A1 (en) * 2005-11-14 2007-05-17 Microsoft Corporation Two stage search
US20070250784A1 (en) * 2006-03-14 2007-10-25 Workstone Llc Methods and apparatus to combine data from multiple computer systems for display in a computerized organizer
US20080027913A1 (en) * 2006-07-25 2008-01-31 Yahoo! Inc. System and method of information retrieval engine evaluation using human judgment input
US20100281023A1 (en) * 2007-06-29 2010-11-04 Emc Corporation Relevancy scoring using query structure and data structure for federated search
US8260787B2 (en) * 2007-06-29 2012-09-04 Amazon Technologies, Inc. Recommendation system with multiple integrated recommenders
US20100082639A1 (en) * 2008-09-30 2010-04-01 Microsoft Corporation Processing maximum likelihood for listwise rankings

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130035988A1 (en) * 2007-12-14 2013-02-07 John Nicholas And Kristin Gross Trust U/A/D April 13, 2010 Integrated Gourmet Item Data Collection, Recommender and Vending System and Method
US8671012B2 (en) * 2007-12-14 2014-03-11 John Nicholas and Kristin Gross Methods and systems for promoting items based on event sampling data
US9037515B2 (en) 2007-12-14 2015-05-19 John Nicholas and Kristin Gross Social networking websites and systems for publishing sampling event data
US10482484B2 (en) 2007-12-14 2019-11-19 John Nicholas And Kristin Gross Trust U/A/D April 13, 2010 Item data collection systems and methods with social network integration
US20110060628A1 (en) * 2009-09-03 2011-03-10 Olaf STOERMER Method for assessing candidates by voting and a system intended for this purpose and a program product comprising a computer-readable medium
US20160179849A1 (en) * 2014-12-22 2016-06-23 Verizon Patent And Licensing Inc. Machine to machine data aggregator
US10275476B2 (en) * 2014-12-22 2019-04-30 Verizon Patent And Licensing Inc. Machine to machine data aggregator
CN109828460A (en) * 2019-01-21 2019-05-31 南京理工大学 A kind of consistent control method of output for two-way heterogeneous multi-agent system
CN110390535A (en) * 2019-06-25 2019-10-29 阿里巴巴集团控股有限公司 Customer complaint object determines method, apparatus, electronic equipment and readable storage medium storing program for executing
CN113886723A (en) * 2021-09-09 2022-01-04 盐城金堤科技有限公司 Method and device for determining sequencing stability, storage medium and electronic equipment

Also Published As

Publication number Publication date
US8417694B2 (en) 2013-04-09
US20090248614A1 (en) 2009-10-01

Similar Documents

Publication Publication Date Title
US20090248690A1 (en) System and method for determining preferences from information mashups
Le et al. Information sought by prospective students from social media electronic word-of-mouth during the university choice process
Oliveira et al. Can social media reveal the preferences of voters? A comparison between sentiment analysis and traditional opinion polls
Reschke et al. Status spillovers: The effect of status-conferring prizes on the allocation of attention
Holmberg Altmetrics for information professionals: Past, present and future
US10567182B1 (en) Revealing connections for persons in a social graph
Duverger Curvilinear effects of user-generated content on hotels’ market share: a dynamic panel-data analysis
Berezina et al. Understanding satisfied and dissatisfied hotel customers: text mining of online hotel reviews
Hadani et al. Complementary relationships between corporate philanthropy and corporate political activity: An exploratory study of political marketplace contingencies
Malthouse et al. Managing customer relationships in the social media era: Introducing the social CRM house
Mellet et al. A ‘democratization’ of markets? Online consumer reviews in the restaurant industry
Yang et al. The changing public sphere on Twitter: Network structure, elites and topics of the #righttobeforgotten
Liket et al. Nonprofit organizational effectiveness: Analysis of best practices
Kwok et al. Spreading social media messages on Facebook: An analysis of restaurant business-to-consumer communications
Fichman A comparative assessment of answer quality on four question answering sites
Doloreux et al. Exploring and comparing innovation patterns across different knowledge intensive business services
Chalmers With a lot of help from their friends: Explaining the social logic of informational lobbying in the European Union
Loukis et al. Active and passive crowdsourcing in government
Phillips et al. The influence of geographic and psychic distance on online hotel ratings
Panda et al. Modelling the relationship between information technology infrastructure and organizational agility: A study in the context of India
Abdul‐Aziz et al. Exploring the internationalization of Malaysian contractors: the international entrepreneurship dimension
Jerrim et al. Are peer reviews of grant proposals reliable? An analysis of Economic and Social Research Council (ESRC) funding applications
Huang et al. The effect of online and offline word-of-mouth on new product diffusion
Yarış et al. The impact of social media use on restaurant choice
Nam et al. Can web ecology provide a clearer understanding of people’s information behavior during election campaigns?

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BHAGWAN, VARUN;GRANDISON, TYRONE WILBERFORCE ANDRE;GRUHL, DANIEL FREDERICK;AND OTHERS;REEL/FRAME:021758/0836;SIGNING DATES FROM 20080819 TO 20080820

STCB Information on status: application discontinuation

Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION