US20090248690A1 - System and method for determining preferences from information mashups - Google Patents
- Publication number
- US20090248690A1 (application Ser. No. 12/195,126)
- Authority
- US
- United States
- Prior art keywords
- vote
- preferences
- information
- swf
- computing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Definitions
- the present invention relates to information mashups, and in particular to a system and method for determining preferences from cross-modality information mashups.
- a computer-implemented method for determining preferences from cross-modality information mashups includes receiving a social welfare function (SWF) and identifying two or more vote computing methods. For each of the two or more vote computing methods, the method uses the vote computing method to combine information on preferences into a combined list ranking the preferences.
- the information is from a set of two or more sources. The set is heterogeneous in modality.
- the method inputs the combined list into the SWF to compute a score.
- the method outputs the combined list of the vote computing method associated with the highest score.
- the set of two or more sources may include data from websites indicating preferences within a certain domain of interest.
- the information from the set of two or more sources may include structured data from a first source and unstructured data from a second source.
- the number of preferences being ranked may be at least an order of magnitude more in number than the number of sources.
- a computer program product for determining preferences from cross-modality information includes a computer readable medium and program instructions.
- the program instructions include first program instructions to identify two or more vote computing methods and second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences.
- the information is from a set of two or more sources.
- the set is heterogeneous in modality.
- the program instructions further include third program instructions to compute a score, and fourth program instructions to output the vote computing method associated with the highest score.
- the program instructions may also include fifth program instructions to output the combined list of the vote computing method associated with the highest score.
- the two or more sources may include an online blog, an online forum, and/or an online social networking website.
- the social welfare function may be selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
- a system for determining preferences from cross-modality information includes a communications interface, memory storing computer usable program code; and a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory.
- the computer usable program code includes computer usable program code configured to identify two or more vote computing methods; computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality; computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and computer usable program code configured to identify the vote computing method associated with the highest score.
- the computer usable program code may further include computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
- FIG. 1 shows a method of combining data on preferences according to one embodiment of the present invention.
- FIGS. 2A-2D show sample tables of the top-10 artists that result from merging preference information from various sources using different vote computing methods.
- FIG. 3 shows a table of four top-10 lists.
- FIG. 4 shows a flow diagram of a method for combining data on preferences in accordance with an embodiment of the invention.
- FIG. 5 shows a flow diagram of another method for combining data on preferences in accordance with an embodiment of the invention.
- FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention.
- FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented.
- the present invention provides a system and method for determining preferences from information mashups.
- An information mashup combines or mixes information or data from a multitude of often-conflicting sources into a single representation. For example, for any given domain of interest, opinions can be expressed in many places and collected by many sources. Online sources for people's opinion on a wide range of topics include, for example, blogs, discussion forums and social networking sites.
- Embodiments of the invention combine information gathered from across different sources, including in one application various online sources, to form a unified, focused view of a community's interests regarding that domain.
- An exemplary embodiment of the invention determines preferences from cross-modality information mashups.
- systems compare things with identical modalities, such as number of sales from different sources.
- domains of interests e.g., patient preferences, drugs for certain medical conditions, cars, wine, financial products (stocks, bonds, etc.), consumer goods, cameras, computers, books, etc.
- information is available from many different modalities (e.g., comments, passive listens, sales, hits on a website, creation of new website, views on television, etc.).
- An exemplary embodiment of the invention determines preferences from information mashups constructed from information and data from a set of sources heterogeneous in modality. For example, say we want to combine different on-line data to generate a list of wines.
- One source of preferences may be generated from sales numbers of wines.
- Another source may be a list generated from wine tasters.
- Yet another may be generated by professionals at a wine magazine.
- Yet another may be generated from counts of comments users post on a wine aficionado site. There are many more sales of wines than posts on a website.
- Plurality type voting systems are those that add together the number of votes from each source and simply adjudicate the winner based on whomever or whichever candidate has the most votes.
- Plurality type voting systems include systems in which votes are weighted.
- plurality type voting systems have deficiencies when combining information gathered from multiple sources with differing modalities. This can occur, for example, when there are large differences in the numbers returned by sources or when the values measured to derive those numbers indicate very different things.
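By way of non-limiting illustration, the deficiency noted above can be sketched in a few lines of Python. The source names, item names, and counts below are hypothetical and are not taken from the figures; the sketch only shows how a simple summation of votes lets the source with the largest raw numbers decide the outcome.

```python
from collections import Counter

def total_votes(sources):
    """Sum raw counts per item across all sources and rank by the total."""
    totals = Counter()
    for counts in sources.values():
        totals.update(counts)  # adds each source's counts to the running totals
    # Sort by total (descending), then by name so the order is deterministic.
    return sorted(totals, key=lambda item: (-totals[item], item))

# Hypothetical data: sales run in the thousands, expert picks in single digits.
sources = {
    "sales":   {"wine_a": 9000, "wine_b": 7000, "wine_c": 100},
    "tasters": {"wine_c": 9, "wine_b": 5, "wine_a": 1},
    "critics": {"wine_c": 8, "wine_b": 6, "wine_a": 2},
}

combined = total_votes(sources)
print(combined)  # ['wine_a', 'wine_b', 'wine_c']
```

In this made-up data, two of the three sources prefer wine_c, yet the combined list simply reproduces the order of the sales source.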
- SWF social welfare function
- a SWF might describe, for example, the preferences of an individual over social states, or might describe, as another example, outcomes of an allocation process, whether or not individuals had preferences over those outcomes.
- examples of SWFs are the Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
- a method is supplied for embodying subjectiveness, such as those described above, into one function.
- embodiments of the invention can capture, for example, business goals in a semi-heuristic way, objectively evaluate various preference combination techniques, and identify which of the combinations techniques to use in a specific instance.
- the combination techniques include techniques that originate from vote computing or vote counting systems, such as a Borda count method or the Nauru method.
- Embodiments of the invention may supplement or modify a vote computing or vote counting technique depending on whether the original information expressing the preferences is, for example, structured or unstructured, numerical or textual, etc.
- the combination technique used is as described in co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2), and filed on ______. Accordingly, the system and method for determining preferences from information mashups described in detail herein complements the system and method described in the co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
- the system or method described in detail herein may be used in conjunction with the system or method described in detail in the co-pending application.
- the first system and method may be used separately from the latter
- the SWF takes as input a “final” ranked list generated from each of the various vote counting/computing methods and/or systems, and the preferences of each source.
- the “final” ranked list may be generated using, for example, weighted voting systems, semi-proportional methods, delegates, Borda Count, inverted rank, run off, round robin, and/or a ranking method described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
- the SWF outputs a number that indicates how happy or satisfied the “society” of sources is with the results.
- multiple methods of combining are examined and evaluated, and the combining method that returns the highest SWF value is considered the “best” method. That combining method is then established as the combining method that application will use when determining preferences from future mashups combining information for those sources for those business purposes, for example. As discussed below, reevaluation of the combining method may be done periodically to optimize the quality of the results.
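The selection step described above may be sketched as follows. This is a minimal Python sketch under stated assumptions, not the claimed implementation: the function name `select_best_method`, the two toy combining methods, and the toy SWF are all hypothetical placeholders introduced only to make the example runnable.

```python
from typing import Callable, Dict, List, Tuple

Ranking = List[str]
Sources = Dict[str, Ranking]

def select_best_method(
    sources: Sources,
    methods: Dict[str, Callable[[Sources], Ranking]],
    swf: Callable[[Ranking, Sources], float],
) -> Tuple[str, Ranking, float]:
    """Run every combining method, score each combined list, keep the best."""
    best_name, best_list, best_score = None, None, float("-inf")
    for name, method in methods.items():
        combined = method(sources)
        score = swf(combined, sources)
        if score > best_score:  # on a tie, the first method examined wins
            best_name, best_list, best_score = name, combined, score
    return best_name, best_list, best_score

# Hypothetical placeholder methods and SWF, for illustration only.
def first_source_order(sources):
    return list(next(iter(sources.values())))

def alphabetical(sources):
    return sorted({item for ranking in sources.values() for item in ranking})

def top_item_agreement(combined, sources):
    # Counts how many sources see their own first choice ranked first.
    return float(sum(1 for r in sources.values() if r[0] == combined[0]))

sources = {"s1": ["b", "a", "c"], "s2": ["a", "b", "c"], "s3": ["a", "c", "b"]}
methods = {"first_source_order": first_source_order, "alphabetical": alphabetical}

name, combined, score = select_best_method(sources, methods, top_item_agreement)
print(name, combined, score)  # alphabetical ['a', 'b', 'c'] 2.0
```

The method name returned here is what an application would record as the combining method to reuse for later mashups of the same sources.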
- the present disclosure differs from traditional work in the field in several ways.
- the disclosure addresses situations where, as noted, people are providing preferences in non-uniform ways (complaints, purchase, opinion posted, time, etc.).
- ad hoc weights don't work well because ad hoc weights can only adjust for the deficiencies that exist at a single point in time.
- Amazon.com ranks should be weighted lower (having less weight in the overall scheme of the analysis) than Barnes & Noble because Amazon.com opened its online store in July 1995. If the top-10 lists were compared today, the weights would differ.
- while ad hoc weights are useful when combining lists of preferences at one point in time, they need adjusting each time new data from the sources are recombined to account for, e.g., changes in the market, business cycles, seasons, time of day, new product releases (which could, for example, skew statistics for a few days), blitz marketing campaigns, events (e.g., Olympics® or Super Bowl®), etc. These real world changes have the potential of causing dramatic shifts in the rankings being reported.
- the ad hoc weights adjustments are time-dependent. If we calculate the rankings at a different point in time, the weights would be reconsidered and changed, tuned each time we calculate the rankings. This can be particularly onerous depending on how often the combined rankings are calculated (in real time, daily, weekly, monthly, quarterly, annually, etc.) particularly if the tuning is done without the assistance of any computer-implemented algorithms.
- embodiments of the invention identify the most appropriate method to combine preferences from sources of different modalities by using a SWF appropriate for predefined objectives (e.g., business requirements).
- exemplary embodiments take into account, for example, business requirements to a level of granularity that ad hoc weights cannot.
- embodiments of the invention examine domains with orders of magnitude more “candidates” than “voters”, the reverse of most elections.
- Conventional voting techniques do not examine scenarios in which the number of “candidates” is orders of magnitude more than the number of “voters.”
- the Borda function is intended for use in situations when there are a large number of voters and a small number of candidates, such as in a presidential election.
- embodiments of the invention examine vote computing techniques that are intended for use in scenarios in which the number of items being ranked (or “candidates”) is orders of magnitude more than the number of sources ranking the items (“voters”), the opposite of conventional elections.
- a vote computing technique may combine the information on preferences into a combined list ranking the preferences in an application in which the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
- FIG. 1 illustrates a method of combining data on preferences according to one embodiment of the invention.
- FIG. 1 shows information 1010 from multiple sources (e.g., Source 1, Source 2, Source 3, etc.), a set of vote computing methods 1020, a social welfare function (SWF) 1030, and a set of SWF scores 1040.
- the information 1010 includes information from multiple sources (e.g., Source 1, Source 2, Source 3, etc.).
- the multiple sources are of varying modalities. Modalities may be expressed as having two major dimensions: intentional versus unintentional, and consuming (passive) versus producing (creative).
- Intentional activities are those where a user, for example, has had to take steps to “make their mark.” Examples in the online arena would be navigating to a particular page or typing in a name into a search bar. Intentional activities are stronger indicators of interest than unintentional activities.
- Creative, producing activities are, for example, those where the user takes the time to author a post or compose a response. Passive, consuming activities may involve watching or reading something created by someone else. Creative activities, taking more time and attention, indicate more interest than passive activities.
- each source is either a list of preferences (e.g. a ranked list) or provides data which is converted into a list of preferences.
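As a minimal sketch of the conversion mentioned above, the following Python turns raw per-item counts from one source into a ranked list. The function name `counts_to_ranking`, the tie-breaking rule (alphabetical), and the sample counts are assumptions made for illustration; the text does not prescribe how ties are resolved.

```python
def counts_to_ranking(counts, top_n=None):
    """Order items by descending count; ties are broken alphabetically."""
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return ordered if top_n is None else ordered[:top_n]

# Hypothetical comment counts gathered from one source.
comment_counts = {"artist_a": 12, "artist_b": 40, "artist_c": 12}

print(counts_to_ranking(comment_counts))           # ['artist_b', 'artist_a', 'artist_c']
print(counts_to_ranking(comment_counts, top_n=2))  # ['artist_b', 'artist_a']
```

Once every source is reduced to a ranked list in this way, sources whose raw numbers differ by orders of magnitude can be handed to the same vote computing method.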
- the converting may, for example, process user posts from a social networking site, such as by employing a series of unstructured information management architecture (UIMA) annotators driven off of entity spotting, using Information Extraction (IE) techniques, and/or using natural language mining (NLM) techniques.
- an embodiment of the invention may additionally or alternatively request that a source convert the source's data into a list of preferences, instead of converting the data itself.
- Source 1 may be sales numbers of wines
- Source 2 may be an online list generated by wine tasters
- Source 3 may be a ranking based analysis of various blogs on wines.
- the analysis may have employed a series of UIMA annotators. Each source expresses or reflects opinions on the same underlying subject matter, phenomenon, or domain of interest.
- each vote computing method is a different combining technique.
- Vote computing method 1 may be a Borda count technique
- Vote computing method 2 may be an inverted ranking technique
- Vote computing method 3 may be a round robin technique
- Vote computing method 4 may be the technique described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
- the output of each vote computing method is a “final” ranked list, sometimes referred to as a single, merged list.
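One of the vote computing methods named above, the Borda count, may be sketched as follows. This is one common variant, chosen here as an assumption: an item in position i (counting from 0) of a list of length n earns n - i points, and an item a source does not list earns nothing from that source. The sample rankings are hypothetical.

```python
from collections import defaultdict

def borda_count(rankings):
    """Merge ranked lists into one list using positional (Borda) points."""
    points = defaultdict(int)
    for ranking in rankings.values():
        n = len(ranking)
        for position, item in enumerate(ranking):
            points[item] += n - position  # first place earns n, last earns 1
    # Highest point total first; ties broken alphabetically.
    return sorted(points, key=lambda item: (-points[item], item))

# Hypothetical top-3 lists from three sources.
rankings = {
    "s1": ["a", "b", "c"],
    "s2": ["b", "a", "d"],
    "s3": ["b", "c", "a"],
}

print(borda_count(rankings))  # ['b', 'a', 'c', 'd']
```

Because only positions matter, a source with very large raw counts has no more influence on the merged list than any other source.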
- the final list is provided to the SWF, along with the sources' preferences.
- the SWF is a mathematical criterion for the success of a voting system based on some desired characteristics. Accordingly, in certain embodiments, the SWF is constructed to capture characteristics which are considered valuable. The value may be determined from a business standpoint if business objectives are driving the undertaking. For example, in determining preferences for wine, it may be considered valuable for each source to see at least 1/2 of its top 10 list in the overall top 10 list. How well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function affects the appropriateness of the actual combined list outputted by embodiments of the invention.
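The wine example above, in which each source should see at least half of its top 10 in the overall top 10, can be written as a small custom SWF. The sketch below is an assumption about how such a criterion might be scored (one point per satisfied source); the function name `half_coverage_swf` and the sample lists are hypothetical.

```python
def half_coverage_swf(combined, source_rankings, n=10):
    """Count the sources that find at least half of their top-n in the combined top-n."""
    overall_top = set(combined[:n])
    satisfied = 0
    for ranking in source_rankings.values():
        own_top = ranking[:n]
        hits = sum(1 for item in own_top if item in overall_top)
        if own_top and 2 * hits >= len(own_top):
            satisfied += 1
    return satisfied

# Hypothetical data, using n=4 to keep the example short.
combined = ["a", "b", "c", "d"]
source_rankings = {
    "s1": ["a", "b", "x", "y"],  # 2 of 4 present: satisfied
    "s2": ["w", "x", "y", "a"],  # 1 of 4 present: not satisfied
    "s3": ["d", "c", "b", "a"],  # 4 of 4 present: satisfied
}

print(half_coverage_swf(combined, source_rankings, n=4))  # 2
```

A criterion of this kind ignores position within the top-n, so it rewards broad representation rather than agreement on exact rank.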
- the SWF may be selected or custom constructed to fit the situation.
- the SWF is selected from among a set of SWFs, e.g., a set including the Precision Optimal Aggregation SWF (P swf ) and the Spearman Footrule SWF (S swf ).
- the P swf measures how many items from each source's top-n list are in the “final” ranked list (the single list which merges ranked items from each source). For example, in one application, the P swf measures how many artists from each source's top-10 list are in an overall top-10 list created using Borda count technique.
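The overlap measure just described may be sketched as follows. The exact formula is not reproduced in this text, so the code is an assumption based only on the prose: each source contributes one point for every item of its top-n list that also appears in the merged top-n list, which matches the scale stated elsewhere in this description (up to 10 points per source for top-10 lists). The function name and sample lists are hypothetical.

```python
def precision_swf(combined, source_rankings, n=10):
    """Sum, over all sources, the number of top-n items shared with the combined top-n."""
    overall_top = set(combined[:n])
    return sum(
        len(overall_top.intersection(ranking[:n]))
        for ranking in source_rankings.values()
    )

# Hypothetical data, using n=3 to keep the example short.
combined = ["a", "b", "c"]
source_rankings = {
    "s1": ["a", "b", "d"],  # shares a, b -> 2 points
    "s2": ["c", "e", "f"],  # shares c    -> 1 point
    "s3": ["a", "c", "b"],  # shares all  -> 3 points
}

print(precision_swf(combined, source_rankings, n=3))  # 6 (out of a possible 9)
```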
- a Precision Optimal Aggregation SWF defined as:
- the Spearman Footrule SWF emphasizes preservation of position in the rankings.
- the S swf is an approximation of a related SWF Kendall tau distance.
- the S swf is less computationally intensive (minutes versus days) relative to the Kendall tau distance.
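A position-sensitive score in the spirit of the Spearman footrule may be sketched as follows. The formula used by the embodiment is not reproduced in this text, so everything below is an assumption: the sketch sums the absolute difference between each item's position in a source's top-n and its position in the combined top-n, charges the maximum displacement n for an item missing from the combined list, and subtracts the total from the best possible value so that a higher score is better. With n=10 this yields at most 100 points per source, which agrees with the scale stated elsewhere in this description.

```python
def footrule_swf(combined, source_rankings, n=10):
    """Higher is better: best possible score minus the summed position displacements."""
    position = {item: i for i, item in enumerate(combined[:n])}
    total = 0
    for ranking in source_rankings.values():
        own_top = ranking[:n]
        penalty = 0
        for i, item in enumerate(own_top):
            # An item absent from the combined top-n takes the maximum displacement n.
            penalty += abs(i - position[item]) if item in position else n
        total += n * len(own_top) - penalty
    return total

# Hypothetical data, using n=3 to keep the example short.
combined = ["a", "b", "c"]
source_rankings = {
    "s1": ["a", "b", "c"],  # penalty 0 -> 9 points
    "s2": ["c", "b", "a"],  # penalty 4 -> 5 points
    "s3": ["a", "x", "b"],  # penalty 4 -> 5 points
}

print(footrule_swf(combined, source_rankings, n=3))  # 19
```

Each score requires only one pass over each source list, which is consistent with the observation above that a footrule-style measure is far cheaper than the Kendall tau distance.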
- One exemplary embodiment uses a Spearman Footrule SWF defined as:
- the SWF takes as input a “final” ranked list and the preferences of each source.
- the outcome is a score where points are awarded for increased social welfare of a ranking system.
- embodiments quantitatively measure the “happiness” of each contributing source with the overall “final” ranking.
- an SWF score is calculated for each vote computing method (e.g., SWF score 1, SWF score 2, SWF score 3, etc.).
- FIGS. 2A-2D show sample tables of the top-10 artists that result from merging preference information from various sources using different vote computing methods.
- the results of four combination techniques are shown to illustrate how each technique merges the four top-10 lists shown in FIG. 3.
- a “final” top-10 computed using the corresponding vote counting technique is shown in the first column.
- in FIG. 2A, a “final” top-10 ranking computed using Total Votes is shown;
- in FIG. 2B, a “final” top-10 ranking computed using Weighted Votes is shown;
- in FIG. 2C, a “final” top-10 ranking computed using Semi-Proportional methods is shown; and in FIG. 2D, a “final” top-10 ranking computed using Delegates is shown.
- SWF scores computed using two different SWFs are shown.
- the P swf column shows the contribution of each artist to the overall Precision Optimal Aggregation SWF for that source.
- the S swf column shows the contribution of each artist to the overall Spearman Footrule SWF for that source.
- the bars correspond to the sources in FIG. 3 in this order: Bebo, LastFM, MySpace, and YouTube. The bars represent how “happy” each source is with the artist being ranked at this position.
- the graphs in the C S column express the contribution to the combined ranking for the artist from each source.
- the bars from left to right in each of those cells correspond to the sources in FIG. 3 in the order: Bebo, LastFM, MySpace, and YouTube.
- the rankings shown in the first column were produced by merging the rankings from Bebo, LastFM, MySpace, and YouTube shown in FIG. 3 through simple summation of the votes for each artist.
- the last bar, corresponding to YouTube is the longest because YouTube had more data points than the other sources. In this example, the “number of votes” is dominated by YouTube.
- each table in FIGS. 2A-2D shows the total SWF for P swf and S swf expressed as a raw score.
- for P swf, each source contributes up to 10 points, for a maximum score of 40 (best).
- for S swf, each source contributes up to 100 points, for a maximum of 400.
- the total influence each source had on the top-10 list is also seen at the bottom of each table in the last row of the C S column.
- FIGS. 2A-2D also illustrate a scenario when the SWF is a Precision Optimal Aggregation SWF (second column of each table).
- Table 1 below shows the results converted to percentage based on a maximum score of 40 for the Precision Optimal Aggregation SWF.
- the Semi-Proportional vote counting method is identified among the four as the combining technique that produces a combined list most congruent with the subjective values embodied by the Precision Optimal Aggregation SWF.
- FIGS. 2A-2D also illustrate a scenario when the SWF is Spearman Footrule SWF (third column of each table).
- Table 2 below shows the results converted to percentage based on a maximum score of 400 for the Spearman Footrule SWF.
- the Weighted Votes vote counting method is identified as the combining technique among the four that produces a combined list most congruent with the subjective values embodied by the Spearman Footrule SWF.
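The conversion of raw SWF scores to percentages follows directly from the stated maxima (10 points per source for P swf, 100 points per source for S swf, four sources). A minimal sketch is given below; the raw scores passed in are hypothetical values chosen for the arithmetic and are not the values from the figures or tables.

```python
def to_percent(raw_score, points_per_source, num_sources):
    """Express a raw SWF score as a percentage of the maximum attainable score."""
    return 100.0 * raw_score / (points_per_source * num_sources)

# Hypothetical raw scores, four sources.
print(to_percent(30, points_per_source=10, num_sources=4))    # 75.0  (30 of 40)
print(to_percent(300, points_per_source=100, num_sources=4))  # 75.0  (300 of 400)
```

Putting both SWFs on a percentage scale makes the methods comparable within one SWF; it does not make a P swf percentage comparable with an S swf percentage, since the two measure different things.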
- embodiments of the invention include a method that includes identifying a vote computing method that produces the highest SWF score.
- the evaluation of which vote computing method is most appropriate for a given set of objectives is performed by the SWF.
- the SWF takes the lists of the voters' preferences (the lists from the various sources), along with the outcome of the vote (the combined/consensus list), and produces for each vote computing method a “score” indicating the “satisfaction” in the outcome.
- the highest score indicates the highest satisfaction. That is, the vote computing method that elevates/accounts for/values those characteristics that are valued by the business objectives (as modeled using the SWF) in an optimal fashion is the vote computing method that gets the highest score from the SWF. Since an embodiment of the invention will output a final combined list based on the SWF, the quality of the output is affected by how well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function.
- a large collection of voting methods or combination techniques are enumerated and examined, over multiple time periods of sample data, to identify which voting method or combination technique produces the highest SWF score.
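Evaluation over multiple time periods may be sketched as follows. The text does not say how scores from different periods are aggregated, so averaging them is an assumption, as are the function name `best_method_over_periods`, the two placeholder methods, the placeholder SWF, and the three weekly snapshots.

```python
from statistics import mean

def best_method_over_periods(periods, methods, swf):
    """Average each method's SWF score across periods and pick the highest average."""
    averages = {}
    for name, method in methods.items():
        scores = [swf(method(sources), sources) for sources in periods]
        averages[name] = mean(scores)
    best = max(averages, key=averages.get)
    return best, averages

# Hypothetical placeholder methods and SWF, for illustration only.
def follow_first(sources):
    return list(list(sources.values())[0])

def follow_last(sources):
    return list(list(sources.values())[-1])

def top_item_agreement(combined, sources):
    return sum(1 for r in sources.values() if r[0] == combined[0])

# Three hypothetical weekly snapshots of the same three sources.
periods = [
    {"s1": ["a", "b"], "s2": ["b", "a"], "s3": ["b", "a"]},
    {"s1": ["a", "b"], "s2": ["a", "b"], "s3": ["b", "a"]},
    {"s1": ["a", "b"], "s2": ["b", "a"], "s3": ["b", "a"]},
]
methods = {"follow_first": follow_first, "follow_last": follow_last}

best, averages = best_method_over_periods(periods, methods, top_item_agreement)
print(best)  # follow_last
```

In this made-up data each method wins in at least one week, which is why a single snapshot could give a misleading choice.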
- An exemplary embodiment examines the results of various “voting” methods using several weeks' or months' worth of data.
- parameter(s) of a parameterized voting method may also be optimized to improve the quality of the system. For example, in use, embodiments may determine (e.g., by searching for or computing) a parameter value that is most congruent with enabling the parameterized voting method to output a combined list that reflects the values, e.g., the business objectives.
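Searching for a parameter value may be sketched as a simple grid search. Everything in the sketch is an assumption made for illustration: the parameterized method (`decayed_positional`, which awards `decay ** position` points), the overlap-based SWF, the candidate values, and the sample rankings. No expected output is shown because the result depends on the data and the candidate grid.

```python
from collections import defaultdict

def decayed_positional(rankings, decay):
    """Hypothetical parameterized method: position p earns decay ** p points."""
    points = defaultdict(float)
    for ranking in rankings.values():
        for position, item in enumerate(ranking):
            points[item] += decay ** position
    return sorted(points, key=lambda item: (-points[item], item))

def overlap_swf(combined, rankings, n=2):
    """Hypothetical SWF: shared top-n items, summed over sources."""
    top = set(combined[:n])
    return sum(len(top.intersection(r[:n])) for r in rankings.values())

def tune_parameter(rankings, candidates, swf):
    """Try each candidate value and return the one with the highest SWF score."""
    scored = [(swf(decayed_positional(rankings, d), rankings), d) for d in candidates]
    best_score, best_value = max(scored, key=lambda pair: pair[0])
    return best_value, best_score

rankings = {
    "s1": ["a", "b", "c", "d"],
    "s2": ["c", "d", "a", "b"],
    "s3": ["d", "c", "b", "a"],
}
candidates = [0.25, 0.5, 0.75, 1.0]

best_value, best_score = tune_parameter(rankings, candidates, overlap_swf)
print(best_value, best_score)
```

When several candidate values tie on score, this sketch keeps the first one examined; a real search might prefer a finer grid or a secondary criterion.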
- the characteristics of many sources change over time.
- the congruency of the method to the business objectives is revisited periodically (e.g., quarterly) to make sure that changes in the underlying data sources have not reduced the quality of the results.
- the additional optimization techniques described are also repeated periodically.
- an exemplary embodiment of the invention applies voting theory to cross-modality information mashups to construct a combined list ranking preferences.
- An SWF is used to select from various voting methods based on data from various cross-modality sources.
- the sources and associated data are dependent on the domain. For example, in the domain of interest of wine, the source or associated data may be results of wine tasting parties, professional reviews (e.g., scores from 1-10 in different categories), sales, change in sales, comments posted by average users, and mentions in mass media.
- FIG. 4 shows a flow diagram of a method 4000 for combining data on preferences in accordance with an embodiment of the invention.
- a social welfare function (SWF) is received (e.g., by communications interface 66 or from memory or storage, as described below).
- the SWF embodies a business objective.
- sources that provide perspective on opinions on the subject area are identified. The sources may be, for example, sales, comments, etc.
- data from these sources are gathered and normalized to create ranked lists of preferences for each source.
- two or more vote computing methods are identified.
- the vote computing method is used to combine data on preferences into a combined list ranking the preferences.
- the data being combined is from a set of two or more sources.
- the set is heterogeneous in modality.
- in a delegate allocation vote computing method, a preliminary distribution of delegates among the sources is determined. For example, in the example shown in FIG. 2D, the following delegate numbers, based roughly on population, were distributed to the sources: 300 to Bebo, 500 to LastFM, 1000 to MySpace, and 500 to YouTube.
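A delegate-style combination may be sketched as follows. The delegate numbers are those given above; how delegates are then awarded to items is not spelled out in this text, so the proportional split within each source is an assumption, as are the function name, the artist names, and the vote counts.

```python
from collections import defaultdict

def delegate_ranking(source_counts, delegates):
    """Split each source's delegates among its items in proportion to that source's votes."""
    totals = defaultdict(float)
    for source, counts in source_counts.items():
        source_total = sum(counts.values())
        if source_total == 0:
            continue  # a source with no votes awards no delegates
        for item, votes in counts.items():
            totals[item] += delegates[source] * votes / source_total
    ranking = sorted(totals, key=lambda item: (-totals[item], item))
    return ranking, dict(totals)

# Delegate numbers from the text; vote counts are hypothetical.
delegates = {"Bebo": 300, "LastFM": 500, "MySpace": 1000, "YouTube": 500}
source_counts = {
    "Bebo":    {"artist_x": 1, "artist_y": 3},
    "LastFM":  {"artist_x": 3, "artist_y": 1},
    "MySpace": {"artist_x": 1, "artist_y": 1},
    "YouTube": {"artist_x": 900000, "artist_y": 100000},
}

ranking, totals = delegate_ranking(source_counts, delegates)
print(ranking)  # ['artist_x', 'artist_y']
print(totals)   # {'artist_x': 1400.0, 'artist_y': 900.0}
```

Under this assumption a source's influence is capped by its delegate count, so the very large YouTube counts in the made-up data contribute no more than 500 delegates in total.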
- the combined list is inputted into the SWF to compute a score.
- the score indicates congruency between the combined list and value(s) embodied by the SWF, e.g., a business objective.
- the combined list of the vote computing method associated with the highest score is outputted. In one embodiment, additionally or alternatively, the vote computing method associated with the highest score is outputted.
- FIG. 5 shows a flow diagram of a method 5000 for combining data on preferences in accordance with an embodiment of the invention.
- SWF social welfare function
- the SWF defines business objectives.
- two or more vote computing methods are identified.
- the vote computing method is used to combine data on preferences into a combined list ranking the preferences.
- the data is from a set of two or more sources.
- the set is heterogeneous in modality.
- the combined list is inputted into the SWF to compute a score. The score indicates congruency between the combined list and, for example, the business objective.
- the combined list of the vote computing method associated with the highest score is outputted.
- embodiments of this invention may execute the method 4000 and/or the method 5000 in a non-sequential order as appropriate and still remain in accordance with the invention.
- embodiments of the present invention may receive the social welfare function before, during, or after 4020, 4030, 4040, and/or 4050.
- embodiments of the present invention may create the social welfare function before, during, or after 5020 and/or 5030 .
- the sources provide additional information on preference (such as total numbers) for input into voting systems that can make use of such additional information.
- Embodiments of the invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements.
- the invention is implemented in software, which includes but is not limited to firmware, resident software, and microcode.
- the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
- a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- the medium can be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device).
- Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, or an optical disk.
- Current examples of optical disks include compact disk-read-only memory (CD-ROM), compact disk-read/write (CD-R/W), and DVD.
- a data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus.
- the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
- I/O devices (including but not limited to keyboards, displays, and pointing devices) can be coupled to the system either directly or through intervening I/O controllers.
- Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks.
- Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
- FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention.
- the computer system includes one or more processors, such as processor 44 .
- the processor 44 is connected to a communication infrastructure 46 (e.g., a communications bus, cross-over bar, or network).
- the computer system can include a display interface 48 that forwards graphics, text, and other data from the communication infrastructure 46 (or from a frame buffer not shown) for display on a display unit 50 .
- the computer system also includes a main memory 52 , preferably random access memory (RAM), and may also include a secondary memory 54 .
- the secondary memory 54 may include, for example, a hard disk drive 56 and/or a removable storage drive 58 , representing, for example, a floppy disk drive, a magnetic tape drive, or an optical disk drive.
- the removable storage drive 58 reads from and/or writes to a removable storage unit 60 in a manner well known to those having ordinary skill in the art.
- Removable storage unit 60 represents, for example, a floppy disk, a compact disc, a magnetic tape, or an optical disk, etc. which is read by and written to by removable storage drive 58 .
- the removable storage unit 60 includes a computer readable medium having stored therein computer software and/or data.
- the secondary memory 54 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system.
- Such means may include, for example, a removable storage unit 62 and an interface 64 .
- Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 62 and interfaces 64 which allow software and data to be transferred from the removable storage unit 62 to the computer system.
- the computer system may also include a communications interface 66 .
- Communications interface 66 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 66 may include a modem, a network interface (such as an Ethernet card), a communications port, or a PCMCIA slot and card, etc.
- Software and data transferred via communications interface 66 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 66 . These signals are provided to communications interface 66 via a communications path (i.e., channel) 68 .
- This channel 68 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
- the terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 52 and secondary memory 54, removable storage drive 58, and a hard disk installed in hard disk drive 56.
- Computer programs are stored in main memory 52 and/or secondary memory 54 . Computer programs may also be received via communications interface 66 . Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 44 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.
- FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented.
- Distributed data processing system 100 may include a network of computers in which aspects of the illustrative embodiments may be implemented.
- the distributed data processing system 100 contains at least one network 102 , which is the medium used to provide communication links between various devices and computers connected together within distributed data processing system 100 .
- the network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
- server 104 and server 106 are connected to network 102 along with storage unit 108 .
- clients 110 , 112 , and 114 are also connected to network 102 .
- These clients 110 , 112 , and 114 may be, for example, personal computers, network computers, or the like.
- server 104 provides data, such as boot files, operating system images, and applications to clients 110 , 112 , and 114 .
- Clients 110 , 112 , and 114 are clients to server 104 in the depicted example.
- Distributed data processing system 100 may include additional servers, clients, and other devices not shown.
- distributed data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another.
- the distributed data processing system 100 may also be implemented to include a number of different types of networks, such as for example, an intranet, a local area network (LAN), a wide area network (WAN), or the like.
- FIG. 7 is intended as an example, not as an architectural limitation for different embodiments of the present invention, and therefore, the particular elements shown in FIG. 7 should not be considered limiting with regard to the environments in which the illustrative embodiments of the present invention may be implemented.
- clients 110 and 112 collect information (e.g., from user input) and provide it to server 104.
- Server 104 stores the information in storage 108 .
- Server 106 contains hardware devices and software tools to combine the information (e.g., into information mashups and/or combined/consensus lists) according to the present invention.
- Server 106 transmits the combined information to server 104 and/or clients 110 , 112 , and/or 114 , for example.
- client 114 may provide the server with business requirements embodied in a SWF.
- the server determines the best vote counting method to use for that particular application based on the SWF and the information stored, e.g., in storage 108.
- the server may transmit an identification of the vote counting method to the client.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A system and method for determining preferences from information mashups and, in particular, for determining preferences from cross-modality information based on a social welfare function is disclosed. An exemplary embodiment of the invention uses a social welfare function (SWF) to identify a vote computing method from among a group of vote computing methods. The SWF embodies subjective values, e.g. business objectives. The embodiment uses the SWF to identify the vote computing method that combines cross-modality information into a single information mashup in a manner that is most congruent with the subjective values relative to the other vote computing methods. The information mashup may be in the form of a single, merged ranked list.
Description
- This application claims the benefit of and incorporates by reference in its entirety U.S. provisional application No. 61/041,128, which was filed on Mar. 31, 2008.
- The present invention relates to information mashups, and in particular to a system and method for determining preferences from cross-modality information mashups.
- Through the advances of technology, today's world has become inundated with information. One continuing technological and societal challenge is finding methods and systems to extract and combine useful data, knowledge, and understanding from a pool of information that is constantly growing in quantity and increasing in granularity.
- Even when we narrow our analysis to one domain of interest, e.g., ranking wines, how do we combine all the information indicating preferences within the domain when the information is available from multiple sources and the sources differ in modality? For example, how do we combine multiple lists of preferences, e.g., from different online communities, sales numbers from different stores, etc.? How do we combine the information in a manner that will reveal the aspects of that information that are important, valuable, or significant to an entity (e.g., a machine, business, customer, end-user, etc.) requesting the results? And how do we enable tuning of the outcome, e.g., at the touch of a button, to target certain characteristics and elevate those characteristics to the forefront?
- A computer-implemented method for determining preferences from cross-modality information mashups is provided. The method includes receiving a social welfare function (SWF) and identifying two or more vote computing methods. For each of the two or more vote computing methods, the method uses the vote computing method to combine information on preferences into a combined list ranking the preferences. The information is from a set of two or more sources. The set is heterogeneous in modality. For each combined list, the method inputs the combined list into the SWF to compute a score. The method outputs the combined list of the vote computing method associated with the highest score. The set of two or more sources may include data from websites indicating preferences within a certain domain of interest. The information from the set of two or more sources may include structured data from a first source and unstructured data from a second source. The number of preferences being ranked may be at least an order of magnitude greater than the number of sources.
- A computer program product for determining preferences from cross-modality information is also provided. The computer program product includes a computer readable medium and program instructions. The program instructions include first program instructions to identify two or more vote computing methods and second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences. The information is from a set of two or more sources. The set is heterogeneous in modality. The program instructions further include third program instructions to compute a score, and fourth program instructions to output the vote computing method associated with the highest score. The program instructions may also include fifth program instructions to output the combined list of the vote computing method associated with the highest score. The two or more sources may include an online blog, an online forum, and/or an online social networking website. The social welfare function may be selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
- A system for determining preferences from cross-modality information is further provided. The system includes a communications interface, memory storing computer usable program code; and a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory. The computer usable program code includes computer usable program code configured to identify two or more vote computing methods; computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality; computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and computer usable program code configured to identify the vote computing method associated with the highest score. The computer usable program code may further include computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
- FIG. 1 shows a method of combining data on preferences according to one embodiment of the present invention.
- FIGS. 2A-2D show sample tables of the top-10 artists that result from merging preference information from various sources using different vote computing methods.
- FIG. 3 shows a table of four top-10 lists.
- FIG. 4 shows a flow diagram of a method for combining data on preferences in accordance with an embodiment of the invention.
- FIG. 5 shows a flow diagram of another method for combining data on preferences in accordance with an embodiment of the invention.
- FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention.
- FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented.
- The present invention provides a system and method for determining preferences from information mashups. An information mashup combines or mixes information or data from a multitude of often-conflicting sources into a single representation. For example, for any given domain of interest, opinions can be expressed in many places and collected by many sources. Online sources for people's opinions on a wide range of topics include, for example, blogs, discussion forums, and social networking sites. Embodiments of the invention combine information gathered from across different sources, including in one application various online sources, to form a unified, focused view of a community's interests regarding that domain.
- An exemplary embodiment of the invention determines preferences from cross-modality information mashups. In more traditional information integration scenarios, systems compare things with identical modalities, such as the number of sales from different sources. However, there are many domains of interest (e.g., patient preferences, drugs for certain medical conditions, cars, wine, financial products (stocks, bonds, etc.), consumer goods, cameras, computers, books, etc.) where information is available from many different modalities (e.g., comments, passive listens, sales, hits on a website, creation of a new website, views on television, etc.). In the domain of books, the following information may be available for determining book preferences: book sales and returns, lists of books read, library checkouts, comments on books read (e.g., online, in newspapers, in magazines, on television or radio), etc. An exemplary embodiment of the invention determines preferences from information mashups constructed from information and data from a set of sources heterogeneous in modality. For example, say we want to combine different on-line data to generate a list of wines. One source of preferences may be generated from sales numbers of wines. Another source may be a list generated by wine tasters. Yet another may be generated by professionals at a wine magazine. Yet another may be generated from counts of comments users post on a wine aficionado site. There are many more sales of wines than posts on a website. Many people buy wines, whereas composing a review takes more time and may indicate more interest in a particular vintage. Ultimately, a good cross-modality mashup combines these multiple sources, all of which indicate interest in the same underlying subject matter, without allowing one source to unduly influence the combined/consensus list.
- Yet, how can one combine the data from the various sources when they are heterogeneous in modality? Comparing different modalities is akin to comparing apples and oranges. How does one determine overall rankings for certain wines, for example, based on the combination of data on sales, written reviews, returns, website polls, etc.? How do you combine data indicating that the reviewer loves a certain wine (glowing reviews), but the public hates it (e.g., by ranking it low on wine.com or low sales)? Do we decide that ten times as many posts on a website reflect ten times as much interest in an event or item? There is a fair amount of subjectivity in how these combinations occur and it is not typically clear how to combine all these sources.
- In systems that compare things with identical modalities, using a plurality type voting system makes sense. Plurality type voting systems are those that add together the number of votes from each source and simply adjudicate the winner based on whomever or whichever candidate has the most votes. Plurality type voting systems include systems in which votes are weighted. However, plurality type voting systems have deficiencies when combining information gathered from multiple sources with differing modalities. This can occur, for example, when there are large differences in the numbers returned by sources or when the values measured to derive those numbers indicate very different things.
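The deficiency described above can be illustrated with a minimal sketch. The source names and vote counts below are made up for illustration and are not data from this disclosure; the function name `plurality_ranking` is likewise hypothetical. The sketch simply sums raw counts across two sources of different modality (sales versus written reviews) and ranks by the total.

```python
from collections import Counter

# Hypothetical raw counts per source; the numbers are illustrative only.
votes_by_source = {
    "sales": {"wine_a": 9000, "wine_b": 7000, "wine_c": 100},
    "reviews": {"wine_a": 2, "wine_b": 5, "wine_c": 40},
}


def plurality_ranking(votes_by_source):
    """Sum raw votes across all sources and rank items by the total."""
    totals = Counter()
    for counts in votes_by_source.values():
        totals.update(counts)  # adds each source's counts to the running totals
    return [item for item, _ in totals.most_common()]


def source_ranking(counts):
    """Rank the items of a single source by that source's own counts."""
    return sorted(counts, key=counts.get, reverse=True)


combined = plurality_ranking(votes_by_source)
sales_only = source_ranking(votes_by_source["sales"])
reviews_only = source_ranking(votes_by_source["reviews"])
```

In this toy data the combined ranking is identical to the sales ranking, and the review source, whose order is the exact reverse, has no visible effect on the outcome because its counts are orders of magnitude smaller.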
- To identify which of a multitude of combination techniques (including plurality type voting techniques) is optimal for combining data from various sources in a certain instance, embodiments of the invention use a construct known as a social welfare function (SWF). An SWF is a mapping from allocations of goods or rights among people to real numbers. The SWF construct was introduced by Abram Bergson in 1938. The SWF construct allows for the determination of a society's taste for different economic states. There are two features to the SWF construct: first, it imposes a structure, and second, it devises a single constitutional/voting system that changes the rankings of individuals into a single society ranking. An SWF might describe, for example, the preferences of an individual over social states, or might describe, as another example, outcomes of an allocation process, whether or not individuals had preferences over those outcomes. Examples of SWFs are the Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule. Thus, an SWF supplies a method for embodying subjective judgments, such as those described above, in one function. Using a custom constructed or selected SWF, embodiments of the invention can capture, for example, business goals in a semi-heuristic way, objectively evaluate various preference combination techniques, and identify which of the combination techniques to use in a specific instance.
- In one exemplary application, the combination techniques include techniques that originate from vote computing or vote counting systems, such as a Borda count method or the Nauru method. Embodiments of the invention may supplement or modify a vote computing or vote counting technique depending on whether the original information expressing the preferences is, for example, structured or unstructured, numerical or textual, etc.
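As an illustration of one such vote counting technique, the following is a minimal sketch of a basic Borda count applied to ranked lists. It is not the specific implementation used by any embodiment. The function name `borda_count`, the sample lists, and the handling of partial lists (an item missing from a list earns no points from that list) are assumptions made for this example.

```python
from collections import defaultdict


def borda_count(ranked_lists):
    """Merge ranked lists with a basic Borda count.

    An item at 0-based position p in a list of length n earns n - p points.
    Assumption for partial lists: an item missing from a list earns nothing
    from that list.
    """
    scores = defaultdict(int)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - position
    # Sort by descending score; break ties alphabetically so the result is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical ranked lists from three sources.
lists = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
merged = borda_count(lists)
```

With these lists, item b collects 8 points, a collects 6, c collects 3, and d collects 1, which gives the merged order b, a, c, d.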
- In one exemplary embodiment, the combination technique used is as described in co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2), and filed on ______. Accordingly, the system and method for determining preferences from information mashups described in detail herein complements the system and method described in the co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2). In one use, the system or method described in detail herein may be used in conjunction with the system or method described in detail in the co-pending application. In another use, the first system and method may be used separately from the latter.
- In embodiments of the invention, the SWF takes as input a “final” ranked list generated from each of the various vote counting/computing methods and/or systems, and the preferences of each source. The “final” ranked list may be generated using, for example, weighted voting systems, semi-proportional methods, delegates, Borda Count, inverted rank, run off, round robin, and/or a ranking method described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2). The SWF outputs a number that indicates how happy or satisfied the “society” of sources is with the results. Thus, in one application, multiple methods of combining are examined and evaluated, and the combining method that returns the highest SWF value is considered the “best” method. That combining method is then established as the combining method that application will use when determining preferences from future mashups combining information for those sources for those business purposes, for example. As discussed below, reevaluation of the combining method may be done periodically to optimize the quality of the results.
- The present disclosure differs from traditional work in the field in several ways. For example, the disclosure addresses situations where, as noted, people are providing preferences in non-uniform ways (complaints, purchases, opinions posted, time, etc.). In such situations, ad hoc weights don't work well because ad hoc weights can only adjust for the deficiencies that exist at a single point in time. Consider the use of ad hoc weights in combining top-10 lists from Amazon.com and Barnes & Noble in 1995. In that year, Amazon.com ranks would be weighted lower (having less weight in the overall scheme of the analysis) than Barnes & Noble ranks because Amazon.com opened its online store in July 1995. If the top-10 lists were compared today, the weights would differ. Thus, although ad hoc weights are useful when combining lists of preferences at one point in time, they need adjusting each time new data from the sources are recombined to account for, e.g., changes in the market, business cycles, seasons, time of day, new product releases (which could, for example, skew statistics for a few days), blitz marketing campaigns, events (e.g., Olympics® or Super Bowl®), etc. These real world changes have the potential of causing dramatic shifts in the rankings being reported. The ad hoc weight adjustments are time-dependent. If we calculate the rankings at a different point in time, the weights would be reconsidered, changed, and retuned each time we calculate the rankings. This can be onerous depending on how often the combined rankings are calculated (in real time, daily, weekly, monthly, quarterly, annually, etc.), particularly if the tuning is done without the assistance of any computer-implemented algorithms.
- In contrast, embodiments of the invention identify the most appropriate method to combine preferences from sources of different modalities by using a SWF appropriate for predefined objectives (e.g., business requirements). Thus, in analyzing and combining information on preferences, exemplary embodiments take into account, for example, business requirements to a level of granularity that ad hoc weights cannot.
- Additionally, embodiments of the invention examine domains with orders of magnitude more “candidates” than “voters”, the reverse of most elections. Conventional voting techniques do not examine scenarios in which the number of “candidates” is orders of magnitude greater than the number of “voters.” For example, the Borda function is intended for use in situations when there are a large number of voters and a small number of candidates, such as in a presidential election. Accordingly, embodiments of the invention examine vote computing techniques that are intended for use in scenarios in which the number of items being ranked (or “candidates”) is orders of magnitude greater than the number of sources ranking the items (“voters”), the opposite of conventional elections. Thus, such a vote computing technique may combine the information on preferences into a combined list ranking the preferences in an application in which the number of preferences being ranked is at least an order of magnitude greater than the number of sources.
- Exemplary embodiments of the invention determine preferences from cross-modality information mashups based on a constructed or selected social welfare function (SWF).
FIG. 1 illustrates a method of combining data on preferences according to one embodiment of the invention. FIG. 1 shows information 1010 from multiple sources (e.g., Source1, Source2, Source3, etc.), a set of vote computing methods 1020, a social welfare function (SWF) 1030, and a set of social welfare function (SWF) scores 1040.
- The information 1010 includes information from multiple sources (e.g., Source1, Source2, Source3, etc.). The multiple sources are of varying modalities. Modalities may be expressed as having two major dimensions: intentional versus unintentional, and consuming (passive) versus producing (creative). Intentional activities are those where a user, for example, has had to take steps to “make their mark.” Examples in the online arena would be navigating to a particular page or typing a name into a search bar. Intentional activities are stronger indicators of interest than unintentional activities. Creative, producing activities are, for example, those where the user takes the time to author a post or compose a response. Passive, consuming activities may involve watching or reading something created by someone else. Creative activities, taking more time and attention, indicate more interest than passive activities.
- In FIG. 1, each source is either a list of preferences (e.g., a ranked list) or provides data which is converted into a list of preferences. The converting may, for example, process user posts from a social networking site, such as by employing a series of unstructured information management architecture (UIMA) annotators driven off of entity spotting, using Information Extraction (IE) techniques, and/or using natural language mining (NLM) techniques. Depending on the application, an embodiment of the invention may additionally or alternatively request that a source convert the source's data into a list of preferences, instead of converting the data itself. In an example application, Source1 may be sales numbers of wines, Source2 may be an online list generated by wine tasters, and Source3 may be a ranking based on analysis of various blogs on wines. The analysis may have employed a series of UIMA annotators. Each source expresses or reflects opinions on the same underlying subject matter, phenomenon, or domain of interest.
- The information 1010 is communicated to each of the vote computing methods (e.g., Vote computing method1, Vote computing method2, etc.). In FIG. 1, each vote computing method is a different combining technique. For example, Vote computing method1 may be a Borda count technique, Vote computing method2 may be an inverted ranking technique, Vote computing method3 may be a round robin technique, and Vote computing method4 may be the technique described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2).
- In FIG. 1, the output of each vote computing method is a “final” ranked list, sometimes referred to as a single, merged list. The final list is provided to the SWF, along with the sources' preferences. From one perspective, the SWF is a mathematical criterion for the success of a voting system based on some desired characteristics. Accordingly, in certain embodiments, the SWF is constructed to capture characteristics which are considered valuable. The value may be determined from a business standpoint if business objectives are driving the undertaking. For example, in determining preferences for wine, it may be considered valuable for each source to see at least ½ of its top 10 list in the overall top 10 list. How well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function affects the appropriateness of the actual combined list outputted by embodiments of the invention.
- Accordingly, the SWF may be selected or custom constructed to fit the situation. In one embodiment, the SWF is selected from among a set of SWFs, e.g., a set including the Precision Optimal Aggregation SWF (Pswf) and the Spearman Footrule SWF (Sswf). The Pswf measures how many items from each source's top-n list are in the “final” ranked list (the single list which merges ranked items from each source). For example, in one application, the Pswf measures how many artists from each source's top-10 list are in an overall top-10 list created using a Borda count technique. One exemplary embodiment uses a Precision Optimal Aggregation SWF defined as:
$$P_{swf} = \sum_{S} \min\left(2 \cdot |T_S \cap T|,\ 10\right),$$
- for top-10 lists $T_S$ for each source $S$ and top-10 list $T$ overall.
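The Precision Optimal Aggregation SWF defined above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of any embodiment: the function name `precision_swf`, the `cap` parameter, and the integer item identifiers in the sample lists are hypothetical.

```python
def precision_swf(source_top_lists, overall_top, cap=10):
    """Sum over sources S of min(2 * |T_S ∩ T|, cap).

    source_top_lists: one top-10 list per source (the T_S lists).
    overall_top: the overall top-10 list T.
    """
    overall = set(overall_top)
    return sum(min(2 * len(set(top) & overall), cap) for top in source_top_lists)


# Hypothetical item identifiers; the overall top-10 is items 0..9.
overall_top = list(range(10))
source_tops = [
    list(range(10)),      # all 10 items overlap -> min(20, 10) = 10
    list(range(5, 15)),   # 5 items overlap      -> min(10, 10) = 10
    list(range(8, 18)),   # 2 items overlap      -> min(4, 10)  = 4
    list(range(20, 30)),  # no overlap           -> 0
]
score = precision_swf(source_tops, overall_top)
```

Because each term is capped at 10, a source is fully satisfied once half of its top-10 list appears in the overall top-10, and four sources can contribute at most 40 points in total. The sample data above scores 24.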
- The Spearman Footrule SWF (Sswf) emphasizes preservation of position in the rankings. The Sswf is an approximation of a related SWF, the Kendall tau distance. The Sswf is less computationally intensive (minutes versus days) than the Kendall tau distance. One exemplary embodiment uses a Spearman Footrule SWF defined as:
$$S_{swf} = \sum_{S} \sum_{a=1}^{10} \max\left(10 - |r_a - r_{aS}|,\ 0\right).$$
- In use, the SWF takes as input a “final” ranked list and the preferences of each source. The outcome is a score where points are awarded for increased social welfare of a ranking system. In this way, embodiments quantitatively measure the “happiness” of each contributing source with the overall “final” ranking. As shown in FIG. 1, an SWF score is calculated for each vote computing method (e.g., SWF score1, SWF score2, SWF score3, etc.).
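The Spearman Footrule SWF above can be sketched as follows. The text does not define the rank symbols, so this sketch assumes that one rank is the position of an item in the overall “final” list and the other is the rank of the same item in the list of a given source. It further assumes that an item absent from a source's list contributes nothing for that source. The function name `spearman_footrule_swf` and the sample lists are hypothetical.

```python
def spearman_footrule_swf(source_rankings, overall_ranking, n=10):
    """Sum over sources and overall positions of max(n - |rank difference|, 0).

    Assumption: an item of the overall list that a source did not rank
    contributes 0 for that source.
    """
    total = 0
    for source in source_rankings:
        # Map each item to its 1-based rank within this source's list.
        source_rank = {item: rank for rank, item in enumerate(source, start=1)}
        for overall_rank, item in enumerate(overall_ranking[:n], start=1):
            if item in source_rank:
                total += max(n - abs(overall_rank - source_rank[item]), 0)
    return total


overall = list("abcdefghij")       # hypothetical overall top-10
identical = list("abcdefghij")     # same order as the overall list
reversed_list = list("jihgfedcba") # exact reverse of the overall list
score = spearman_footrule_swf([identical, reversed_list], overall)
```

Under these assumptions a source whose list matches the overall list exactly contributes the per-source maximum of 100, while the reversed list contributes 50, so the sample score is 150.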
- FIGS. 2A-2D show sample tables of the top-10 artists that result from merging preference information from various sources using different vote computing methods. In FIGS. 2A-2D, the results of four combination techniques are shown to illustrate how each technique merges the four top-10 lists shown in FIG. 3. In each of FIGS. 2A-2D, a “final” top-10 computed using the corresponding vote counting technique is shown in the first column. Specifically, in FIG. 2A, a “final” top-10 ranking computed using Total Votes is shown; in FIG. 2B, a “final” top-10 ranking computed using Weighted Votes is shown; in FIG. 2C, a “final” top-10 ranking computed using Semi-Proportional methods is shown; and in FIG. 2D, a “final” top-10 ranking computed using Delegates is shown.
- In each of FIGS. 2A-2D, SWF scores computed using two different SWFs (Pswf and Sswf) are shown. The Pswf column shows the contribution of each artist to the overall Precision Optimal Aggregation SWF for that source. The Sswf column shows the contribution of each artist to the overall Spearman Footrule SWF for that source. In the columns labeled Pswf and Sswf, from top to bottom in each of those table cells, the bars correspond to the sources in FIG. 3 in this order: Bebo, LastFM, MySpace, and YouTube. The bars represent how “happy” each source is with the artist being ranked at this position.
- The graphs in the CS column express the contribution to the combined ranking for the artist from each source. In the columns labeled CS, the bars from left to right in each of those cells correspond to the sources in FIG. 3 in the order: Bebo, LastFM, MySpace, and YouTube. The greater a source's contribution to the combined ranking for the artist, the longer the bar. For example, in FIG. 2A, the rankings shown in the first column were produced by merging the rankings from Bebo, LastFM, MySpace, and YouTube shown in FIG. 3 through simple summation of the votes for each artist. In FIG. 2A, the last bar, corresponding to YouTube, is the longest because YouTube had more data points than the other sources. In this example, the “number of votes” is dominated by YouTube.
- The bottom of each table in FIGS. 2A-2D shows the total SWF for Pswf and Sswf expressed as a raw score. For Pswf, each source contributes up to 10 points, for a maximum score of 40 (best). For Sswf, each source contributes up to 100 points, for a maximum of 400. The total influence each source had on the top-10 list is also seen at the bottom of each table in the last row of the CS column.
- The examples illustrated by FIGS. 2A-2D can be understood in the context of FIG. 1 as follows. FIGS. 2A-2D illustrate a scenario with four sources: Source1=ranking from Bebo, Source2=ranking from LastFM, Source3=ranking from MySpace, and Source4=ranking from YouTube. FIGS. 2A-2D also illustrate a scenario with four vote counting methods: Vote computing method1=Total Votes (see FIG. 2A), Vote computing method2=Weighted Votes (see FIG. 2B), Vote computing method3=Semi-Proportional (see FIG. 2C), and Vote computing method4=Delegates (see FIG. 2D).
- FIGS. 2A-2D also illustrate a scenario in which the SWF is a Precision Optimal Aggregation SWF (second column of each table). For each vote counting method, the Precision Optimal Aggregation SWF score is shown at the bottom of the corresponding table: SWF score1=22 (see FIG. 2A), SWF score2=28 (see FIG. 2B), SWF score3=30 (see FIG. 2C), and SWF score4=26 (see FIG. 2D). Table 1 below shows the results converted to percentage based on a maximum score of 40 for the Precision Optimal Aggregation SWF.

TABLE 1: Precision Optimal Aggregation SWF scores

| Vote counting method | Raw Precision Optimal Aggregation SWF score | Percentage |
| --- | --- | --- |
| Total Votes | 22 | 55% |
| Weighted Votes | 28 | 70% |
| Semi-Proportional | 30 | 75% |
| Delegates | 26 | 65% |

- Accordingly, using the Precision Optimal Aggregation SWF, the Semi-Proportional vote counting method is identified among the four as the combining technique that produces a combined list most congruent with the subjective values embodied by the Precision Optimal Aggregation SWF.
-
FIGS. 2A-2D also illustrate a scenario when the SWF is Spearman Footrule SWF (third column of each table). For each vote counting method, the Spearman Footrule SWF score is shown at the bottom of each table: SWF score1=149 (seeFIG. 2A ), SWF score2=153 (seeFIG. 2B ), SWF score3=146 (seeFIG. 2C ), and SWF score3=151 (seeFIG. 2D ). Table 2 below shows the results converted to percentage based on a maximum score of 400 for the Spearman Footrule SWF. -
TABLE 2 Spearman Footrule SWF scores Raw Spearman Footrule Vote counting method SWF score Percentage Total Votes 149 37.25% Weighted Votes 153 38.25% Semi-Proportional 146 36.50 % Delegates 151 37.75% - Accordingly, using the Spearman Footrule SWF, the Weighted Votes vote counting method is identified as the combining technique among the four that produces a combined list most congruent with the subjective values embodied by the Spearman Footrule SWF.
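By way of illustration only, the percentage conversion and the selection of the highest-scoring method in Tables 1 and 2 may be reproduced as sketched below. The raw scores and maximum scores are those reported above; the names `RAW_SCORES`, `MAX_SCORE`, `to_percent`, and `best_method` are hypothetical and are introduced solely for this sketch.

```python
# Raw SWF scores as reported at the bottom of the tables in FIGS. 2A-2D.
RAW_SCORES = {
    "Precision Optimal Aggregation": {
        "Total Votes": 22,
        "Weighted Votes": 28,
        "Semi-Proportional": 30,
        "Delegates": 26,
    },
    "Spearman Footrule": {
        "Total Votes": 149,
        "Weighted Votes": 153,
        "Semi-Proportional": 146,
        "Delegates": 151,
    },
}

# Maximum attainable score with four sources (10 or 100 points per source).
MAX_SCORE = {"Precision Optimal Aggregation": 40, "Spearman Footrule": 400}


def to_percent(raw, maximum):
    """Express a raw SWF score as a percentage of the maximum score."""
    return 100.0 * raw / maximum


def best_method(scores):
    """Return the vote computing method with the highest SWF score."""
    return max(scores, key=scores.get)


for swf, scores in RAW_SCORES.items():
    for method, raw in scores.items():
        print(f"{swf}: {method} = {to_percent(raw, MAX_SCORE[swf]):.2f}%")
    print(f"{swf}: highest score from {best_method(scores)}")
```

The same selection step applies unchanged to any other SWF, since only the scores and the maximum differ.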
- Accordingly, embodiments of the invention include a method that includes identifying a vote computing method that produces the highest SWF score. The evaluation of which vote computing method is most appropriate for a given set of objectives (e.g., business objectives) is performed by the SWF. The SWF takes the lists of the voters' preferences (the lists from the various sources), along with the outcome of the vote (the combined/consensus list), and produces for each vote computing method a “score” indicating the “satisfaction” in the outcome. The highest score indicates the highest satisfaction. That is, the vote computing method that elevates/accounts for/values those characteristics that are valued by the business objectives (as modeled using the SWF) in an optimal fashion is the vote computing method that gets the highest score from the SWF. Since an embodiment of the invention will output a final combined list based on the SWF, the quality of the output is affected by how well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function.
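As a non-limiting sketch, an SWF of the kind described above may be modeled as a function that receives the per-source ranked lists together with a combined list and returns a score. The overlap-based scoring below is an assumption, chosen so that each source contributes up to k points for a top-k list; it is not asserted to be the Precision Optimal Aggregation SWF itself. The names `overlap_swf`, `sources`, and `combined` are hypothetical.

```python
def overlap_swf(sources, combined, k=10):
    """Score a combined list by its top-k overlap with each source.

    Assumption: each source contributes one point per item that appears in
    both its own top-k and the combined top-k, so the maximum score is
    k * len(sources).
    """
    top_combined = set(combined[:k])
    return sum(len(top_combined & set(ranking[:k])) for ranking in sources.values())


# Hypothetical three-item rankings from two sources, scored with k=2.
sources = {
    "SourceA": ["x", "y", "z"],
    "SourceB": ["y", "z", "x"],
}
combined = ["y", "x", "z"]
print(overlap_swf(sources, combined, k=2))
```

A higher return value indicates a combined list that agrees with more of the sources' top-ranked items, which is one possible reading of “satisfaction” in the outcome.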
- In an exemplary embodiment, to improve the quality of the system or method's output, a large collection of voting methods or combination techniques is enumerated and examined, over multiple time periods of sample data, to identify which voting method or combination technique produces the highest SWF score. An exemplary embodiment examines the results of various “voting” methods using several weeks' or months' worth of data.
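One possible way to examine several methods over multiple time periods is sketched below. Averaging the SWF scores across periods is an assumption made for this sketch (other summaries, such as the minimum or median, could equally be used), and the weekly scores and the name `best_method_over_periods` are hypothetical.

```python
from statistics import mean


def best_method_over_periods(scores_by_period):
    """Pick the method with the highest mean SWF score across periods.

    scores_by_period: list of dicts mapping method name -> SWF score,
    one dict per sample period (e.g., one per week).
    """
    methods = scores_by_period[0].keys()
    averages = {m: mean(p[m] for p in scores_by_period) for m in methods}
    return max(averages, key=averages.get), averages


# Hypothetical weekly SWF scores for two vote computing methods.
weekly = [
    {"Total Votes": 22, "Weighted Votes": 28},
    {"Total Votes": 30, "Weighted Votes": 27},
    {"Total Votes": 23, "Weighted Votes": 29},
]
winner, averages = best_method_over_periods(weekly)
print(winner, averages)
```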
- For parameterized voting techniques (such as the technique described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2)), the parameter(s) may also be optimized to improve the quality of the system. For example, in use, embodiments may determine (e.g., by searching for or computing) the parameter value that best enables the parameterized voting method to output a combined list that reflects the values, e.g., the business objectives.
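The parameter search described above may, for example, take the form of a simple grid search, as in the following sketch. The parameterization shown (raising each source's vote count to a power `alpha`) and the toy SWF `top1_agreement` are assumptions introduced only for illustration; they are not the technique of the referenced application.

```python
def parameterized_vote(votes, alpha):
    """Combine per-source vote counts into a ranked list.

    Hypothetical parameterization: each source's vote count for an item is
    raised to the power alpha before summing. alpha=1 is plain summation;
    smaller values damp sources that have many data points.
    """
    totals = {}
    for counts in votes.values():
        for item, n in counts.items():
            totals[item] = totals.get(item, 0.0) + n ** alpha
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(totals, key=lambda item: (-totals[item], item))


def top1_agreement(votes, combined):
    """Toy SWF (an assumption): number of sources whose top item wins."""
    return sum(max(c, key=c.get) == combined[0] for c in votes.values())


def tune_alpha(votes, swf, candidates):
    """Grid-search the parameter value that maximizes the SWF score."""
    return max(candidates, key=lambda a: swf(votes, parameterized_vote(votes, a)))


# Hypothetical vote counts; the third source has far more data points.
votes = {
    "S1": {"a": 16, "b": 1},
    "S2": {"a": 9, "b": 4},
    "S3": {"a": 100, "b": 144},
}
print(tune_alpha(votes, top1_agreement, [1.0, 0.5, 0.0]))
```

When several candidate values tie on the SWF score, `max` returns the first one in the candidate list, so the order of candidates acts as a tie-breaking preference.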
- Moreover, the characteristics of many sources change over time. Thus, in an exemplary embodiment, even after a vote counting method is established, the congruency of the method to the business objectives (as embodied by the SWF) is revisited periodically (e.g., quarterly) to make sure that changes in the underlying data sources have not reduced the quality of the results. In certain applications, the additional optimization techniques described are also repeated periodically.
- Thus, an exemplary embodiment of the invention applies voting theory to cross-modality information mashups to construct a combined list ranking preferences. An SWF is used to select from various voting methods based on data from various cross-modality sources. In use, the sources and associated data are dependent on the domain. For example, in the domain of interest of wine, the source or associated data may be results of wine tasting parties, professional reviews (e.g., scores from 1-10 in different categories), sales, change in sales, comments posted by average users, and mentions in mass media.
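As an illustrative sketch of how data from sources of different modality might be normalized into ranked lists before voting, consider the wine example above. All wine names and figures below are hypothetical, as is the helper `to_ranked_list`; the sketch assumes each source can be reduced to one numeric measurement per item.

```python
def to_ranked_list(measurements, higher_is_better=True):
    """Turn {item: numeric measurement} into a list ranked best-first."""
    sign = -1 if higher_is_better else 1
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(measurements, key=lambda item: (sign * measurements[item], item))


# Hypothetical wine data from three sources of different modality.
review_scores = {"Merlot": 8.5, "Syrah": 9.1, "Pinot": 7.0}  # 1-10 scale
units_sold = {"Merlot": 1200, "Syrah": 450, "Pinot": 980}    # sales counts
media_mentions = {"Merlot": 3, "Syrah": 3, "Pinot": 11}      # mention counts

ranked = {
    "reviews": to_ranked_list(review_scores),
    "sales": to_ranked_list(units_sold),
    "media": to_ranked_list(media_mentions),
}
print(ranked)
```

Unstructured sources, such as user comments, would first need to be reduced to such measurements (for example, by counting mentions) before this step applies.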
- FIG. 4 shows a flow diagram of a method 4000 for combining data on preferences in accordance with an embodiment of the invention. At 4010, a social welfare function (SWF) is received (e.g., by communications interface 66 or from memory or storage, as described below). In an exemplary embodiment, the SWF embodies a business objective. At 4020, sources that provide perspective on opinions on the subject area are identified. The sources may be, for example, sales, comments, etc. At 4030, data from these sources are gathered and normalized to create ranked lists of preferences for each source. At 4040, two or more vote computing methods are identified. At 4050, for each of the two or more vote computing methods, the vote computing method is used to combine data on preferences into a combined list ranking the preferences. The data being combined is from a set of two or more sources. In an exemplary embodiment, the set is heterogeneous in modality. In embodiments in which a delegate allocation vote computing method is used, a preliminary distribution of delegates among the sources is determined. For example, in the example shown in FIG. 2D, the following delegate numbers based roughly on population were distributed to the sources: 300 to Bebo, 500 to LastFM, 1000 to MySpace, and 500 to YouTube.
- At 4060, for each combined list, the combined list is inputted into the SWF to compute a score. The score indicates congruency between the combined list and value(s) embodied by the SWF, e.g., a business objective. At 4070, the combined list of the vote computing method associated with the highest score is outputted. In one embodiment, additionally or alternatively, the vote computing method associated with the highest score is outputted.
- FIG. 5 shows a flow diagram of a method 5000 for combining data on preferences in accordance with an embodiment of the invention. At 5010, a social welfare function (SWF) is created. In an exemplary embodiment, the SWF defines business objectives. At 5020, two or more vote computing methods are identified. At 5030, for each of the two or more vote computing methods, the vote computing method is used to combine data on preferences into a combined list ranking the preferences. The data is from a set of two or more sources. In an exemplary embodiment, the set is heterogeneous in modality. At 5040, for each combined list, the combined list is inputted into the SWF to compute a score. The score indicates congruency between the combined list and, for example, the business objective. At 5050, the combined list of the vote computing method associated with the highest score is outputted.
- Although labeled with the numbers above, it should be understood that embodiments of this invention may execute the method 4000 and/or the method 5000 in a non-sequential order as appropriate and still remain in accordance with the invention. For example, although numbered 4010, embodiments of the present invention may receive the social welfare function before, during, or after 4020, 4030, 4040, and/or 4050. Similarly, although numbered 5010, embodiments of the present invention may create the social welfare function before, during, or after 5020 and/or 5030.
- Moreover, while ranked lists are described in detail herein, in other embodiments, the sources provide additional information on preference (such as total numbers) for input into voting systems that can make use of such additional information.
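By way of illustration only, steps 4050 through 4070 of method 4000 can be sketched as follows. The delegate numbers are those stated for FIG. 2D; everything else is an assumption made for this sketch, including the per-source vote counts, the proportional rule by which each source splits its delegates, the overlap-based SWF, and all function names.

```python
# Delegate distribution stated for FIG. 2D.
DELEGATES = {"Bebo": 300, "LastFM": 500, "MySpace": 1000, "YouTube": 500}

# Hypothetical vote counts per source for three artists (not from the figures).
VOTES = {
    "Bebo": {"A": 50, "B": 30, "C": 20},
    "LastFM": {"A": 40, "B": 35, "C": 25},
    "MySpace": {"A": 45, "B": 40, "C": 15},
    "YouTube": {"A": 1000, "B": 2000, "C": 9000},
}


def rank(totals):
    """Order items by descending total, breaking ties alphabetically."""
    return sorted(totals, key=lambda item: (-totals[item], item))


def total_votes(votes):
    """Vote computing method: rank items by the plain sum of votes."""
    totals = {}
    for counts in votes.values():
        for item, n in counts.items():
            totals[item] = totals.get(item, 0) + n
    return rank(totals)


def delegates(votes, allocation=DELEGATES):
    """Vote computing method: each source splits its delegates among items
    in proportion to the votes it cast (an assumed allocation rule)."""
    totals = {}
    for source, counts in votes.items():
        cast = sum(counts.values())
        for item, n in counts.items():
            totals[item] = totals.get(item, 0.0) + allocation[source] * n / cast
    return rank(totals)


def overlap_swf(votes, combined, k=2):
    """Toy SWF (an assumption): top-k overlap between each source's own
    ranking and the combined list, summed over sources."""
    top = set(combined[:k])
    return sum(len(top & set(rank(counts)[:k])) for counts in votes.values())


def select_combined_list(swf, methods, votes):
    """Combine with each method (4050), score each list (4060), and
    return the highest-scoring method and its list (4070)."""
    combined = {name: method(votes) for name, method in methods.items()}
    scores = {name: swf(votes, ranking) for name, ranking in combined.items()}
    best = max(scores, key=scores.get)
    return best, combined[best], scores


best, ranking, scores = select_combined_list(
    overlap_swf, {"Total Votes": total_votes, "Delegates": delegates}, VOTES
)
print(best, ranking, scores)
```

In this hypothetical data set, plain summation is dominated by the source with the most data points, while the delegate allocation caps that source's influence, which is why the two methods yield different combined lists and different SWF scores.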
- Embodiments of the invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, and microcode.
- Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- The medium can be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device). Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, or an optical disk. Current examples of optical disks include compact disk-read-only memory (CD-ROM), compact disk-read/write (CD-R/W), and DVD.
- A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
- Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices) can be coupled to the system either directly or through intervening I/O controllers.
- Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
- FIG. 6 is a high level block diagram showing an information processing system useful for implementing one embodiment of the present invention. The computer system includes one or more processors, such as processor 44. The processor 44 is connected to a communication infrastructure 46 (e.g., a communications bus, cross-over bar, or network). Various software embodiments are described in terms of this exemplary computer system. After reading this description, it will become apparent to a person of ordinary skill in the relevant art(s) how to implement the invention using other computer systems and/or computer architectures.
- The computer system can include a display interface 48 that forwards graphics, text, and other data from the communication infrastructure 46 (or from a frame buffer not shown) for display on a display unit 50. The computer system also includes a main memory 52, preferably random access memory (RAM), and may also include a secondary memory 54. The secondary memory 54 may include, for example, a hard disk drive 56 and/or a removable storage drive 58, representing, for example, a floppy disk drive, a magnetic tape drive, or an optical disk drive. The removable storage drive 58 reads from and/or writes to a removable storage unit 60 in a manner well known to those having ordinary skill in the art. Removable storage unit 60 represents, for example, a floppy disk, a compact disc, a magnetic tape, or an optical disk, etc., which is read by and written to by removable storage drive 58. As will be appreciated, the removable storage unit 60 includes a computer readable medium having stored therein computer software and/or data.
- In alternative embodiments, the secondary memory 54 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system. Such means may include, for example, a removable storage unit 62 and an interface 64. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 62 and interfaces 64 which allow software and data to be transferred from the removable storage unit 62 to the computer system.
- The computer system may also include a communications interface 66. Communications interface 66 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 66 may include a modem, a network interface (such as an Ethernet card), a communications port, or a PCMCIA slot and card, etc. Software and data transferred via communications interface 66 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 66. These signals are provided to communications interface 66 via a communications path (i.e., channel) 68. This channel 68 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
- In this document, the terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 52 and secondary memory 54, removable storage drive 58, and a hard disk installed in hard disk drive 56.
- Computer programs (also called computer control logic) are stored in main memory 52 and/or secondary memory 54. Computer programs may also be received via communications interface 66. Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 44 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.
- FIG. 7 represents an exemplary distributed data processing system in which aspects of the illustrative embodiments may be implemented. Distributed data processing system 100 may include a network of computers in which aspects of the illustrative embodiments may be implemented. The distributed data processing system 100 contains at least one network 102, which is the medium used to provide communication links between various devices and computers connected together within distributed data processing system 100. The network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
- In the depicted example, server 104 and server 106 are connected to network 102 along with storage unit 108. In addition, clients are connected to network 102. Server 104 provides data, such as boot files, operating system images, and applications, to the clients. The clients are clients to server 104 in the depicted example. Distributed data processing system 100 may include additional servers, clients, and other devices not shown.
- In the depicted example, distributed data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, governmental, educational and other computer systems that route data and messages. Of course, the distributed data processing system 100 may also be implemented to include a number of different types of networks, such as, for example, an intranet, a local area network (LAN), a wide area network (WAN), or the like. As stated above, FIG. 7 is intended as an example, not as an architectural limitation for different embodiments of the present invention, and therefore, the particular elements shown in FIG. 7 should not be considered limiting with regard to the environments in which the illustrative embodiments of the present invention may be implemented.
- In one use, as an example, the clients provide information to server 104. Server 104 stores the information in storage 108. Server 106 contains hardware devices and software tools to combine the information (e.g., into information mashups and/or combined/consensus lists) according to the present invention. Server 106 transmits the combined information to server 104 and/or the clients.
- In use, client 114 may provide the server with business requirements embodied in an SWF. The server determines the best vote counting method to use for that particular application based on the SWF and the information stored, e.g., in storage 108. The server may transmit an identification of the vote counting method to the client.
- A reference in the claims to an element in the singular is not intended to mean “one and only” unless explicitly so stated, but rather “one or more.” All structural and functional equivalents to the elements of the above-described exemplary embodiment that are currently known or later come to be known to those of ordinary skill in the art are intended to be encompassed by the present claims. No claim element herein is to be construed under the provisions of 35 U.S.C. section 112, sixth paragraph, unless the element is expressly recited using the phrase “means for” or “step for.”
- Thus, a system and method for determining preferences from information mashups and, in particular, for determining preferences from cross-modality information mashups based on a social welfare function are disclosed. While the preferred embodiments of the present invention have been described, it will be understood that modifications and adaptations to the embodiments shown may occur to one of ordinary skill in the art without departing from the scope of the present invention as set forth in the claims. Thus, the scope of this invention is to be construed according to the claims and not limited by the specific details disclosed in the exemplary embodiments.
Claims (20)
1. A computer-implemented method for determining preferences from cross-modality information mashups, the method comprising:
receiving a social welfare function (SWF);
identifying two or more vote computing methods;
for each of the two or more vote computing methods, using the vote computing method to combine information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
for each combined list, inputting the combined list into the SWF to compute a score; and
outputting the combined list of the vote computing method associated with the highest score.
2. The method of claim 1, wherein the set of two or more sources includes data from websites indicating preferences within a certain domain of interest.
3. The method of claim 1, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
4. The method of claim 3, further comprising processing the unstructured data using natural language mining.
5. The method of claim 1, wherein receiving the SWF comprises receiving a custom constructed SWF based on a set of business objectives.
6. The method of claim 1, wherein the two or more vote computing methods includes a parameterized vote computing method.
7. The method of claim 1, wherein the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
8. A computer program product for determining preferences from cross-modality information, said computer program product comprising:
a computer readable medium;
first program instructions stored on the computer readable medium, the first program instructions to identify two or more vote computing methods;
second program instructions stored on the computer readable medium, the second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
third program instructions stored on the computer readable medium, the third program instructions to, for each combined list, input the combined list into a social welfare function to compute a score; and
fourth program instructions stored on the computer readable medium, the fourth program instructions to output the vote computing method associated with the highest score.
9. The computer program product of claim 8, further comprising
fifth program instructions stored on the computer readable medium, the fifth program instructions to output the combined list of the vote computing method associated with the highest score.
10. The computer program product of claim 8, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
11. The computer program product of claim 10, wherein the second source is selected from the group consisting of: an online blog, an online forum, and an online social networking website.
12. The computer program product of claim 8, wherein the social welfare function is selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
13. The computer program product of claim 8, wherein the social welfare function is a custom constructed social welfare function.
14. The computer program product of claim 8, wherein the two or more vote computing methods includes a parameterized vote computing method.
15. The computer program product of claim 8, wherein the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
16. A system for determining preferences from cross-modality information, the system comprising:
a communications interface;
memory storing computer usable program code; and
a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory; wherein the computer usable program code comprises:
computer usable program code configured to identify two or more vote computing methods;
computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and
computer usable program code configured to identify the vote computing method associated with the highest score.
17. The system of claim 16, wherein the computer usable program code further comprises:
computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
18. The system of claim 16, wherein the set of two or more sources includes data from websites indicating preferences within a certain domain of interest.
19. The system of claim 16, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
20. The system of claim 16, wherein the processor is coupled to the communications interface to receive the information on preferences from a server.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/195,126 US20090248690A1 (en) | 2008-03-31 | 2008-08-20 | System and method for determining preferences from information mashups |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US4112808P | 2008-03-31 | 2008-03-31 | |
US12/195,126 US20090248690A1 (en) | 2008-03-31 | 2008-08-20 | System and method for determining preferences from information mashups |
Publications (1)
Publication Number | Publication Date |
---|---|
US20090248690A1 (en) | 2009-10-01 |
Family ID=41118612
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/195,098 Active 2030-10-23 US8417694B2 (en) | 2008-03-31 | 2008-08-20 | System and method for constructing targeted ranking from multiple information sources |
US12/195,126 Abandoned US20090248690A1 (en) | 2008-03-31 | 2008-08-20 | System and method for determining preferences from information mashups |
Country Status (1)
Country | Link |
---|---|
US (2) | US8417694B2 (en) |
Citations (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6026426A (en) * | 1996-04-30 | 2000-02-15 | International Business Machines Corporation | Application programming interface unifying multiple mechanisms |
US20020103695A1 (en) * | 1998-04-16 | 2002-08-01 | Arnold B. Urken | Methods and apparatus for gauging group choices |
US6718324B2 (en) * | 2000-01-14 | 2004-04-06 | International Business Machines Corporation | Metadata search results ranking system |
US6728704B2 (en) * | 2001-08-27 | 2004-04-27 | Verity, Inc. | Method and apparatus for merging result lists from multiple search engines |
US6763338B2 (en) * | 2002-04-05 | 2004-07-13 | Hewlett-Packard Development Company, L.P. | Machine decisions based on preferential voting techniques |
US6795820B2 (en) * | 2001-06-20 | 2004-09-21 | Nextpage, Inc. | Metasearch technique that ranks documents obtained from multiple collections |
US20040193698A1 (en) * | 2003-03-24 | 2004-09-30 | Sadasivuni Lakshminarayana | Method for finding convergence of ranking of web page |
US20040199419A1 (en) * | 2001-11-13 | 2004-10-07 | International Business Machines Corporation | Promoting strategic documents by bias ranking of search results on a web browser |
US20050097067A1 (en) * | 2003-10-29 | 2005-05-05 | Kirshenbaum Evan R. | System and method for combining valuations of multiple evaluators |
US6901409B2 (en) * | 2001-01-17 | 2005-05-31 | International Business Machines Corporation | Mapping data from multiple data sources into a single software component |
US20050210025A1 (en) * | 2004-03-17 | 2005-09-22 | Dalton Michael E | System and method for predicting the ranking of items |
US20050246322A1 (en) * | 2004-04-30 | 2005-11-03 | Shanmugasundaram Ravikumar | On the role of market economics in ranking search results |
US20050286772A1 (en) * | 2004-06-24 | 2005-12-29 | Lockheed Martin Corporation | Multiple classifier system with voting arbitration |
US20060074290A1 (en) * | 2004-10-04 | 2006-04-06 | Banner Health | Methodologies linking patterns from multi-modality datasets |
US20060184483A1 (en) * | 2005-01-12 | 2006-08-17 | Douglas Clark | Predictive analytic method and apparatus |
US20060190425A1 (en) * | 2005-02-24 | 2006-08-24 | Yuan-Chi Chang | Method for merging multiple ranked lists with bounded memory |
US7107263B2 (en) * | 2000-12-08 | 2006-09-12 | Netrics.Com, Inc. | Multistage intelligent database search method |
US20070016574A1 (en) * | 2005-07-14 | 2007-01-18 | International Business Machines Corporation | Merging of results in distributed information retrieval |
US20070038625A1 (en) * | 1999-05-05 | 2007-02-15 | West Publishing Company | Document-classification system, method and software |
US7188106B2 (en) * | 2001-05-01 | 2007-03-06 | International Business Machines Corporation | System and method for aggregating ranking results from various sources to improve the results of web searching |
US20070112720A1 (en) * | 2005-11-14 | 2007-05-17 | Microsoft Corporation | Two stage search |
US7242810B2 (en) * | 2004-05-13 | 2007-07-10 | Proximex Corporation | Multimodal high-dimensional data fusion for classification and identification |
US7257577B2 (en) * | 2004-05-07 | 2007-08-14 | International Business Machines Corporation | System, method and service for ranking search results using a modular scoring system |
US20070250784A1 (en) * | 2006-03-14 | 2007-10-25 | Workstone Llc | Methods and apparatus to combine data from multiple computer systems for display in a computerized organizer |
US20080027913A1 (en) * | 2006-07-25 | 2008-01-31 | Yahoo! Inc. | System and method of information retrieval engine evaluation using human judgment input |
US7689615B2 (en) * | 2005-02-25 | 2010-03-30 | Microsoft Corporation | Ranking results using multiple nested ranking |
US20100082639A1 (en) * | 2008-09-30 | 2010-04-01 | Microsoft Corporation | Processing maximum likelihood for listwise rankings |
US7716202B2 (en) * | 2003-06-27 | 2010-05-11 | At&T Intellectual Property I, L.P. | Determining a weighted relevance value for each search result based on the estimated relevance value when an actual relevance value was not received for the search result from one of the plurality of search engines |
US20100281023A1 (en) * | 2007-06-29 | 2010-11-04 | Emc Corporation | Relevancy scoring using query structure and data structure for federated search |
US8260787B2 (en) * | 2007-06-29 | 2012-09-04 | Amazon Technologies, Inc. | Recommendation system with multiple integrated recommenders |
2008
- 2008-08-20: US application US12/195,098 (US8417694B2), status: Active
- 2008-08-20: US application US12/195,126 (US20090248690A1), status: Abandoned
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130035988A1 (en) * | 2007-12-14 | 2013-02-07 | John Nicholas And Kristin Gross Trust U/A/D April 13, 2010 | Integrated Gourmet Item Data Collection, Recommender and Vending System and Method |
US8671012B2 (en) * | 2007-12-14 | 2014-03-11 | John Nicholas and Kristin Gross | Methods and systems for promoting items based on event sampling data |
US9037515B2 (en) | 2007-12-14 | 2015-05-19 | John Nicholas and Kristin Gross | Social networking websites and systems for publishing sampling event data |
US10482484B2 (en) | 2007-12-14 | 2019-11-19 | John Nicholas And Kristin Gross Trust U/A/D April 13, 2010 | Item data collection systems and methods with social network integration |
US20110060628A1 (en) * | 2009-09-03 | 2011-03-10 | Olaf STOERMER | Method for assessing candidates by voting and a system intended for this purpose and a program product comprising a computer-readable medium |
US20160179849A1 (en) * | 2014-12-22 | 2016-06-23 | Verizon Patent And Licensing Inc. | Machine to machine data aggregator |
US10275476B2 (en) * | 2014-12-22 | 2019-04-30 | Verizon Patent And Licensing Inc. | Machine to machine data aggregator |
CN109828460A (en) * | 2019-01-21 | 2019-05-31 | 南京理工大学 | A kind of consistent control method of output for two-way heterogeneous multi-agent system |
CN110390535A (en) * | 2019-06-25 | 2019-10-29 | 阿里巴巴集团控股有限公司 | Customer complaint object determines method, apparatus, electronic equipment and readable storage medium storing program for executing |
CN113886723A (en) * | 2021-09-09 | 2022-01-04 | 盐城金堤科技有限公司 | Method and device for determining sequencing stability, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
US8417694B2 (en) | 2013-04-09 |
US20090248614A1 (en) | 2009-10-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: BHAGWAN, VARUN; GRANDISON, TYRONE WILBERFORCE ANDRE; GRUHL, DANIEL FREDERICK; AND OTHERS; REEL/FRAME: 021758/0836; SIGNING DATES FROM 20080819 TO 20080820 |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION |