EP1643377A2 - Merging data from multiple data sources for use in an electronic document

Info

Publication number
EP1643377A2
Authority
EP
European Patent Office
Prior art keywords
data source
recipient list
field
data
subsequently added
Legal status
Withdrawn
Application number
EP05108491A
Other languages
German (de)
French (fr)
Other versions
EP1643377A3 (en
Inventor
Sumi N. Singh
Juraj Gottweis
John E. Dimmic, Jr.
Tara M. Kraft
Current Assignee
Microsoft Corp
Original Assignee
Microsoft Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/107 Computer-aided management of electronic mailing [e-mailing]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs

Definitions

  • Modern desktop publishing applications enable a user to readily create electronic documents.
  • One feature available in many desktop publishing applications for creating electronic documents is known as "mail merge."
  • Mail merge automates the process of creating variable data documents by allowing users to connect to and merge data from a single data source.
  • data is pulled from a single data source (such as a mailing list) and inserted at marked locations in a document.
  • One drawback associated with the mail merge feature offered by modern desktop publishing applications is that users often store data (such as mailing lists) in disparate formats and in multiple locations in a computer system. For instance, a user may store one list of contacts as a contacts file which is readable by a contact manager program while another list of contacts may be stored as a spreadsheet file which is readable by a spreadsheet program. Thus, users are often required to assemble data in disparate formats from multiple sources into a single data source to utilize the mail merge feature offered by desktop publishing applications.
  • the method includes receiving field names and field data from an initial data source, mapping field names from a subsequently added data source to the initial data source, and building a recipient list schema based on the field names from the initial data source and the mapped field names from the subsequently added data source.
  • the recipient list schema defines the relationships between the field names in the recipient list and the field names in the initial and subsequently added data sources.
  • the building of the recipient list includes comparing the mapped field names from the subsequently added data source to the field names from the initial data source and, if any of the mapped field names from the subsequently added data source do not correspond to the field names from the initial data source, then the method includes adding the mapped field names.
  • the method further includes creating a recipient list according to the recipient list schema, and saving a file that allows for the recipient list to be re-created.
  • the creation of the recipient list according to the recipient list schema may include adding rows of field data from the subsequently added data source to a temporary recipient list to create a master data source and calculating a hash value for each row of field data in the master data source.
  • the saved updated recipient list file may include a reference to the initial data source, a reference to the subsequently added data source, and the hash value calculated for each of the rows of field data in the master data source.
  • the method may further include modifying the recipient list to modify the field data in the initial data source and the field data in the subsequently added data source and resolving duplicates between the initial data source and the subsequently added data source in the recipient list.
  • the invention may be implemented as a computer process, a computing system, or as an article of manufacture such as a computer program product or computer readable media.
  • the computer program product may be a computer storage media readable by a computer system and encoding a computer program of instructions for executing a computer process.
  • the computer program product may also be a propagated signal on a carrier readable by a computing system and encoding a computer program of instructions for executing a computer process.
  • FIGURE 1 and the corresponding discussion are intended to provide a brief, general description of a suitable computing environment in which embodiments of the invention may be implemented. While the invention will be described in the general context of program modules that execute in conjunction with program modules that run on an operating system on a personal computer, those skilled in the art will recognize that the invention may also be implemented in combination with other types of computer systems and program modules.
  • program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types.
  • program modules may be located in both local and remote memory storage devices.
  • FIGURE 1 an illustrative computer architecture for a computer 2 utilized in the various embodiments of the invention will be described.
  • the computer architecture shown in FIGURE 1 illustrates a conventional desktop or laptop computer, including a central processing unit 5 ("CPU"), a system memory 7, including a random access memory 9 ("RAM") and a read-only memory ("ROM") 11, and a system bus 12 that couples the memory to the CPU 5.
  • the computer 2 further includes a mass storage device 14 for storing an operating system 16, application programs, and other program modules, which will be described in greater detail below.
  • the mass storage device 14 is connected to the CPU 5 through a mass storage controller (not shown) connected to the bus 12.
  • the mass storage device 14 and its associated computer-readable media provide non-volatile storage for the computer 2.
  • computer-readable media can be any available media that can be accessed by the computer 2.
  • Computer-readable media may comprise computer storage media and communication media.
  • Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, digital versatile disks (“DVD”), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 2.
  • the computer 2 may operate in a networked environment using logical connections to remote computers through a network 18, such as the Internet.
  • the computer 2 may connect to the network 18 through a network interface unit 20 connected to the bus 12.
  • the network interface unit 20 may also be utilized to connect to other types of networks and remote computer systems.
  • the computer 2 may also include an input/output controller 22 for receiving and processing input from a number of other devices, including a keyboard, mouse, or electronic stylus (not shown in FIGURE 1).
  • an input/output controller 22 may provide output to a display screen, a printer, or other type of output device.
  • a number of program modules and data files may be stored in the mass storage device 14 and RAM 9 of the computer 2, including an operating system 16 suitable for controlling the operation of a networked personal computer, such as the WINDOWS XP operating system from MICROSOFT CORPORATION of Redmond, Washington.
  • the mass storage device 14 and RAM 9 may also store one or more program modules.
  • the mass storage device 14 and the RAM 9 may store a desktop publishing application 10.
  • the desktop publishing application 10 is operative to provide functionality for creating and editing electronic documents.
  • the desktop publishing application 10 comprises the PUBLISHER word processing application program from MICROSOFT CORPORATION.
  • desktop publishing applications from other manufacturers may be utilized to embody the various aspects of the present invention. It should further be appreciated that the various aspects of the present invention are not limited to desktop publishing applications but may also utilize other programs which are capable of processing text, such as the WORD program from MICROSOFT CORPORATION as well as spreadsheet programs and database programs.
  • the desktop publishing application 10 provides functionality for allowing a user to merge data sources 26 (Data Source 1), 28 (Data Source 2), and 30 (Data Source N) into various locations of an electronic document.
  • the desktop publishing application 10 creates a recipient list file 24 for storing the merged data.
  • An illustrative method utilized by the desktop publishing application 10 for merging the data sources 26 (Data Source 1), 28 (Data Source 2), and 30 (Data Source N) to create the recipient list file 24 will be described in greater detail with respect to FIGURE 2, below.
  • FIGURE 2 an illustrative routine 200 will be described illustrating a process performed by the desktop publishing application 10 for merging data from multiple data sources. It should be appreciated that although the embodiments of the invention described herein are presented in the context of the desktop publishing application 10, the invention may be utilized in other types of application programs that support text processing, such as word processing, spreadsheet, and database programs.
  • the routine 200 begins at operation 210, wherein the desktop publishing application 10 receives data sources 26 (Data Source 1), 28 (Data Source 2), and 30 (Data Source N) and "maps" the fields in each of the data sources.
  • the desktop publishing application 10 may retrieve a first data source as an initial data source, associate fields from each subsequent retrieved data source with the fields of the initial data source, and add fields in the subsequent data source which do not appear in the initial data source.
  • the data source 26 may include data fields "First Name" and "Last Name."
  • data source 28 may include data fields "First Name," "Last Name," and "Title."
  • the word processing application program may designate the data source 26 as the initial data source and map the common fields (i.e., "First Name" and "Last Name") shared by the data sources as well as add the unique field "Title" from the data source 28.
  • a user of the desktop publishing application 10 may designate the fields from subsequent data sources (e.g., the data source 28) to be mapped or added to the initial data source. It will further be appreciated that mapped fields may also be un-mapped.
  • the routine 200 continues from operation 210 to operation 220, where the desktop publishing application 10 builds a recipient list (i.e., a master data source) schema from the mapped fields from data sources 26, 28, and 30.
  • a "schema" defines the field names mapped from each of the input data sources.
  • a recipient list schema for the data source 26 and the data source 28 may include the fields "First Name" and "Last Name" (from the data sources 26 and 28) as well as the field "Title" (from the data source 28 only).
  • the routine 200 continues from operation 220 to operation 230 where the desktop publishing application 10 creates a temporary recipient list from the recipient list schema.
  • the desktop publishing application 10 may create a table of fields from the recipient list schema to receive data associated with the fields from each of the input data sources.
  • the routine 200 continues from operation 230 to operation 240 where the desktop publishing application 10 retrieves one or more rows of data from each of the input data sources to create a recipient list.
  • the desktop publishing application 10 retrieves data from each data source and fills the temporary recipient list according to the recipient list schema.
  • the routine 200 continues from operation 240 to operation 250 where the desktop publishing application 10 creates a hash for each row in the recipient list.
  • a hash is a number generated from a string of text which may be used to access data methods.
  • the hash (or hash value) is generated by a formula in such a way that it is extremely unlikely that some other text will produce the same hash value.
  • Various methods for generating hash values are well-known to those skilled in the art, and therefore not discussed in further detail herein. It will be appreciated that the hash serves as a link to data in the input data sources and may be used to distinguish the merged data in the recipient list.
  • the routine 200 continues from operation 250 to operation 260 where the desktop publishing application 10 saves the recipient list to a file (such as the recipient list file 24). It will be appreciated that in saving the recipient list, a file is created which allows for the recipient list to be re-created. It will further be appreciated that the recipient list file 24 may include a reference to the initial data source, a reference to each subsequent or added data source, and the hash value calculated for each of the rows of field data in the recipient list.
  • the routine 200 continues from operation 260 to operation 270 where the desktop publishing application 10 modifies the recipient list in response to input from a user. In particular, a user may update data in the recipient list by changing or removing data.
  • the modification in the recipient list also modifies the data in the supplying data source. For instance, a user modifying last name data in the recipient list retrieved from the data source 26 will result in the modification of the same last name data in the data source 26. It will be appreciated that in modifying data in the recipient list, the desktop publishing application 10 may reference the hash value calculated for each modified row of data in the recipient list to locate and update the corresponding data in the affected data sources.
  • the routine 200 continues from operation 270 to operation 280 where the desktop publishing application 10 resolves duplicate data in the recipient list.
  • the data sources 26, 28, and 30 may include identical data which may show up as duplicate entries in the recipient list. For instance, a user may have name and address information for the same person listed in two different data sources.
  • the desktop publishing application 10 may be configured to locate duplicate data entries by creating a hash table of the hash values from the recipient list and comparing the data entries which two hash values point to.
  • duplicates may be found by comparing every row with every other row. The hash value calculated for each row is stored in memory. It should be understood that one hash value is stored for every column in every row. The hash values are then compared. When two data entries match, their hash values will also match.
  • FIGURE 3 shows a user interface window 300 including a number of user interface components for mapping fields from a data source.
  • a window 40 displays fields from a data source which may be mapped to a recipient list schema by dragging them from the window 40 and dropping them into the window 42 or into the window 44. Fields added to the window 44 will be added as a new column in the recipient list schema.
  • a mapping may be undone by dragging fields from the window 42 or 44 to the window 40.
  • a default map button 46 is provided for selecting a default mapping of the fields from the data source which is automatically determined by the desktop publishing application 10.
  • FIGURE 4 shows a user interface window 400 displaying a recipient list along with rows of mapped fields and associated data.
  • the window 52 identifies the data sources used to create the recipient list.
  • Links 54 and 56 are provided for adding a data source to the recipient list from a file, database, or from a contacts list.
  • Link 58 is provided for creating a data source by typing a new list.
  • FIGURE 5 shows a user interface window 500 displaying a data source table 60 along with rows of field data.
  • An add new entry button 62 is provided for adding a new entry to the data source table 60 and a delete entry button 64 is provided for deleting a displayed entry in the data source table 60.
  • the user interface window 500 may include mapped data or copied data from a recipient list which is linked to an original data source stored in another location such that changes will be propagated back to the original data source.
  • FIGURE 6 shows a user interface 600 displaying a data source table 70 listing duplicate data entries.
  • a column of check boxes 72 is provided for allowing a user to select entries to be removed from the recipient list.
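
The row hashing (operation 250) and duplicate resolution (operation 280) described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source names no hash algorithm, so SHA-256 is an arbitrary choice here, the sketch stores one hash per row, and the function names `row_hash`, `row_values`, and `find_duplicates` as well as the sample rows are hypothetical.

```python
import hashlib
from collections import defaultdict


def row_values(row, field_order):
    """Return the row's values as a tuple in schema order (missing fields are empty)."""
    return tuple(str(row.get(name, "")) for name in field_order)


def row_hash(row, field_order):
    """Hash one row of field data. SHA-256 is an assumption, not taken from the source."""
    # Join with a unit-separator character so ("ab", "c") and ("a", "bc") differ.
    payload = "\x1f".join(row_values(row, field_order))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_duplicates(rows, field_order):
    """Bucket row indexes by hash, then confirm by comparing the entries themselves."""
    buckets = defaultdict(list)
    for index, row in enumerate(rows):
        buckets[row_hash(row, field_order)].append(index)

    duplicates = []
    for indexes in buckets.values():
        first = indexes[0]
        for other in indexes[1:]:
            # Equal hashes only suggest a match; compare the actual field data.
            if row_values(rows[first], field_order) == row_values(rows[other], field_order):
                duplicates.append((first, other))
    return duplicates


# Made-up sample rows for illustration only.
fields = ["First Name", "Last Name"]
rows = [
    {"First Name": "Ann", "Last Name": "Lee"},
    {"First Name": "Bo", "Last Name": "Kim"},
    {"First Name": "Ann", "Last Name": "Lee"},
]
duplicate_pairs = find_duplicates(rows, fields)
```

Bucketing by hash avoids comparing every row with every other row, and the final field-by-field comparison guards against the unlikely case of two different rows sharing a hash value.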

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Economics (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A method, apparatus, and computer-readable medium are provided for merging data from multiple data sources for use in an electronic document. The method includes receiving field names and field data from an initial data source, mapping field names from a subsequently added data source to the initial data source, building a recipient list schema based on the field names from the initial data source and the mapped field names from the subsequently added data source, creating a recipient list according to the recipient list schema, and saving the recipient list to a file. The recipient list schema defines the relationships between the field names in the recipient list and the field names in the initial and subsequently added data sources.
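
As a rough illustration of the schema-building and list-creation steps summarized in the abstract, the following Python sketch builds a schema from an initial source plus mapped fields from an added source, then fills a recipient list from both. The dictionary layout, the function names `build_schema` and `create_recipient_list`, the source identifiers `"initial"` and `"added"`, and the sample rows are all assumptions made for illustration; the source does not specify any data structure.

```python
def build_schema(initial_fields, mapped_fields_by_source):
    """Build a recipient list schema.

    initial_fields: field names of the initial data source.
    mapped_fields_by_source: {source_id: {source_field: recipient_field}}
        for each subsequently added source (hypothetical structure).
    Returns {recipient_field: {source_id: source_field}}, keeping the
    initial source's column order and appending new fields after it.
    """
    schema = {name: {"initial": name} for name in initial_fields}
    for source_id, mapping in mapped_fields_by_source.items():
        for source_field, recipient_field in mapping.items():
            # A mapped name that matches no existing column becomes a new column.
            schema.setdefault(recipient_field, {})[source_id] = source_field
    return schema


def create_recipient_list(schema, rows_by_source):
    """Fill a table shaped by the schema; fields a source lacks stay empty."""
    recipient_list = []
    for source_id, rows in rows_by_source.items():
        for row in rows:
            merged = {}
            for recipient_field, origins in schema.items():
                source_field = origins.get(source_id)
                merged[recipient_field] = row.get(source_field, "") if source_field else ""
            recipient_list.append(merged)
    return recipient_list


schema = build_schema(
    ["First Name", "Last Name"],
    {"added": {"First Name": "First Name", "Last Name": "Last Name", "Title": "Title"}},
)
# Made-up sample rows for illustration only.
recipients = create_recipient_list(
    schema,
    {
        "initial": [{"First Name": "Ann", "Last Name": "Lee"}],
        "added": [{"First Name": "Bo", "Last Name": "Kim", "Title": "Dr."}],
    },
)
```

Because the schema records which source field feeds each recipient-list column, it captures the relationship between the merged field names and the field names in each source.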

Description

    BACKGROUND OF THE INVENTION
  • Modern desktop publishing applications enable a user to readily create electronic documents. One feature available in many desktop publishing applications for creating electronic documents is known as "mail merge." Mail merge automates the process of creating variable data documents by allowing users to connect to and merge data from a single data source. In utilizing the mail merge feature, data is pulled from a single data source (such as a mailing list) and inserted at marked locations in a document.
  • One drawback associated with the mail merge feature offered by modern desktop publishing applications is that users often store data (such as mailing lists) in disparate formats and in multiple locations in a computer system. For instance, a user may store one list of contacts as a contacts file which is readable by a contact manager program while another list of contacts may be stored as a spreadsheet file which is readable by a spreadsheet program. Thus, users are often required to assemble data in disparate formats from multiple sources into a single data source to utilize the mail merge feature offered by desktop publishing applications.
  • It is with respect to these considerations and others that the various embodiments of the present invention have been made.
  • BRIEF SUMMARY OF THE INVENTION
  • In accordance with the present invention, the above and other problems are solved by a method, system, and computer-readable medium for merging data from multiple data sources for use in an electronic document. According to one aspect of the invention, the method includes receiving field names and field data from an initial data source, mapping field names from a subsequently added data source to the initial data source, and building a recipient list schema based on the field names from the initial data source and the mapped field names from the subsequently added data source. The recipient list schema defines the relationships between the field names in the recipient list and the field names in the initial and subsequently added data sources. The building of the recipient list includes comparing the mapped field names from the subsequently added data source to the field names from the initial data source and, if any of the mapped field names from the subsequently added data source do not correspond to the field names from the initial data source, then the method includes adding the mapped field names.
  • The method further includes creating a recipient list according to the recipient list schema, and saving a file that allows for the recipient list to be re-created. The creation of the recipient list according to the recipient list schema may include adding rows of field data from the subsequently added data source to a temporary recipient list to create a master data source and calculating a hash value for each row of field data in the master data source. The saved updated recipient list file may include a reference to the initial data source, a reference to the subsequently added data source, and the hash value calculated for each of the rows of field data in the master data source. The method may further include modifying the recipient list to modify the field data in the initial data source and the field data in the subsequently added data source and resolving duplicates between the initial data source and the subsequently added data source in the recipient list.
  • The invention may be implemented as a computer process, a computing system, or as an article of manufacture such as a computer program product or computer readable media. The computer program product may be a computer storage media readable by a computer system and encoding a computer program of instructions for executing a computer process. The computer program product may also be a propagated signal on a carrier readable by a computing system and encoding a computer program of instructions for executing a computer process.
  • These and various other features, as well as advantages, which characterize the present invention, will be apparent from a reading of the following detailed description and a review of the associated drawings.
  • BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
    • FIGURE 1 is a computer system architecture diagram illustrating a computer system utilized in and provided by the various embodiments of the invention;
    • FIGURE 2 is an illustrative routine performed by a desktop publishing application in the computer system of FIGURE 1 for merging data from multiple data sources, according to an illustrative embodiment of the invention; and
    • FIGURES 3-6 are screen diagrams illustrating an aspect of the invention for providing a facility through which a user may merge data and manage merged data from multiple data sources, according to the various embodiments of the invention.
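
The summary above describes a saved file that may include a reference to each data source and the hash value calculated for each row, so that the recipient list can be re-created. A minimal sketch of such a file follows. The JSON layout, the key names, the function names, and the file names `contacts.csv` and `customers.csv` are hypothetical; the source does not define a file format, and this sketch stores only the references and hashes.

```python
import json
import os
import tempfile


def save_recipient_list_file(path, source_refs, row_hashes):
    """Write source references and per-row hashes to a JSON file (assumed format)."""
    document = {
        "initial_source": source_refs[0],   # first reference is the initial data source
        "added_sources": source_refs[1:],   # every subsequently added data source
        "row_hashes": row_hashes,           # one hash per row of the master data source
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(document, handle, indent=2)


def load_recipient_list_file(path):
    """Read the file back; the caller would re-read the referenced sources."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Round-trip through a temporary directory with placeholder values.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "recipients.json")
    save_recipient_list_file(path, ["contacts.csv", "customers.csv"], ["a1", "b2"])
    restored = load_recipient_list_file(path)
```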
    DETAILED DESCRIPTION OF THE INVENTION
  • Referring now to the drawings, in which like numerals represent like elements, various aspects of the present invention will be described. In particular, FIGURE 1 and the corresponding discussion are intended to provide a brief, general description of a suitable computing environment in which embodiments of the invention may be implemented. While the invention will be described in the general context of program modules that execute in conjunction with program modules that run on an operating system on a personal computer, those skilled in the art will recognize that the invention may also be implemented in combination with other types of computer systems and program modules.
  • Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including handheld devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
  • Referring now to FIGURE 1, an illustrative computer architecture for a computer 2 utilized in the various embodiments of the invention will be described. The computer architecture shown in FIGURE 1 illustrates a conventional desktop or laptop computer, including a central processing unit 5 ("CPU"), a system memory 7, including a random access memory 9 ("RAM") and a read-only memory ("ROM") 11, and a system bus 12 that couples the memory to the CPU 5. A basic input/output system containing the basic routines that help to transfer information between elements within the computer, such as during startup, is stored in the ROM 11. The computer 2 further includes a mass storage device 14 for storing an operating system 16, application programs, and other program modules, which will be described in greater detail below.
  • The mass storage device 14 is connected to the CPU 5 through a mass storage controller (not shown) connected to the bus 12. The mass storage device 14 and its associated computer-readable media provide non-volatile storage for the computer 2. Although the description of computer-readable media contained herein refers to a mass storage device, such as a hard disk or CD-ROM drive, it should be appreciated by those skilled in the art that computer-readable media can be any available media that can be accessed by the computer 2.
  • By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, digital versatile disks ("DVD"), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 2.
  • According to various embodiments of the invention, the computer 2 may operate in a networked environment using logical connections to remote computers through a network 18, such as the Internet. The computer 2 may connect to the network 18 through a network interface unit 20 connected to the bus 12. It should be appreciated that the network interface unit 20 may also be utilized to connect to other types of networks and remote computer systems. The computer 2 may also include an input/output controller 22 for receiving and processing input from a number of other devices, including a keyboard, mouse, or electronic stylus (not shown in FIGURE 1). Similarly, an input/output controller 22 may provide output to a display screen, a printer, or other type of output device.
  • As mentioned briefly above, a number of program modules and data files may be stored in the mass storage device 14 and RAM 9 of the computer 2, including an operating system 16 suitable for controlling the operation of a networked personal computer, such as the WINDOWS XP operating system from MICROSOFT CORPORATION of Redmond, Washington. The mass storage device 14 and RAM 9 may also store one or more program modules. In particular, the mass storage device 14 and the RAM 9 may store a desktop publishing application 10. As known to those skilled in the art, the desktop publishing application 10 is operative to provide functionality for creating and editing electronic documents. According to one embodiment of the invention, the desktop publishing application 10 comprises the PUBLISHER word processing application program from MICROSOFT CORPORATION. It should be appreciated, however, that desktop publishing applications from other manufacturers may be utilized to embody the various aspects of the present invention. It should further be appreciated that the various aspects of the present invention are not limited to desktop publishing applications but may also utilize other programs which are capable of processing text, such as the WORD program from MICROSOFT CORPORATION as well as spreadsheet programs and database programs.
  • In conjunction with the creation of an electronic document, the desktop publishing application 10 provides functionality for allowing a user to merge data sources 26 (Data Source 1), 28 (Data Source 2), and 30 (Data Source N) into various locations of an electronic document. It will be appreciated that each of the data sources 26, 28, and 30 may be a list or table of data divided into one or more fields. For instance, a data source may store contact information including names, companies, and addresses in data fields "Last Name," "First Name," "Title," "Company Name," and "Address." It should be understood that in merging multiple data sources, the desktop publishing application 10 creates a recipient list file 24 for storing the merged data. An illustrative method utilized by the desktop publishing application 10 for merging the data sources 26 (Data Source 1), 28 (Data Source 2), and 30 (Data Source N) to create the recipient list file 24 will be described in greater detail with respect to FIGURE 2, below.
  • Referring now to FIGURE 2, an illustrative routine 200 will be described illustrating a process performed by the desktop publishing application 10 for merging data from multiple data sources. It should be appreciated that although the embodiments of the invention described herein are presented in the context of the desktop publishing application 10, the invention may be utilized in other types of application programs that support text processing, such as word processing, spreadsheet, and database programs.
  • When reading the discussion of the routines presented herein, it should be appreciated that the logical operations of various embodiments of the present invention are implemented (1) as a sequence of computer implemented acts or program modules running on a computing system and/or (2) as interconnected machine logic circuits or circuit modules within the computing system. The implementation is a matter of choice dependent on the performance requirements of the computing system implementing the invention. Accordingly, the logical operations illustrated in FIGURE 2, and making up the embodiments of the present invention described herein are referred to variously as operations, structural devices, acts or modules. It will be recognized by one skilled in the art that these operations, structural devices, acts and modules may be implemented in software, in firmware, in special purpose digital logic, and any combination thereof without deviating from the spirit and scope of the present invention as recited within the claims set forth herein.
  • Referring now to FIGURE 2, the routine 200 begins at operation 210, wherein the desktop publishing application 10 receives data sources 26 (Data Source 1), 28 (Data Source 2), and 30 (Data Source N) and "maps" the fields in each of the data sources. In particular, in mapping the fields in each of the data sources, the desktop publishing application 10 may retrieve a first data source as an initial data source, associate fields from each subsequently retrieved data source with the fields of the initial data source, and add fields in the subsequent data source which do not appear in the initial data source. For instance, the data source 26 (Data Source 1) may include data fields "First Name" and "Last Name," while data source 28 (Data Source 2) may include data fields "First Name," "Last Name," and "Title." In mapping the data fields, the desktop publishing application 10 may designate the data source 26 as the initial data source and map the common fields (i.e., "First Name" and "Last Name") shared by the data sources as well as add the unique field "Title" from the data source 28. It will be appreciated that in the various embodiments of the invention, a user of the desktop publishing application 10 may designate the fields from subsequent data sources (e.g., the data source 28) to be mapped or added to the initial data source. It will further be appreciated that mapped fields may also be un-mapped.
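  • By way of illustration only, the mapping of operation 210 may be sketched as follows. The function name `map_fields` and the case-insensitive matching rule are assumptions made for this example and are not taken from the described embodiments, which leave the matching rule open and also permit a user to designate the mapping.

```python
from typing import Dict, List, Tuple


def map_fields(
    initial_fields: List[str], added_fields: List[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Map an added data source's fields onto the initial data source.

    Returns (mapping, new_fields). `mapping` pairs each shared field of the
    added source with the matching field of the initial source; `new_fields`
    lists the fields that appear only in the added source.
    """
    # Assumption: two field names match when they are equal after trimming
    # whitespace and ignoring case.
    by_key = {name.strip().lower(): name for name in initial_fields}
    mapping: Dict[str, str] = {}
    new_fields: List[str] = []
    for name in added_fields:
        key = name.strip().lower()
        if key in by_key:
            mapping[name] = by_key[key]
        else:
            new_fields.append(name)
    return mapping, new_fields


# Data Source 1 is the initial source; Data Source 2 contributes "Title".
mapping, new_fields = map_fields(
    ["First Name", "Last Name"], ["First Name", "Last Name", "Title"]
)
```

  • In this sketch, un-mapping a field would amount to deleting its entry from `mapping`.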
  • The routine 200 continues from operation 210 to operation 220, where the desktop publishing application 10 builds a recipient list (i.e., a master data source) schema from the mapped fields from data sources 26, 28, and 30. As defined herein, a "schema" defines the field names mapped from each of the input data sources. For instance, a recipient list schema for the data source 26 and the data source 28 may include the fields "First Name," and "Last Name" (from the data sources 26 and 28) as well as the field "Title" (from the data source 28 only).
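  • As a minimal sketch of operation 220, the recipient list schema may be modeled as an ordered union of field names, with the fields of the initial data source first. The name `build_schema` and the use of exact name equality are assumptions for illustration only.

```python
from typing import Iterable, List


def build_schema(field_lists: Iterable[List[str]]) -> List[str]:
    """Return the ordered union of field names across all data sources.

    The first list is treated as the initial data source, so its fields
    come first; fields unique to later sources are appended in order.
    """
    schema: List[str] = []
    for fields in field_lists:
        for name in fields:
            if name not in schema:
                schema.append(name)
    return schema


schema = build_schema(
    [["First Name", "Last Name"], ["First Name", "Last Name", "Title"]]
)
```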
  • The routine 200 continues from operation 220 to operation 230 where the desktop publishing application 10 creates a temporary recipient list from the recipient list schema. In particular, the desktop publishing application 10 may create a table of fields from the recipient list schema to receive data associated with the fields from each of the input data sources. The routine 200 continues from operation 230 to operation 240 where the desktop publishing application 10 retrieves one or more rows of data from each of the input data sources to create a recipient list. In particular, the desktop publishing application 10 retrieves data from each data source and fills the temporary recipient list according to the recipient list schema.
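  • Operations 230 and 240 may be sketched, under stated assumptions, as filling an empty table whose columns follow the schema. The name `fill_recipient_list`, the `_source` provenance column, and the use of an empty string for missing fields are all hypothetical choices made for this example.

```python
from typing import Dict, List

Row = Dict[str, str]


def fill_recipient_list(schema: List[str], sources: Dict[str, List[Row]]) -> List[Row]:
    """Copy rows from every data source into one list shaped by the schema."""
    recipient_list: List[Row] = []
    for source_name, rows in sources.items():
        for row in rows:
            # Fields a source does not supply are left empty (assumption).
            merged = {field: row.get(field, "") for field in schema}
            # Hypothetical provenance column recording where the row came from.
            merged["_source"] = source_name
            recipient_list.append(merged)
    return recipient_list


recipients = fill_recipient_list(
    ["First Name", "Last Name", "Title"],
    {
        "Data Source 1": [{"First Name": "Ann", "Last Name": "Lee"}],
        "Data Source 2": [{"First Name": "Bob", "Last Name": "Ray", "Title": "Dr."}],
    },
)
```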
  • The routine 200 continues from operation 240 to operation 250 where the desktop publishing application 10 creates a hash for each row in the recipient list. As is known to those skilled in the art, a hash is a number generated from a string of text which may be used to access data. The hash (or hash value) is generated by a formula in such a way that it is extremely unlikely that some other text will produce the same hash value. Various methods for generating hash values are well-known to those skilled in the art, and therefore not discussed in further detail herein. It will be appreciated that the hash serves as a link to data in the input data sources and may be used to distinguish the merged data in the recipient list.
  • The routine 200 continues from operation 250 to operation 260 where the desktop publishing application 10 saves the recipient list to a file (such as the recipient list file 24). It will be appreciated that in saving the recipient list, a file is created which allows for the recipient list to be re-created. It will further be appreciated that the recipient list file 24 may include a reference to the initial data source, a reference to each subsequent or added data source, and the hash value calculated for each of the rows of field data in the recipient list. The routine 200 continues from operation 260 to operation 270 where the desktop publishing application 10 modifies the recipient list in response to input from a user. In particular, a user may update data in the recipient list by changing or removing data. It will be appreciated that the modification in the recipient list also modifies the data in the supplying data source. For instance, a user modifying last name data in the recipient list retrieved from the data source 26 will result in the modification of the same last name data in the data source 26. It will be appreciated that in modifying data in the recipient list, the desktop publishing application 10 may reference the hash value calculated for each modified row of data in the recipient list to locate and update the corresponding data in the affected data sources.
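  • The description does not name a particular hash function. The following sketch assumes SHA-256 from the Python standard library and a hypothetical helper `row_hash` that hashes a row's values in schema order.

```python
import hashlib
from typing import Dict, List


def row_hash(row: Dict[str, str], schema: List[str]) -> str:
    """Return a hex digest computed from a row's values in schema order."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc");
    # it is assumed not to occur in the field data itself.
    text = "\x1f".join(row.get(field, "") for field in schema)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


schema = ["First Name", "Last Name", "Title"]
digest = row_hash({"First Name": "Ann", "Last Name": "Lee"}, schema)
```

  • Because the digest depends only on the row's content, the same row read again from its data source yields the same value, which is what allows the hash to act as a link back to that data.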
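  • The format of the recipient list file 24 is not specified beyond its contents. As one possible sketch, the references and row hashes could be serialized as JSON; the key names, the function names, and the example file names below are hypothetical.

```python
import json
from typing import List


def serialize_recipient_list(
    initial_source: str, added_sources: List[str], row_hashes: List[str]
) -> str:
    """Serialize the data needed to re-create a recipient list.

    Only references to the data sources and the per-row hash values are
    stored; the field data itself stays in the original data sources.
    """
    payload = {
        "initial_source": initial_source,
        "added_sources": added_sources,
        "row_hashes": row_hashes,
    }
    return json.dumps(payload, indent=2)


def save_recipient_list(path: str, serialized: str) -> None:
    """Write the serialized recipient list to the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(serialized)


# Hypothetical source references and shortened hash values.
text = serialize_recipient_list("contacts.csv", ["clients.csv"], ["3f2a", "9b7c"])
```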
  • The routine 200 continues from operation 270 to operation 280 where the desktop publishing application 10 resolves duplicate data in the recipient list. In particular, it will be appreciated that the data sources 26, 28, and 30 may include identical data which may show up as duplicate entries in the recipient list. For instance, a user may have name and address information for the same person listed in two different data sources.
  • It will be appreciated that in one embodiment, the desktop publishing application 10 may be configured to locate duplicate data entries by creating a hash table of the hash values from the recipient list and comparing the data entries which two hash values point to. In the various illustrative embodiments of the invention, duplicates may be found by comparing every row with every other row. The hash value calculated for each row is stored in memory. It should be understood that one hash value is stored for every column in every row. The hash values are then compared. When two data entries match, their hash values will also match. When two rows satisfy predetermined duplicate rules (e.g., 75% of the data in two rows must match), they are loaded from the master data source and the actual data entries are compared by a user to determine if they are in fact duplicates of one another. The routine 200 then ends.
  • Referring now to FIGURE 3, an illustrative user interface will be described for allowing a user to map a data source to a recipient list schema. FIGURE 3 shows a user interface window 300 including a number of user interface components for mapping fields from a data source. In particular, a window 40 displays fields from a data source which may be mapped to a recipient list schema by dragging them from the window 40 and dropping them into the window 42 or into the window 44. Fields added to the window 44 will be added as a new column in the recipient list schema. A mapping may be undone by dragging fields from the window 42 or 44 to the window 40. A default map button 46 is provided for selecting a default mapping of the fields from the data source which is automatically determined by the desktop publishing application 10.
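  • The pairwise comparison described above may be sketched as follows, assuming one SHA-256 hash per column and the 75% rule given as an example. The function names and the normalization of values (trimming whitespace and ignoring case) are assumptions; note that two fields that are both empty count as matching in this sketch.

```python
import hashlib
from itertools import combinations
from typing import Dict, List, Tuple


def column_hashes(row: Dict[str, str], schema: List[str]) -> List[str]:
    """Hash each column of a row separately, in schema order."""
    return [
        hashlib.sha256(row.get(f, "").strip().lower().encode("utf-8")).hexdigest()
        for f in schema
    ]


def find_duplicate_candidates(
    rows: List[Dict[str, str]], schema: List[str], threshold: float = 0.75
) -> List[Tuple[int, int]]:
    """Return index pairs of rows whose share of matching columns meets the threshold."""
    if not schema:
        return []
    hashed = [column_hashes(row, schema) for row in rows]
    candidates: List[Tuple[int, int]] = []
    # Compare every row with every other row, as in the description.
    for i, j in combinations(range(len(rows)), 2):
        matches = sum(a == b for a, b in zip(hashed[i], hashed[j]))
        if matches / len(schema) >= threshold:
            candidates.append((i, j))
    return candidates


schema = ["First Name", "Last Name", "Title", "Company Name"]
rows = [
    {"First Name": "Ann", "Last Name": "Lee", "Title": "Dr.", "Company Name": "Acme"},
    {"First Name": "ann", "Last Name": "Lee", "Title": "Dr.", "Company Name": "Acme Corp"},
    {"First Name": "Bob", "Last Name": "Ray", "Title": "Mr.", "Company Name": "Initech"},
]
candidates = find_duplicate_candidates(rows, schema)
```

  • The returned pairs are only candidates; consistent with the description, the actual entries would then be shown to the user, who decides whether they are in fact duplicates.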
  • Referring now to FIGURE 4, an illustrative user interface will be described which displays a recipient list created from two data sources. FIGURE 4 shows a user interface window 400 displaying a recipient list along with rows of mapped fields and associated data. The window 52 identifies the data sources used to create the recipient list. Links 54 and 56 are provided for adding a data source to the recipient list from a file, database, or from a contacts list. Link 58 is provided for creating a data source by typing a new list.
  • Referring now to FIGURE 5, an illustrative user interface will be described for allowing a user to edit a data source. FIGURE 5 shows a user interface window 500 displaying a data source table 60 along with rows of field data. An add new entry button 62 is provided for adding a new entry to the data source table 60 and a delete entry button 64 is provided for deleting a displayed entry in the data source table 60. It will be appreciated that the user interface window 500 may include mapped data or copied data from a recipient list which is linked to an original data source stored in another location such that changes will be propagated back to the original data source.
  • Referring now to FIGURE 6, an illustrative user interface will be described for allowing a user to remove duplicates found in a recipient list by the desktop publishing application 10 as discussed above with respect to FIGURE 2. FIGURE 6 shows a user interface 600 displaying a data source table 70 listing duplicate data entries. A column of check boxes 72 is provided for allowing a user to select entries to be removed from the recipient list.
  • Based on the foregoing, it should be appreciated that the various embodiments of the invention include a method, system, and computer-readable medium for merging data from multiple data sources for use in an electronic document. The above specification, examples and data provide a complete description of the manufacture and use of the composition of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.

Claims (19)

  1. A method for merging data from a plurality of data sources for use in an electronic document, comprising:
    receiving field names and field data from an initial data source;
    mapping field names from at least one subsequently added data source to the initial data source;
    building a recipient list schema based on the field names from the initial data source and the mapped field names from the at least one subsequently added data source;
    creating a recipient list according to the recipient list schema; and
    saving the recipient list to a file, wherein saving the recipient list to a file comprises saving a file which allows for the recipient list to be re-created.
  2. The method of claim 1, further comprising generating a temporary recipient list for receiving field data from the at least one subsequently added data source and the field data from the initial data source.
  3. The method of claim 1, wherein building a recipient list schema based on the field names from the initial data source and the mapped field names from the at least one subsequently added data source comprises:
    comparing the mapped field names from the at least one subsequently added data source to the field names from the initial data source; and
    if any of the mapped field names from the at least one subsequently added data source do not correspond to the field names from the initial data source, then adding the mapped field names.
  4. The method of claim 2, wherein creating the recipient list according to the recipient list schema comprises:
    adding at least one row of field data from the at least one new data source to the temporary recipient list to create a master data source, the master data source comprising a plurality of rows of master field data; and
    calculating a hash value for each of the plurality of rows of master field data in the master data source.
  5. The method of claim 1 further comprising modifying the recipient list to modify the field data in the initial data source and the field data in the at least one subsequently added data source.
  6. The method of claim 1 further comprising resolving duplicates between the initial data source and the at least one subsequently added data source in the recipient list.
  7. A system for merging data from a plurality of data sources for use in an electronic document, comprising a client computer operative to execute an application program for consuming data from the plurality of data sources, the application further operative to receive field names and field data from an initial data source, map field names from at least one subsequently added data source to the initial data source, build a recipient list schema based on the field names from the initial data source and the mapped field names from the at least one subsequently added data source, create a recipient list according to the recipient list schema, and to save the recipient list to a file.
  8. The system of claim 7, wherein the application program is further operative to generate a temporary recipient list for receiving field data from the at least one subsequently added data source and the field data from the initial data source.
  9. The system of claim 7, wherein building a recipient list schema based on the field names from the initial data source and the mapped field names from the at least one subsequently added data source comprises:
    comparing the mapped field names from the at least one subsequently added data source to the field names from the initial data source; and
    if any of the mapped field names from the at least one subsequently added data source do not correspond to the field names from the initial data source, then adding the mapped field names.
  10. The system of claim 8, wherein creating the recipient list according to the recipient list schema comprises:
    adding at least one row of field data from the at least one new data source to the temporary recipient list to create a master data source, the master data source comprising a plurality of rows of master field data; and
    calculating a hash value for each of the plurality of rows of master field data in the master data source.
  11. The system of claim 7, wherein the application program is further operative to modify the recipient list to modify the field data in the initial data source and the field data in the at least one subsequently added data source.
  12. The system of claim 7, wherein the application program is further operative to resolve duplicates between the initial data source and the at least one subsequently added data source in the recipient list.
  13. The system of claim 10, wherein the saved updated recipient list file comprises:
    a reference to the initial data source;
    a reference to the at least one subsequently added data source; and
    the hash value calculated for each of the plurality of rows of master field data in the master data source.
  14. A computer-readable medium having computer-executable instructions stored thereon which, when executed by a computer, will cause the computer to perform a method for merging data from a plurality of data sources for use in an electronic document, the method comprising:
    receiving field names and field data from an initial data source;
    mapping field names from at least one subsequently added data source to the initial data source;
    building a recipient list schema based on the field names from the initial data source and the mapped field names from the at least one subsequently added data source;
    creating a recipient list according to the recipient list schema; and
    saving the recipient list to a file, wherein saving the recipient list to a file comprises saving a file which allows for the recipient list to be re-created.
  15. The computer readable medium of claim 14, further comprising generating a temporary recipient list for receiving field data from the at least one subsequently added data source and the field data from the initial data source.
  16. The computer readable medium of claim 14, wherein building a recipient list schema based on the field names from the initial data source and the mapped field names from the at least one subsequently added data source comprises:
    comparing the mapped field names from the at least one subsequently added data source to the field names from the initial data source; and
    if any of the mapped field names from the at least one subsequently added data source do not correspond to the field names from the initial data source, then adding the mapped field names.
  17. The computer readable medium of claim 15, wherein updating the recipient list according to the recipient list schema comprises:
    adding at least one row of field data from the at least one new data source to the temporary recipient list to create a master data source, the master data source comprising a plurality of rows of master field data; and
    calculating a hash value for each of the plurality of rows of master field data in the master data source.
  18. The computer readable medium of claim 14 further comprising modifying the recipient list to modify the field data in the initial data source and the field data in the at least one subsequently added data source.
  19. The computer readable medium of claim 14 further comprising resolving duplicates between the initial data source and the at least one subsequently added data source in the recipient list.
EP05108491A 2004-09-30 2005-09-15 Merging data from multiple data sources for use in an electronic document Withdrawn EP1643377A3 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/955,433 US7739309B2 (en) 2004-09-30 2004-09-30 Method, system, and computer-readable medium for merging data from multiple data sources for use in an electronic document

Publications (2)

Publication Number Publication Date
EP1643377A2 (en) 2006-04-05
EP1643377A3 (en) 2009-01-14

Family

ID=35563470

Family Applications (1)

Application Number Title Priority Date Filing Date
EP05108491A Withdrawn EP1643377A3 (en) 2004-09-30 2005-09-15 Merging data from multiple data sources for use in an electronic document

Country Status (5)

Country Link
US (1) US7739309B2 (en)
EP (1) EP1643377A3 (en)
JP (1) JP4906292B2 (en)
KR (1) KR101130443B1 (en)
CN (1) CN100552677C (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5111395A (en) 1989-11-03 1992-05-05 Smith Rodney A Automated fund collection system including means to eliminate duplicate entries from a mailing list
US6748402B1 (en) 2001-04-02 2004-06-08 Bellsouth Intellectual Property Corporation System and method for converting and loading interactive pager address books

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6167405A (en) * 1998-04-27 2000-12-26 Bull Hn Information Systems Inc. Method and apparatus for automatically populating a data warehouse system
JP2000187668A (en) * 1998-12-22 2000-07-04 Hitachi Ltd Grouping method and overlap excluding method
US6424969B1 (en) * 1999-07-20 2002-07-23 Inmentia, Inc. System and method for organizing data
JP4552242B2 (en) * 1999-10-06 2010-09-29 株式会社日立製作所 Virtual table interface and query processing system and method using the interface
US7363581B2 (en) 2003-08-12 2008-04-22 Accenture Global Services Gmbh Presentation generator
JP2005078612A (en) * 2003-09-04 2005-03-24 Hitachi Ltd File sharing system, and file transfer method between file sharing systems

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HERNANDEZ M. A.: "SIGMOD RECORD", vol. 24, 1 June 1995, ACM, article "The merge/purge problem for large databases", pages: 127 - 138
LISA WALIN: "Extending Mail Merge Features", WORD 2002 TECHNICAL ARTICLES, April 2001 (2001-04-01)
MERLINMERGE SPEEDPRO SOFTWARE USE, 5 August 2004 (2004-08-05)

Also Published As

Publication number Publication date
US7739309B2 (en) 2010-06-15
US20060075323A1 (en) 2006-04-06
EP1643377A3 (en) 2009-01-14
CN100552677C (en) 2009-10-21
KR20060050395A (en) 2006-05-19
KR101130443B1 (en) 2012-03-28
JP4906292B2 (en) 2012-03-28
JP2006107466A (en) 2006-04-20
CN1755689A (en) 2006-04-05

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK YU

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK YU

17P Request for examination filed

Effective date: 20090608

17Q First examination report despatched

Effective date: 20090710

AKX Designation fees paid

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20121015