CN114168628A - Method and system for screening target data - Google Patents

Method and system for screening target data

Publication number
CN114168628A
Authority
CN
China
Prior art keywords
data
tables
field
screening
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110896757.3A
Other languages
Chinese (zh)
Inventor
金沃洙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Armiq Co Ltd
Original Assignee
Armiq Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Armiq Co Ltd
Publication of CN114168628A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • G06F16/24547Optimisations to support specific applications; Extensibility of optimisers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/282Hierarchical databases, e.g. IMS, LDAP data stores or Lotus Notes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a system for screening target data. The method for screening target data may comprise the following steps: grouping data contained in a database into objects, each serving as a minimum process unit, based on dependencies between tables; setting a screening criterion for target data based on at least one field of the data contained in the database; and extracting target data corresponding to the set screening criterion in consideration of the dependencies between the tables in the object.

Description

Method and system for screening target data
Technical Field
The invention relates to a method and a system for screening target data.
Background
Recently, mergers and acquisitions (M&A) between overseas and domestic enterprises have become frequent, and M&A has drawn attention as a keyword in corporate management.
Through M&A, enterprises can continue to pursue growth and development by using external resources, and can benefit from a shorter time to enter new markets, avoidance of conflict with existing market participants, increased market dominance, and the introduction of sophisticated technologies. Further, a growing number of enterprises pursue M&A in both directions at once, divesting existing marginal businesses to adapt to a changing environment while acquiring businesses to secure new growth engines.
In this case, the acquirer asks the seller to transfer the IT system so that the acquired organization can maintain business continuity, whereas the seller wants to ensure that data of its own organizations that are not being sold is not transferred along with the business assets. Accordingly, there is an increasing need for techniques to accurately screen, separate, transmit, and convert data for a desired purpose from the large volume of data within a system.
Documents of the prior art
Korean laid-open patent No. 10-2019-
Disclosure of Invention
The present invention provides a method and system for screening target data that can screen target data in consideration of dependencies between tables.
The invention also provides a method and a system for screening target data that can screen the required target data by modularly selecting, combining, excluding, and adding data in a database according to various data processing purposes such as archiving, cleansing, transmission, conversion, encryption, and de-identification.
The invention provides a method for screening target data, executed in a computer device comprising at least one processor, the method comprising the following steps: grouping, by the at least one processor, data contained in a database into objects, each serving as a minimum process unit, based on dependencies between tables in the database; setting, by the at least one processor, a screening criterion for the target data based on at least one field of the data contained in the database; and extracting, by the at least one processor, target data corresponding to the set screening criterion in consideration of the dependencies between the tables in the object.
According to an embodiment, in the step of setting a filtering criterion, the filtering criterion of the target data may be set based on at least one of the following fields: (1) a period field having a range of a specific time or period related to the data as a field value; (2) an organization field having a field value for at least one of a legal code, a factory, a business organization, and a department; and (3) other property fields having field values for at least one of region, country code, language, user, file type, customer group, and creator.
According to yet another embodiment, the method for filtering target data may further include classifying, by the at least one processor, the object according to at least one of a process-based application area, a data type, and a characteristic.
According to another embodiment, the method for screening target data may further include the step of setting, by the at least one processor, the purpose of use of the target data to any one of selective archiving, backup, cleansing, transmission, conversion, de-identification, and encryption.
According to still another embodiment, the screening criterion and the objects to be screened may vary according to the set purpose of use of the target data.
In still another embodiment, the step of extracting the target data may extract the target data by retrieving key values of a header table, which is the top-level table among the tables of the object, and sequentially extracting the data corresponding to the retrieved key values based on the dependencies between the tables in the object.
According to yet another embodiment, the method for screening target data may further include the step of setting, by the at least one processor, an additional criterion for data to be excluded from the target data or data to be added to the target data.
The invention also provides a computer program stored on a computer readable storage medium for, in combination with a computer apparatus, performing the above method in the computer apparatus.
The present invention also provides a computer-readable storage medium in which a computer program for executing the above method in a computer apparatus is stored.
The present invention also provides a computer device comprising at least one processor configured to execute computer-readable instructions, wherein the at least one processor: groups data contained in a database into objects, each serving as a minimum process unit, based on dependencies between tables; sets a screening criterion for target data based on at least one field of the data contained in the database and the classification of the objects; and extracts target data corresponding to the set screening criterion in consideration of the dependencies between the tables in the object.
According to the invention, target data can be screened in consideration of the dependencies between tables.
In addition, the required target data can be screened by modularly selecting, combining, excluding, and adding data in a database according to various data processing purposes such as archiving, cleansing, transmission, conversion, encryption, and de-identification.
Drawings
Fig. 1 is a diagram showing an example of a network environment of an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an example of a computer apparatus of an embodiment of the invention.
Fig. 3 is a flowchart illustrating an example of a method of screening target data according to an embodiment of the present invention.
FIG. 4 is a flow diagram illustrating an example of a process for defining and classifying objects in one embodiment of the invention.
Fig. 5 is a diagram showing an example of an object of an embodiment of the present invention.
Fig. 6 and 7 are diagrams showing examples of classifying objects according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating an example of multiple tables with dependencies in an embodiment of the invention.
FIG. 9 is a diagram illustrating an example of retrieving field usage for each domain in one embodiment of the invention.
Fig. 10 is a diagram for explaining an example of a header table and an entry table in a dependency relationship in an embodiment of the present invention.
Detailed Description
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. However, the present invention is not limited to these specific embodiments, and it should be understood that the present invention includes all modifications, equivalents, and alternatives included in the spirit and scope of the present invention. In the description for the respective drawings, similar reference numerals are used for similar structural elements.
The terms "first", "second", "A", "B", and the like may be used to describe various structural elements, but the structural elements are not limited by these terms. The terms are used only to distinguish one structural element from another. For example, a first structural element may be termed a second structural element, and similarly, a second structural element may be termed a first structural element, without departing from the scope of the present invention. The term "and/or" includes any combination of the recited plurality of related items or any one of the plurality of related items.
When a structural element is described as being "connected" or "coupled" to another structural element, the structural element may be directly connected or coupled to the other structural element, but it is understood that other structural elements may be present therebetween. On the contrary, when a description is made that one component is "directly connected" or "directly coupled" to another component, it is to be understood that no other component is present therebetween.
The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. Unless a different meaning is explicitly stated in context, an expression in the singular includes an expression in the plural. In the present invention, terms such as "including" or "having" are intended to specify the presence of stated features, numerals, steps, actions, structural elements, components or combinations thereof, and are not to be construed as precluding the presence or addition of one or more other features or numerals, steps, actions, structural elements, components or combinations thereof in advance.
Unless otherwise defined, all terms including technical and scientific terms have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms as defined in dictionaries should be interpreted as having meanings consistent with context in the relevant art and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The system for screening target data according to various embodiments of the present invention may be implemented by at least one computer device, and the method for screening target data according to various embodiments of the present invention may be performed by at least one computer device included in the system for screening target data. The computer program according to embodiments of the present invention may be installed and run on a computer device, and the computer device may perform the method of screening target data according to embodiments of the present invention under the control of the running computer program. The computer program may be stored in a computer-readable storage medium so as to be combined with a computer device and cause the computer device to perform the method of screening target data.
Fig. 1 is a diagram showing an example of a network environment of an embodiment of the present invention. The network environment of FIG. 1 illustrates an example including a plurality of electronic devices 110, 120, 130, 140, a plurality of servers 150, 160, and a network 170. FIG. 1 is an example for describing the invention, and the number of electronic devices or servers is not limited to that shown in FIG. 1. The network environment of FIG. 1 is only one example of an environment applicable to the present embodiments, and the applicable environments are not limited to it.
The plurality of electronic devices 110, 120, 130, 140 may be fixed terminals or mobile terminals implemented as computer systems. For example, the plurality of electronic devices 110, 120, 130, 140 may be smartphones, mobile phones, navigation devices, computers, notebook computers, digital broadcast terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), tablet computers, and the like. As an example, FIG. 1 illustrates the shape of a smartphone as an example of the electronic device 110, but in embodiments of the present invention, the electronic device 110 may refer to any of various physical computer devices capable of communicating with the other electronic devices 120, 130, 140 and/or the servers 150, 160 over the network 170 using a wireless or wired communication method.
The communication method is not limited, and may include not only communication methods that use a communication network that the network 170 may include (for example, a mobile communication network, the wired Internet, the wireless Internet, or a broadcast network) but also short-range wireless communication between devices. For example, the network 170 may include any one or more of a personal area network (PAN), a local area network (LAN), a campus area network (CAN), a metropolitan area network (MAN), a wide area network (WAN), a broadband network (BBN), and the Internet. Also, the network 170 may include any one or more network topologies, including but not limited to a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network.
Servers 150, 160 may each be implemented as a computer device or multiple computer devices that communicate with multiple electronic devices 110, 120, 130, 140 over network 170 to provide instructions, code, files, content, services, etc. For example, the server 150 may be a system that provides services (e.g., archiving services, file distribution services, mapping services, content providing services, group call services (or voice conferencing services), information services, mail services, social networking services, translation services, financial services, payment services, retrieval services, etc.) to a plurality of electronic devices 110, 120, 130, 140 accessed over a network 170.
FIG. 2 is a block diagram illustrating an example of a computer apparatus of an embodiment of the invention. The previously described plurality of electronic devices 110, 120, 130, 140 or plurality of servers 150, 160 may be implemented by the computer apparatus 200 shown in fig. 2, respectively.
As shown in FIG. 2, the computer device 200 may include a memory 210, a processor 220, a communication interface 230, and an input/output interface 240. The memory 210, as a computer-readable storage medium, may include random access memory (RAM), read-only memory (ROM), and a non-volatile mass storage device such as a disk drive. A non-volatile mass storage device such as a ROM or disk drive may also be included in the computer device 200 as a separate permanent storage device distinct from the memory 210. The memory 210 may store an operating system and at least one program code. These software components may be loaded into the memory 210 from a computer-readable storage medium separate from the memory 210. Such a separate computer-readable storage medium may include a floppy drive, a disk, a tape, a DVD/CD-ROM drive, a memory card, and the like. In another embodiment, the software components may be loaded into the memory 210 through the communication interface 230 rather than from a computer-readable storage medium. For example, the software components may be loaded into the memory 210 of the computer device 200 based on a computer program installed from files received over the network 170.
Processor 220 may be configured to process computer instructions by performing basic computational, logical, and input/output operations. The instructions may be provided to processor 220 by memory 210 or communication interface 230. For example, the processor 220 may be configured to execute instructions received in accordance with program code stored in a storage device, such as the memory 210.
The communication interface 230 can provide a function of allowing the computer device 200 to communicate with other devices (for example, a plurality of storage devices described above) via the network 170. For example, the processor 220 of the computer device 200 may transmit requests, instructions, data, files, etc. generated from program codes stored in a storage device such as the memory 210 to other devices via the network 170 under the control of the communication interface 230. Conversely, signals or instructions, data, files, etc. from other devices may be provided to computer device 200 through communication interface 230 of computer device 200 over network 170. Signals or instructions, data, etc. received through the communication interface 230 may be transferred to the processor 220 or the memory 210, and files, etc. may be stored in a storage medium (the above-described permanent storage device) that the computer device 200 may further include.
The input/output interface 240 may be a unit for interfacing with the input/output device 250. For example, the input device may include a microphone, a keyboard, a mouse, and the like, and the output device may include a display, a speaker, and the like. As another example, the input/output interface 240 may be a unit for interfacing with a device in which input and output functions are integrated, such as a touch panel. The input/output device 250 may also be integrated with the computer device 200 as a single device.
In other embodiments, the computer device 200 may include more structural elements than those shown in FIG. 2. However, most conventional components need not be explicitly illustrated. For example, the computer device 200 may be implemented to include at least a portion of the input/output device 250 described above, or may further include other structural elements such as a transceiver and a database.
Fig. 3 is a flowchart illustrating an example of a method of screening target data according to an embodiment of the present invention. The method for screening target data according to the present embodiment may be performed by the computer device 200 described with reference to FIG. 2. In this case, the processor 220 of the computer device 200 may execute instructions based on the code of the operating system and at least one program code contained in the memory 210. The processor 220 can control the computer device 200 to execute the steps (steps 310 to 330) included in the method shown in FIG. 3 according to the instructions provided by the program code stored in the computer device 200.
In step 310, the computer device 200 may group the data contained in the database into objects, each serving as a minimum process unit, based on dependencies between tables in the database. For example, the computer device 200 may group a plurality of tables having dependencies in the database into the same object, and a plurality of objects may be defined for the database. A dependency between tables may mean that two tables each include data identified by the same key value. In this case, among the tables having a dependency, the data identified by the same key value may include at least one field that differs from table to table.
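The grouping in step 310 can be sketched as finding connected components of a dependency graph. This is only an illustration under stated assumptions: the function name `group_tables_into_objects`, the table names, and the representation of dependencies as pairs of table names sharing a key are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def group_tables_into_objects(tables, dependencies):
    """Group tables into objects (connected components of the dependency graph).

    `tables` is an iterable of table names; `dependencies` is an iterable of
    (table_a, table_b) pairs meaning both tables hold data identified by the
    same key value. Both inputs are assumptions made for this illustration.
    """
    parent = {t: t for t in tables}

    def find(t):
        # Follow parent links to the representative, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in dependencies:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for t in parent:
        groups[find(t)].append(t)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())


# Hypothetical tables: three order tables and two customer tables depend on
# each other; the login history table stands alone.
objects = group_tables_into_objects(
    ["order_header", "order_item", "order_schedule",
     "customer", "customer_credit", "login_history"],
    [("order_header", "order_item"),
     ("order_item", "order_schedule"),
     ("customer", "customer_credit")],
)
print(objects)
```

A table with no dependency forms an object on its own, which matches the statement that an object may consist of one or more tables.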
In step 320, the computer device 200 may set a screening criterion for the target data based on at least one field of the data contained in the database. Here, a field may be information identifying the type of a field value contained in a data record of a table. For example, the computer device 200 may set the screening criterion for the target data based on at least one of the following fields: (1) a period field whose field value is a specific time or a range of periods related to the data; (2) an organization field whose field value is at least one of a legal code, a factory, a business organization, and a department; and (3) another characteristic field whose field value is at least one of a region, a country code, a language, a user, a file type, a customer group, and a creator. The period field may include, for example, a creation date, a change date, a fiscal year, a voucher date, a production shipment date, and the like of the relevant data.
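One way to picture such a screening criterion is as a small record combining a period range with sets of allowed organization and characteristic values. The class `ScreeningCriterion`, its attribute names, and the sample records below are hypothetical and only illustrate the three kinds of fields listed above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ScreeningCriterion:
    """Hypothetical screening criterion built from the three kinds of fields."""
    period_field: Optional[str] = None            # e.g. "created_on"
    period_start: Optional[date] = None
    period_end: Optional[date] = None
    organization: dict = field(default_factory=dict)  # field name -> allowed values
    other: dict = field(default_factory=dict)         # field name -> allowed values

    def matches(self, record):
        # Period condition: the record's date must fall within [start, end].
        if self.period_field is not None:
            value = record.get(self.period_field)
            if value is None or not (self.period_start <= value <= self.period_end):
                return False
        # Organization and other characteristic conditions: allowed value sets.
        for name, allowed in {**self.organization, **self.other}.items():
            if record.get(name) not in allowed:
                return False
        return True


criterion = ScreeningCriterion(
    period_field="created_on",
    period_start=date(2020, 1, 1),
    period_end=date(2020, 2, 29),
    organization={"company_code": {"1000"}},
    other={"country": {"KR"}},
)
records = [
    {"doc_no": 1, "created_on": date(2020, 1, 15), "company_code": "1000", "country": "KR"},
    {"doc_no": 2, "created_on": date(2020, 3, 1), "company_code": "1000", "country": "KR"},
    {"doc_no": 3, "created_on": date(2020, 2, 10), "company_code": "2000", "country": "KR"},
]
selected = [r["doc_no"] for r in records if criterion.matches(r)]
print(selected)
```

Only the first record satisfies the period, organization, and country conditions together; the second falls outside the period and the third belongs to a different organization.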
In step 330, the computer device 200 may extract the target data corresponding to the set screening criterion in consideration of the dependencies between the tables in the object. For example, the computer device 200 may retrieve the key values of the header table at the top level of the object and then extract, in order, the data corresponding to the retrieved key values according to the dependencies between the tables in the object, thereby extracting the target data. The method of extracting the target data is described in further detail with reference to FIG. 10.
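The header-first extraction of step 330 can be sketched with an in-memory SQLite database. The schema, the helper `extract_object`, and its parameters are assumptions for illustration only; the point is that the dependent table has no organization or date field of its own, so its rows are reached through the key values found in the header table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_header (order_no INTEGER PRIMARY KEY, company_code TEXT, created_on TEXT);
    CREATE TABLE order_item (order_no INTEGER, item_no INTEGER, material TEXT);
    INSERT INTO order_header VALUES
        (1, '1000', '2020-01-15'), (2, '2000', '2020-01-20'), (3, '1000', '2021-05-02');
    INSERT INTO order_item VALUES
        (1, 10, 'M-01'), (1, 20, 'M-02'), (2, 10, 'M-03'), (3, 10, 'M-01');
""")


def extract_object(conn, header_table, key, dependents, where, params):
    """Retrieve header keys matching the criterion, then pull dependent rows by key.

    Table and column names are interpolated into the SQL text, so this sketch
    assumes they come from trusted metadata, never from user input.
    """
    keys = [row[0] for row in conn.execute(
        f"SELECT {key} FROM {header_table} WHERE {where} ORDER BY {key}", params)]
    result = {header_table: keys}
    placeholders = ",".join("?" * len(keys))
    for table in dependents:
        if not keys:
            result[table] = []
            continue
        result[table] = conn.execute(
            f"SELECT * FROM {table} WHERE {key} IN ({placeholders}) ORDER BY rowid",
            keys,
        ).fetchall()
    return result


extracted = extract_object(
    conn, "order_header", "order_no", ["order_item"],
    "company_code = ? AND created_on BETWEEN ? AND ?",
    ("1000", "2020-01-01", "2020-12-31"),
)
print(extracted)
```

Filtering `order_item` directly would not be possible here because it carries neither field of the criterion, which is why the dependency on the header table is followed instead.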
According to an embodiment, before setting the screening criterion, the computer device 200 may set the purpose of use of the target data to any one of selective archiving, backup, cleansing, transmission, conversion, de-identification, and encryption. In this case, the screening criterion and the objects to be screened may vary according to the set purpose of use. For example, in step 320, the computer device 200 may set the screening criterion for the target data further based on the purpose of use of the target data.
The computer device 200 may also classify the objects according to at least one of a process-based application area, a data type, and a characteristic. This classification can serve as a reference for selecting the objects from which target data is screened according to the purpose of use of the target data. The classification of objects is explained in further detail below with reference to FIG. 4.
FIG. 4 is a flow diagram illustrating an example of a process for defining and classifying objects in one embodiment of the invention. According to an embodiment, steps 410 to 450 as shown in fig. 4 may be included in step 310 shown by fig. 3.
In step 410, the computer device 200 may analyze the tables in the database.
Step 410 may serve as the analysis work for grouping data into objects and classifying the objects. For example, the computer device 200 may list all the tables in the database and, by analyzing the size and number of records of each table, exclude tables containing no data from the analysis target. Depending on the embodiment, a table containing no actual data may still be selected as a screening target while being excluded from the analysis target.
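As a rough illustration of this table analysis, the following sketch lists the tables of a SQLite database and keeps only those that contain rows. The function `non_empty_tables` and the demo schema are hypothetical; a production database would expose its catalog through its own system tables rather than `sqlite_master`.

```python
import sqlite3


def non_empty_tables(conn):
    """Return {table_name: row_count} for tables that contain at least one row."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    counts = {}
    for name in names:
        # Names come from the catalog itself, so quoting them is sufficient here.
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()
        if count > 0:
            counts[name] = count
    return counts


demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE sales_order (order_no INTEGER, created_on TEXT);
    CREATE TABLE interface_log (log_id INTEGER, message TEXT);
    INSERT INTO sales_order VALUES (1, '2020-01-15'), (2, '2020-02-03');
""")
analysis_targets = non_empty_tables(demo)
print(analysis_targets)
```

The empty `interface_log` table drops out of the analysis target, although nothing prevents it from still being listed as a screening target elsewhere.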
The computer device 200 may also analyze the table fields. As an example, a table may contain one or more fields (a period field, an organization field, and/or another characteristic field) to be considered in setting the screening criterion, or none at all. Table field analysis determines which fields a table contains; where table A has three date fields as shown in Table 1, the computer device 200 may select all three fields for analysis.
TABLE 1
[Table 1 appears as an image in the original publication.]
The computer device 200 may also analyze the distribution of field data. For example, after listing the tables and fields selected as analysis targets, the computer device 200 may analyze how the actual data values are distributed within those tables and fields. For example, in table A of Table 1, when the data screening query specifies a creation date (Create Date) of 2020.01 to 2020.02 and an order date (Order Date) of 2020, it can be seen that only one record is screened, namely the record with the field value "2" in the field "file number (doc. ...)". The computer device 200 may also analyze how many entries of a field have or lack a field value, or the missing-value ratio, i.e., the proportion of entries lacking a value. When there are a plurality of fields having the same characteristic, the missing-value ratio may be considered in order to improve the screening accuracy of the target data. For example, when screening target data by organization code as in Table 2 below, it is preferable to select the field "organization field A (Org. Field A)" of table B.
TABLE 2
[Table 2 appears as an image in the original publication.]
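The role of the missing-value ratio can be illustrated with a short sketch. The sample rows below are invented for illustration and are not the contents of Table 2; the functions `missing_ratio` and `pick_field` are likewise hypothetical names.

```python
def missing_ratio(records, field_name):
    """Fraction of records whose value for `field_name` is absent or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field_name) in (None, ""))
    return missing / len(records)


def pick_field(records, candidates):
    """Choose the candidate field with the lowest missing-value ratio."""
    return min(candidates, key=lambda name: missing_ratio(records, name))


# Invented sample data: two organization fields with the same characteristic,
# one of which is mostly unfilled.
table_b = [
    {"doc_no": 1, "org_field_a": "1000", "org_field_b": None},
    {"doc_no": 2, "org_field_a": "1000", "org_field_b": "1000"},
    {"doc_no": 3, "org_field_a": "2000", "org_field_b": None},
    {"doc_no": 4, "org_field_a": None, "org_field_b": None},
]
chosen = pick_field(table_b, ["org_field_a", "org_field_b"])
print(chosen, missing_ratio(table_b, "org_field_a"), missing_ratio(table_b, "org_field_b"))
```

Screening on the sparsely filled field would silently miss records whose organization is recorded only in the other field, which is the accuracy concern the missing-value ratio addresses.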
In addition, the computer device 200 may analyze the tables by retrieving a field where-used list (WUL) for each domain. The method of retrieving field usage for each domain is explained in detail below with reference to FIG. 9.
In step 420, the computer device 200 may define objects. For example, the computer device 200 may define a plurality of tables grouped based on the dependencies between the tables as one object. Such an object, as a minimum process unit, may consist of one or more tables. As a more specific example, the tables of a minimum process unit may include tables generated for material master data, customer master data, price conditions, customer credit, sales orders, loan requests, financial instruments, account balances, profitability analysis, interface logs, user login history, and the like, and the minimum process unit may be defined in various forms according to the settings of the enterprise that maintains and manages the database.
In step 430, the computer device 200 may classify the objects by module. Here, a module may mean an application area for each process, and such application areas may be defined in various forms, such as production, sales, materials, financial accounting, management accounting, infrastructure, communication, and industry.
In step 440, the computer device 200 may classify the objects by type. The types into which objects are classified may include master data, transaction data, configuration data, control data, system data, and the like. For example, master data may mean data that serves as a reference when other data is generated, and transaction data may mean data that is generated continuously according to time, organization, and the like. Objects may be classified into the various defined types according to the type of data in the tables contained in the object.
In step 450, the computer device 200 may classify the objects by characteristic. For example, the object characteristics may include a document (Document) indicating continuously generated certificates, orders, and the like; a history (History) recording changes such as the current production status (Status) of a product or a saved file; and a summary (Summary) recording, for example, the total transaction amount of a customer over a certain period.
Fig. 5 is a diagram showing an example of an object of an embodiment of the present invention. Fig. 6 and 7 are diagrams showing examples of classifying objects according to an embodiment of the present invention.
Fig. 5 shows Object 1 (Object #1), which includes Table 1 (Table #1), Table 2 (Table #2), and Table 3 (Table #3); Object 2 (Object #2), which includes Table 4 (Table #4) and Table 5 (Table #5); and Object 3 (Object #3), which includes Table 6 (Table #6). It can be seen that Tables 1, 2, and 3 have dependencies on one another, as do Tables 4 and 5.
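One way to picture the grouping of Fig. 5 is as finding connected components over the table dependencies. The following Python sketch uses a small union-find structure for this; the technique and the exact dependency pairs are assumptions made for illustration, since the publication only states which tables depend on one another.

```python
def group_tables(tables, dependencies):
    """Group tables into objects: tables linked by a dependency share an object."""
    parent = {t: t for t in tables}

    def find(t):
        # Follow parent links to the representative of t's group.
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # shorten the path while walking
            t = parent[t]
        return t

    for a, b in dependencies:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for t in tables:
        groups.setdefault(find(t), []).append(t)
    return sorted(groups.values())


tables = ["Table 1", "Table 2", "Table 3", "Table 4", "Table 5", "Table 6"]
# Hypothetical dependency pairs consistent with the description of Fig. 5.
dependencies = [("Table 1", "Table 2"), ("Table 2", "Table 3"), ("Table 4", "Table 5")]

objects = group_tables(tables, dependencies)
```

With these inputs the sketch yields three groups that correspond to Object 1, Object 2, and Object 3, with Table 6 forming an object on its own.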
Fig. 6 shows that Object 1 and Object 2 are classified into Application Area 1 (Application Area #1) and that Object 3 is classified into Application Area 2 (Application Area #2). As described above, an application area may correspond to a module.
Fig. 7 illustrates an example of a process of classifying objects by module, type, and characteristic. As described with reference to Fig. 4, application areas may be defined in various forms, such as production, sales, materials, financial accounting, management accounting, infrastructure, communication, and industry. The embodiment of Fig. 7 shows that an object may be classified into one of the application areas (Application Areas) of production (Production), sales (Sales), finance (Finance), system (System), and industry (Industry). Likewise, as described with reference to Fig. 4, the types used to classify objects may include master data, transaction data, configuration data, control data, system data, and the like. The embodiment of Fig. 7 shows that an object may be classified into one of a transaction (Transaction) data type, a master (Master) data type, a customization (Customization) data type, a temporary (Temporary) data type, a control (Control) data type, and a system (System) data type. The characteristics of objects described with reference to Fig. 4, namely documents such as continuously generated certificates and orders, histories recording changes such as the current production status of a product or a saved file, and summaries such as the total transaction amount of a customer over a certain period, are also reflected in the embodiment of Fig. 7. In addition, the embodiment of Fig. 7 shows that objects with the summary characteristic may be subdivided by summary classification (Summary Classification) into a period summary (Period Sum.), an organization summary (Org. Sum.), and a key summary (Key Sum.), and that period summaries may be further subdivided by summary frequency (Summary Frequency) into a day/week summary (Day/Week Sum.), a month summary (Month Sum.), and a year summary (Year Sum.).
Thus, objects may be classified in various ways according to the settings of the enterprise that maintains and manages the database.
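The three classification axes described for Fig. 7 can be sketched as a small data model. The enumeration members below are taken from the categories named in the text; the class names, the example objects, and the `select` helper are hypothetical and only illustrate how a classification might later narrow down the objects to be screened.

```python
from dataclasses import dataclass
from enum import Enum


class Module(Enum):
    PRODUCTION = "Production"
    SALES = "Sales"
    FINANCE = "Finance"
    SYSTEM = "System"
    INDUSTRY = "Industry"


class DataType(Enum):
    TRANSACTION = "Transaction"
    MASTER = "Master"
    CONTROL = "Control"
    SYSTEM = "System"


class Characteristic(Enum):
    DOCUMENT = "Document"
    HISTORY = "History"
    SUMMARY = "Summary"


@dataclass(frozen=True)
class ClassifiedObject:
    name: str
    tables: tuple
    module: Module
    data_type: DataType
    characteristic: Characteristic


def select(objects, **criteria):
    """Return names of objects whose attributes equal every given criterion."""
    return [o.name for o in objects
            if all(getattr(o, attr) == value for attr, value in criteria.items())]


# Hypothetical objects; the table names are placeholders.
objects = [
    ClassifiedObject("sales_order", ("T1", "T2"), Module.SALES,
                     DataType.TRANSACTION, Characteristic.DOCUMENT),
    ClassifiedObject("customer", ("T3",), Module.SALES,
                     DataType.MASTER, Characteristic.HISTORY),
    ClassifiedObject("account_balance", ("T4",), Module.FINANCE,
                     DataType.TRANSACTION, Characteristic.SUMMARY),
]

sales_transactions = select(objects, module=Module.SALES, data_type=DataType.TRANSACTION)
```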
Fig. 8 illustrates an example of a plurality of tables having dependencies in an embodiment of the present invention. An object may mean a bundle of tables forming a minimum flow processing unit, defined in order to ensure the consistency and integrity of data. That is, the tables of an object must be processed together in a given flow. The embodiment of Fig. 8 shows four tables with dependencies: Table A, Table B, Table C, and Table D. As described above, a dependency between tables may mean that the two tables each include data identified by the same key value. Tables A, B, C, and D of Fig. 8 each include data identified by the same key value (file number 1). Here, the period (Date) field is in Table A, the organization (Plant) field is in Table B, the status (Status) field is in Table C, and the region (Region) field is in Table D. When the four tables are grouped into one object, the target data can be extracted using the field values of the period, organization, status, and region. In contrast, when Tables A, B, C, and D are not grouped into an object and the target data is screened by the field value of the period, the data of Tables B, C, and D is missed and data integrity suffers.
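The effect of grouping in Fig. 8 can be sketched as a two-step lookup: resolve the criterion to key values in whichever table carries the field, then collect rows with those keys from every table of the object. The table contents, the key name `doc_no`, and the function name below are hypothetical.

```python
# Hypothetical contents of the four tables of Fig. 8, all keyed by "doc_no".
tables = {
    "A": [{"doc_no": 1, "date": "2020-01-15"}, {"doc_no": 2, "date": "2021-03-01"}],
    "B": [{"doc_no": 1, "plant": "P100"}, {"doc_no": 2, "plant": "P200"}],
    "C": [{"doc_no": 1, "status": "Closed"}, {"doc_no": 2, "status": "Open"}],
    "D": [{"doc_no": 1, "region": "KR"}, {"doc_no": 2, "region": "US"}],
}


def extract_object(tables, field, predicate, key="doc_no"):
    """Screen one object: find matching keys, then take those keys from all tables."""
    # Step 1: key values of rows that satisfy the criterion, wherever the field lives.
    keys = {
        row[key]
        for rows in tables.values()
        for row in rows
        if field in row and predicate(row[field])
    }
    # Step 2: rows with those keys from every table in the object.
    return {name: [r for r in rows if r[key] in keys] for name, rows in tables.items()}


result = extract_object(tables, "date", lambda d: d.startswith("2020"))
```

Screening by the period alone against Table A would return only the Table A row; here the rows of Tables B, C, and D with the same key are carried along.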
Fig. 9 illustrates an example of retrieving a where-used list of fields by domain in an embodiment of the present invention. Domains may be used to select tables/fields. A domain (Domain) may mean the technical attributes of a field, such as its type (Type) and number of bits. For example, in Table 2, the where-used list of fields by domain may be searched in order to find the tables/fields having the same domain as the organization field "organization field A" of Table B. This helps maintain consistency and prevents tables from being missed when screening target data. In the embodiment of Fig. 9, field B of Table B, field C of Table C, field D of Table D, and field E of Table E, which have the same domain "ZORG_A" as field A of Table A, may be retrieved and stored in association with one another. The stored data may then be used to screen target data from the tables and/or fields having the same domain.
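A where-used list by domain can be thought of as an inverted index from a domain name to the table/field pairs that use it. The sketch below assumes a data dictionary mapping each (table, field) pair to its domain; the "ZORG_A" entries follow the example of Fig. 9, while "Table F", "Field F", and the "ZDATE" domain are hypothetical additions to show a non-matching entry.

```python
from collections import defaultdict

# Assumed data dictionary: (table, field) -> domain name.
field_domains = {
    ("Table A", "Field A"): "ZORG_A",
    ("Table B", "Field B"): "ZORG_A",
    ("Table C", "Field C"): "ZORG_A",
    ("Table D", "Field D"): "ZORG_A",
    ("Table E", "Field E"): "ZORG_A",
    ("Table F", "Field F"): "ZDATE",  # hypothetical field with a different domain
}


def build_where_used_list(field_domains):
    """Invert the data dictionary: domain -> list of (table, field) pairs."""
    wul = defaultdict(list)
    for (table, field), domain in field_domains.items():
        wul[domain].append((table, field))
    return dict(wul)


wul = build_where_used_list(field_domains)
# All tables/fields that share the domain of Field A in Table A.
same_domain = wul[field_domains[("Table A", "Field A")]]
```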
Fig. 10 illustrates an example of a header table and item tables in a dependency relationship in an embodiment of the present invention. A header table may mean a table whose key values are unique and not null and which includes all key values present in the lower-level tables. Item tables may mean all of the remaining lower-level tables, other than the header table, that are grouped into the same object. In the embodiment of Fig. 10, Table 1 may serve as the header table because it holds the key values 1, 2, 3, and 4 uniquely and without null values. In contrast, Table 2 does not include the key values 3 and 4 and does not hold the key values 1 and 2 uniquely in the field "file number", and Table 3 does not include the key value "3" in the field "file number", so each of them is an item table rather than a header table. Table 4 could be a header table in a separate object, but in the embodiment of Fig. 10 it is assumed to belong to the same object as Tables 1, 2, and 3.
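The header table conditions stated above (unique keys, no null keys, and coverage of all keys in the lower-level tables) can be checked mechanically. The following sketch is a minimal illustration; the row contents are hypothetical values chosen to match the description of Fig. 10, and the key name `doc_no` is an assumption.

```python
def is_header_table(candidate, others, tables, key="doc_no"):
    """True if `candidate` has unique, non-null keys covering all keys of `others`."""
    keys = [row.get(key) for row in tables[candidate]]
    if any(k is None for k in keys):
        return False  # a null key disqualifies the table
    if len(keys) != len(set(keys)):
        return False  # duplicate keys disqualify the table
    lower_keys = {
        row[key] for name in others for row in tables[name] if row.get(key) is not None
    }
    return lower_keys <= set(keys)  # must contain every key of the other tables


# Hypothetical rows consistent with the description of Fig. 10.
tables = {
    "Table 1": [{"doc_no": n} for n in (1, 2, 3, 4)],
    "Table 2": [{"doc_no": n} for n in (1, 1, 2, 2)],  # 3 and 4 absent, 1 and 2 repeated
    "Table 3": [{"doc_no": n} for n in (1, 2, 4)],     # 3 absent
}

header_flags = {
    name: is_header_table(name, [other for other in tables if other != name], tables)
    for name in tables
}
```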
Such a table relationship in the object of fig. 10 may be as shown in table 3 below.
TABLE 3
Figure BDA0003198170530000131
Table 3 defines the relationships between the tables and distinguishes the header table from the item tables. Here, "sequence 1" may mean the top-level header table in the object, which may be represented as an individual header table in a header table-item table relationship. As an example, assume that the screening criterion for the target data is to screen data whose field "organization (Plant)" has the field value "P100". In this case, the field "organization (Plant)" exists in Table 2, and the key value that satisfies the field value "P100" in the field "organization (Plant)" is the field value "1" of the field "file number (doc.no.)". To extract, with integrity, the target data whose field "file number (doc.no.)" has the field value "1", data is extracted sequentially starting from the header table. For example, the data having the field value "1" in the field "file number (doc.no.)" may be extracted from Table 1, Table 2, and Table 3 in that order. Since, in Table 3, the data having the field value "1" in the field "file number (doc.no.)" includes data having the field value "R01" in the field "reference number (ref.no.)", the data having the field value "R01" in the field "reference number (ref.no.)" in Table 4 is extracted last.
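The sequential extraction described above can be sketched as follows. The rows, the field names `doc_no`, `plant`, and `ref_no`, and the `text` column are hypothetical; only the order of extraction (header table first, then item tables, then the table reached through the reference number) follows the description.

```python
# Hypothetical rows for the four tables of the object.
tables = {
    "Table 1": [{"doc_no": n} for n in (1, 2, 3, 4)],  # header table
    "Table 2": [{"doc_no": 1, "plant": "P100"}, {"doc_no": 2, "plant": "P200"}],
    "Table 3": [{"doc_no": 1, "ref_no": "R01"}, {"doc_no": 2, "ref_no": "R02"}],
    "Table 4": [{"ref_no": "R01", "text": "a"}, {"ref_no": "R02", "text": "b"}],
}


def extract_by_plant(tables, plant):
    """Resolve the plant criterion to file numbers, then extract table by table."""
    # 1. The Plant field lives in Table 2; collect the matching file numbers.
    doc_nos = {r["doc_no"] for r in tables["Table 2"] if r["plant"] == plant}
    out = {}
    # 2. Extract from the header table first, then from the item tables.
    for name in ("Table 1", "Table 2", "Table 3"):
        out[name] = [r for r in tables[name] if r["doc_no"] in doc_nos]
    # 3. Follow the reference numbers found in Table 3 into Table 4.
    ref_nos = {r["ref_no"] for r in out["Table 3"]}
    out["Table 4"] = [r for r in tables["Table 4"] if r["ref_no"] in ref_nos]
    return out


result = extract_by_plant(tables, "P100")
```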
Meanwhile, when the computer device 200 sets the screening criterion in step 320 of Fig. 3, it may also set an additional criterion for data to be excluded from the target data or data to be added to the target data. For example, the computer device 200 may exclude some data from the screening even if that data satisfies the screening criterion. Conversely, it may add some data to the screening even if that data does not satisfy the screening criterion. As a more specific example, the computer device 200 may exclude incomplete data, such as uncleared items (In-Process), from the screening of the target data, or may exclude specific raw data from the screening of the target data. As still another example, even when the computer device 200 calculates the range of the target data on a period basis, data of uncleared items may be added regardless of the period.
An uncleared item is one of the characteristics of data and may mean data that is still being processed. For example, an uncleared item may mean an item for which accounts receivable (AR) or accounts payable (AP) have been recorded but collection or payment has not yet been completed, or a product that is still in production and counts as work in process. Adding or excluding uncleared items may mean that such a specific state is checked and added to or excluded from the screening conditions. As an example, in Fig. 10, when data having the field value "P100" in the field "organization (Plant)" in Table 3 is screened and data currently in the "uncleared item" state is additionally included, the field values "1" and "2" of the field "file number (doc.no.)" may become the target key values.
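The additional criterion can be sketched as two optional switches layered on top of a base criterion. In the following illustration the base criterion is a period, and the `status` values, field names, and sample rows are hypothetical.

```python
def target_keys(rows, in_period, exclude_uncleared=False, include_uncleared=False):
    """Key values selected by a period criterion plus an optional additional criterion."""
    keys = set()
    for row in rows:
        matches = in_period(row["date"])
        uncleared = row["status"] == "uncleared"
        if matches and exclude_uncleared and uncleared:
            continue  # satisfies the criterion but is explicitly excluded
        if matches or (include_uncleared and uncleared):
            keys.add(row["doc_no"])  # matched, or added regardless of the period
    return keys


def in_2020(date):
    """Base screening criterion: the date falls in 2020."""
    return date.startswith("2020")


# Hypothetical rows carrying a date and a processing status.
rows = [
    {"doc_no": 1, "date": "2020-01-10", "status": "cleared"},
    {"doc_no": 2, "date": "2021-05-02", "status": "uncleared"},
    {"doc_no": 3, "date": "2020-02-20", "status": "uncleared"},
    {"doc_no": 4, "date": "2021-06-30", "status": "cleared"},
]

base = target_keys(rows, in_2020)
without_uncleared = target_keys(rows, in_2020, exclude_uncleared=True)
with_uncleared = target_keys(rows, in_2020, include_uncleared=True)
```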
On the other hand, if every individual table satisfies the screening criterion, the data may be extracted without regard to the dependencies. Here, a table satisfying the screening criterion may mean that the table contains a field whose value can serve as the screening criterion. More specifically, when the screening criterion is a date and every individual table has a date field, the dependencies of the object may be ignored and the data for the relevant date may be extracted from each table separately. As another example, when the screening criterion is an organization and every table has an organization field, the dependencies of the object may be ignored and the data of the relevant organization may be extracted from each table separately.
The extracted target data may be saved to various media. For example, it may be saved to a separate table or to a separate file. Depending on the embodiment, the extracted target data may be transmitted directly to another device, or converted, without being saved. The target data may also be saved or transmitted in a state compressed by a lossless compression algorithm, for example ZIP, CTW, LZ77, LZW, gzip, bzip2, or DEFLATE.
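As a minimal sketch of saving or transmitting extracted data in a losslessly compressed state, the following Python code serializes hypothetical rows to JSON and compresses them with gzip from the standard library. The choice of JSON as the serialization format is an assumption; the publication names compression algorithms but no serialization.

```python
import gzip
import json

# Hypothetical extracted target data.
rows = [
    {"doc_no": 1, "plant": "P100", "ref_no": "R01"},
    {"doc_no": 2, "plant": "P200", "ref_no": "R02"},
]

# Serialize, then compress losslessly before saving or transmitting.
payload = json.dumps(rows, sort_keys=True).encode("utf-8")
compressed = gzip.compress(payload)

# The receiver decompresses and recovers exactly the same rows.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
```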
The extracted target data may also be used for various purposes of use, such as transmission, conversion, de-identification, and encryption. It may also be used for backup or recovery in order to prevent the loss or corruption of data.
When the extracted target data is stored in or transmitted to another medium, the following data reflection rules may be considered:
(1) Clear & Insert (Clear & Insert): all existing data is deleted before the screened data is inserted.
(2) Modify (Modify): if screened data matching existing data exists, the existing data is updated with the screened data.
(3) Append (Append): screened data is inserted only if no existing data matches it.
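The three data reflection rules can be sketched as a single function over keyed rows. The rule names, the key name `doc_no`, and the sample rows are hypothetical, and the sketch assumes that under Modify a screened row without a matching existing row is simply ignored, which the publication does not state.

```python
def reflect(existing, screened, rule, key="doc_no"):
    """Apply one data reflection rule and return the resulting rows."""
    if rule not in ("clear_insert", "modify", "append"):
        raise ValueError(f"unknown rule: {rule}")
    if rule == "clear_insert":
        # Delete everything that exists, then insert the screened rows.
        return [dict(r) for r in screened]
    by_key = {r[key]: dict(r) for r in existing}
    for row in screened:
        if rule == "modify" and row[key] in by_key:
            by_key[row[key]] = dict(row)  # update the matching row only
        elif rule == "append" and row[key] not in by_key:
            by_key[row[key]] = dict(row)  # insert only rows that are new
    return list(by_key.values())


existing = [{"doc_no": 1, "amount": 10}, {"doc_no": 2, "amount": 20}]
screened = [{"doc_no": 2, "amount": 25}, {"doc_no": 3, "amount": 30}]

cleared = reflect(existing, screened, "clear_insert")
modified = reflect(existing, screened, "modify")
appended = reflect(existing, screened, "append")
```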
As such, according to embodiments of the present invention, target data can be screened in consideration of the dependencies between tables. In addition, the data in a database can be handled in modular units through selection, combination, exclusion, addition, and the like, so that the required target data can be screened for a variety of data processing purposes such as archiving, cleaning, transmission, conversion, encryption, and de-identification.
The systems or devices described above may be embodied as hardware components, software components, and/or combinations of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications running on the OS. The processing device may also access, store, manipulate, process, and generate data in response to the execution of the software. For ease of understanding, the description sometimes refers to a single processing device, but a person having ordinary skill in the art to which the present invention pertains will appreciate that a processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include multiple processors, or a processor and a controller. Other processing configurations, such as parallel processors, are also possible.
The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, virtual device, or computer storage medium or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over network-connected computer systems and stored or executed in a distributed fashion. The software and data may be stored in one or more computer-readable storage media.
The methods of the embodiments may be embodied in the form of program instructions that can be executed by various computer means and recorded on computer-readable media. The computer-readable media may include program instructions, data files, data structures, and the like, alone or in combination. The medium may store the computer-executable program persistently, or may store it temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or a combination of several pieces of hardware; it is not limited to a medium directly connected to a computer system and may be distributed over a network. Examples of the medium include magnetic media such as a hard disk, a floppy disk, and a magnetic tape; optical storage media such as a CD-ROM and a DVD; magneto-optical media such as a floptical disk; and a read-only memory, a random access memory, and the like, configured to store program instructions. Other examples of the medium include an application store that distributes applications, a website that provides or distributes various other software, and a recording or storage medium managed by a server or the like. Examples of the program instructions include not only machine language code such as that generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.
As described above, although the embodiments have been described with reference to a limited number of embodiments and drawings, those skilled in the art to which the present invention pertains can make various modifications and variations based on the above description. For example, an appropriate result can be achieved even if the described techniques are performed in an order different from the described method, and/or the described components of the system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.
Therefore, other examples, other embodiments, and equivalents of the claimed invention are also within the scope of the claimed invention.

Claims (9)

1. A method of screening target data, performed in a computer apparatus comprising at least one processor, wherein,
the method comprises the following steps:
grouping, by the at least one processor, data contained in a database into objects that are minimum flow processing units, based on dependencies between tables in the database;
classifying, by the at least one processor, the objects according to at least one of a process-based application area, a data type, and a characteristic;
setting, by the at least one processor, a screening criterion for the target data based on at least one field of the data contained in the database and the classification of the objects; and
extracting, by the at least one processor, target data corresponding to the set screening criterion in consideration of the dependencies among the tables in the objects,
wherein at least one of the grouped objects includes two or more tables having dependencies with each other.
2. The method of claim 1, wherein the step of setting a filtering criterion sets the filtering criterion of the target data based on at least one of the following fields:
(1) a period field having a range of a specific time or period related to the data as a field value;
(2) an organization field having a field value for at least one of a legal code, a factory, a business organization, and a department; and
(3) other property fields having field values for at least one of region, country code, language, user, file type, customer group, and creator.
3. The method of screening target data according to claim 1, further comprising: the utilization target of the target data is set to any one of selective archiving, backup, cleaning, transmission, conversion, de-identification and encryption by the at least one processor.
4. The method of screening target data according to claim 3, wherein the objects of the screening criterion and the screening target are changed according to the set target of use of the target data.
5. The method of claim 1, wherein in the step of extracting the target data, key values of a top head table among tables in the object are searched, and data corresponding to the searched key values are sequentially extracted according to dependency between tables in the object, thereby extracting the target data.
6. The method of screening target data according to claim 1, further comprising: by the at least one processor, an additional reference is set for data to be excluded from the destination data or data to be added to the destination data.
7. A computer program stored on a computer-readable storage medium for use in conjunction with a computer device to perform the method of any one of claims 1 to 6 in the computer device.
8. A computer-readable storage medium, characterized by storing a computer program for executing the method of any one of claims 1 to 6 in a computer apparatus.
9. A computer device, characterized in that,
the computer device comprises:
at least one processor configured to execute computer-readable instructions;
wherein the at least one processor performs:
grouping data contained in a database into objects that are minimum flow processing units, based on dependencies between tables in the database;
classifying the objects according to at least one of a process-based application area, a data type, and a characteristic;
setting a screening criterion for target data based on at least one field of the data contained in the database and the classification of the objects; and
extracting target data corresponding to the set screening criterion in consideration of the dependencies between the tables in the objects,
wherein at least one of the grouped objects includes two or more tables having dependencies with each other.
CN202110896757.3A 2020-09-10 2021-08-05 Method and system for screening target data Pending CN114168628A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020200116251A KR102256814B1 (en) 2020-09-10 2020-09-10 Method and system for selecting target data
KR10-2020-0116251 2020-09-10

Publications (1)

Publication Number Publication Date
CN114168628A true CN114168628A (en) 2022-03-11

Family

ID=76135415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110896757.3A Pending CN114168628A (en) 2020-09-10 2021-08-05 Method and system for screening target data

Country Status (4)

Country Link
US (1) US11372899B2 (en)
JP (1) JP7300684B2 (en)
KR (1) KR102256814B1 (en)
CN (1) CN114168628A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20240032493A (en) * 2022-09-02 2024-03-12 주식회사 아미크 Method and system for visualizing target data

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3878507B2 (en) * 2002-03-27 2007-02-07 株式会社エイコット Database system
JP2003288513A (en) * 2002-03-28 2003-10-10 Japan Tobacco Inc Order entry management system equipped with returned article processing function
US20050073731A1 (en) * 2003-10-03 2005-04-07 Deer Anna Y. Color correction method for an imaging system
EP1687767A1 (en) * 2003-11-14 2006-08-09 Koninklijke Philips Electronics N.V. Product data exchange
JP4700462B2 (en) * 2005-09-27 2011-06-15 株式会社野村総合研究所 Database usage system
US9892026B2 (en) * 2013-02-01 2018-02-13 Ab Initio Technology Llc Data records selection
KR20150056989A (en) * 2013-11-18 2015-05-28 주식회사 케이티 CRM apparatus and CRM system for supporting user's design of marketing strategy and method thereof
KR101663547B1 (en) * 2016-02-26 2016-10-07 주식회사 아미크 Method and apparatus for archiving and searching database
KR101656750B1 (en) * 2016-02-26 2016-09-23 주식회사 아미크 Method and apparatus for archiving and searching database with index information
KR20180096066A (en) * 2017-02-20 2018-08-29 주식회사 핸디소프트 Apparatus and method for converting source code for user interface
CN109325218B (en) * 2017-08-01 2024-09-06 珠海金山办公软件有限公司 Data screening statistical method and device, electronic equipment and storage medium
US10698674B2 (en) * 2018-02-06 2020-06-30 Smartshift Technologies, Inc. Systems and methods for entry point-based code analysis and transformation
US10528343B2 (en) * 2018-02-06 2020-01-07 Smartshift Technologies, Inc. Systems and methods for code analysis heat map interfaces
KR102076555B1 (en) * 2018-06-22 2020-02-12 주식회사 한글과컴퓨터 Spreadsheet document editing apparatus for providing filtering functionality based on data patterns and operating method thereof

Also Published As

Publication number Publication date
KR102256814B1 (en) 2021-05-27
JP7300684B2 (en) 2023-06-30
US11372899B2 (en) 2022-06-28
JP2022046415A (en) 2022-03-23
US20220075802A1 (en) 2022-03-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination