WO2016039238A1

WO2016039238A1 - Information processing device, information processing method, and program

Info

Publication number: WO2016039238A1
Application number: PCT/JP2015/074972
Authority: WO
Inventors: 細川　晃
Original assignee: 株式会社東芝
Priority date: 2014-09-11
Filing date: 2015-09-02
Publication date: 2016-03-17
Also published as: JP2016057970A

Abstract

[Problem] To reduce the volume of text format data including headers and instances. [Solution] According to an embodiment, provided is an information processing device, comprising a data configuration coding unit which, on the basis of a header included in text format data, allocates data configuration identification information to a combination of class identification information which identifies a class into which a subject which the data describes is classified, and property definition information which defines combinations of properties which characterize the class and sequences thereof, said data configuration identification information identifying the said combination. The information processing device further comprises a header configuration coding unit which, on the basis of the data, allocates header configuration identification information to header configuration information which defines a combination and a sequence of instructions which are heads recited in each line of the header, said header configuration identification information identifying the said header configuration information. The information processing device further comprises a compressed data generating unit which generates compressed data which includes the data configuration identification information, the header configuration identification information, and an instance which is included in the data.

Description

Information processing apparatus, information processing method, and program

Embodiments described herein relate generally to an information processing apparatus, an information processing method, and a program.

In recent years, with the globalization and liberalization of industry, interoperability between different systems has become important. To realize interoperability between heterogeneous systems, standard ontologies are being developed as standard specifications for electronically describing, exchanging, and registering the performance, features, and services of devices in each field. IEC 62656 (referred to as the parcel standard) is known as a standard for registering and exchanging such standard ontologies.

IEC62656 is an international standard that defines a data exchange format using a spreadsheet. This international standard spreadsheet consists of two parts: a header section and a data section. The header section provides semantic and structural information for the description of the instance in the data section. The header section is further composed of a class header section that describes information on the entire sheet and a schema header section that describes property information for describing instances in individual columns. In the data section, one instance is described in one row, and the value of the corresponding property is described in each cell. Hereinafter, data described using this spreadsheet as a format will be referred to as parcel data using the term “parcel”, which is a common name of IEC62656.

In parcel data, metadata is described only once per column in the header section, so the property to which each cell value belongs can be identified from the tabular structure. From the viewpoint of data size, this is an advantage over other data formats such as XML, in which each value must be structured by delimiting it with tags.

JP 2009-77141 A JP 2007-214627 A

However, when the data section contains only one or a few instances, the header described in the header section can be larger than the data body. For example, if there are 20 properties and each property ID consists of 6 alphanumeric characters, the property IDs in the header alone consume 120 characters. For this reason, when the data capacity that can be stored is limited, as with a two-dimensional code, not all of the data may fit.

Therefore, a problem to be solved by the embodiment of the present invention is to provide an information processing apparatus, an information processing method, and a program capable of reducing the capacity of text data including a header and an instance.

According to one embodiment, an information processing apparatus includes a data configuration encoding unit that, based on a header included in text format data, assigns data configuration identification information to a set consisting of class identification information, which identifies the class into which the target described by the data is classified, and attribute defining information, which defines the combination of attributes characterizing the class and their order. The data configuration identification information identifies this set. The information processing apparatus further includes a header configuration encoding unit that, based on the data, assigns header configuration identification information to header configuration information, which defines the combination and order of the instructions that are the headings described in the individual rows of the header. The header configuration identification information identifies this header configuration information. The information processing apparatus further includes a compressed data generation unit that generates compressed data including the data configuration identification information, the header configuration identification information, and an instance included in the data.

A diagram showing the configuration of the information processing system 1 according to the present embodiment.
A diagram illustrating a display form of parcel data and its text representation.
A diagram showing the configuration of the compressed data generation device 300 according to the present embodiment.
A functional block diagram of the compressed data generation device 300 according to the present embodiment.
A diagram showing the configuration of the data management device 330 according to the present embodiment.
A functional block diagram of the data management device 330 according to the present embodiment.
A diagram illustrating the structure of the structure table T1 stored in the remote DB 332.
A diagram illustrating the structure of the alias table T2 stored in the remote DB 332.
A diagram illustrating the structure of the cell column table T3 stored in the remote DB 332.
A diagram illustrating the structure of the header table T4 stored in the remote DB 332.
A flowchart showing an example of the processing flow of the data configuration encoding unit 302 of the compressed data generation device 300.
A diagram illustrating a case where the processing of the data configuration encoding unit 302 is applied to the parcel data of FIG.
A flowchart showing an example of the processing flow of the header configuration encoding unit 303 of the compressed data generation device 300.
A flowchart showing an example of the processing flow for the class header section in step S702 of FIG.
A flowchart showing an example of the processing flow for an instruction in step S900 of FIG.
A flowchart showing an example of the processing flow for the schema header section in step S703 of FIG.
A diagram illustrating a case where the processing of the header configuration encoding unit 303 is applied to the parcel data of FIG.
A diagram showing an example of processing other parcel data in a state where the header information of the parcel data of FIG. 2 is registered in the remote DB 332 of the data management device 330.
A flowchart showing an example of the processing flow of the compressed data generation unit 305 of the compressed data generation device 300.
A diagram showing an example of the compressed parcel data output by the series of processes of the compressed data generation device 300 in the present embodiment, using the parcel data of FIG.
A diagram showing the configuration of the compressed data generation device 300 according to the present embodiment.
A functional block diagram of the data restoration device 360 according to the present embodiment.
A flowchart showing an example of the processing flow of the determination unit 362 of the data restoration device 360.
A flowchart showing an example of the processing flow of the header information acquisition unit 363 of the data restoration device 360.
A flowchart showing an example of the processing flow of the restoration unit 367 of the data restoration device 360.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

The data handled in this embodiment is data in a text format that includes a header and instances, and the header can be represented by a spreadsheet configured in a matrix. In the present embodiment, an example will be described using data conforming to IEC 62656 (hereinafter referred to as parcel data).

First, the configuration of the information processing system 1 according to the present embodiment will be described. FIG. 1 is a diagram illustrating a configuration of an information processing system 1 according to the present embodiment. As illustrated in FIG. 1, the information processing system 1 includes a compressed data generation device (information processing device) 300, a data management device 330, and a data restoration device (information processing device) 360. The compressed data generation device 300, the data management device 330, and the data restoration device 360 are connected to each other via the network 150 and can communicate with each other.

The compressed data generation apparatus 300 acquires original data in text format (for example, original parcel data) and compresses the acquired original data to generate compressed data (for example, compressed parcel data). The compressed data generation device 300 is, for example, a terminal device.

The data management device 330 stores header information for restoring the compressed data to the original data. The data management device 330 is, for example, a server that stores header information.

The data restoration device 360 restores the original data (for example, the original parcel data) from the compressed data (for example, the compressed parcel data) using the header information stored in the data management device 330. The data restoration device 360 is, for example, a terminal device.

Subsequently, the parcel data structure according to the present embodiment will be described with reference to FIG. The parcel data according to the present embodiment is expressed by a spreadsheet in a format defined by the international standard IEC62656 relating to registration and exchange of product ontology using a spreadsheet. FIG. 2 is a diagram illustrating a display form of parcel data and a text representation thereof.

As shown in Table D1 in the upper part of FIG. 2, parcel data consists, in the row direction, of two parts arranged in order from the head of the data: a header section and a data section. In the column direction, the instruction column and the cell columns are arranged in this order. The header section is further composed of two parts: a class header section for describing information related to the entire parcel data, and a schema header section having a set of properties and values for describing instances in the data section. Here, one instance is described in one row of the data section. In the example of FIG. 2, only one instance is described, so the set of values in the area surrounded by the thick frame represents one instance. When two or more instances are described, the data section contains as many rows as there are instances, and each instance is described in its own row.

The instruction column is the first column of parcel data, and in the header section, an instruction that indicates what the header of each row represents to the computer and the user is described. In the instruction of the header section, an instruction word predefined in IEC62656 or an instruction word uniquely defined by the user is described following a # (pound) symbol.

Each row of the class header section has only an instruction column, in which the # symbol and the instruction word are followed by “:=” and then the value corresponding to the instruction word. For example, “#CLASS_ID” in #CLASS_ID:=AAX001, described in the first cell of the first row of the table in FIG. 2, is one of the instructions of the class header section and indicates that this parcel data is a sheet for describing instances of the class specified by AAX001. AAX001 is an example of a class identifier for identifying a class corresponding to a classification of goods and services. For example, when the upper class is electrical products, its lower classes include individual classes such as motors, personal computers, and flash memories, and these individual classes are identified by class identifiers.

Cell column is a column for describing properties and their values. The parcel data has one or more cell columns, and the cell columns are developed in order from the second column onward after the instruction column. In each row of the schema header section, the ID, name, data type, unit, etc. of the property assigned to each cell column are described based on the instruction word described in the instruction column of the same row.

For example, the first cell of the fourth row of the table in FIG. 2 contains #PROPERTY_ID, which indicates that the property IDs identifying the properties are laid out in the cell columns of this row, and the individual property IDs are listed in order in the second and subsequent columns. In the other rows of the schema header section, property information such as the name, data type, and unit is listed in the same order as the property IDs, according to the instruction word described in the instruction column of each row.

Next, one or more instances are listed in the row direction in the data section. Here, an instance is represented by a set of pairs, each consisting of a property and its value, and the values of the properties included in one instance are listed in one row. If a # symbol is written in the instruction column of a row in the data section, that row is treated as a comment line and ignored by the system.

In parcel data having such a structure, the computer and the user identify the boundary between the header section and the data section by scanning the instruction column in order from the first row and finding the first row whose cell value does not start with the # symbol. The row found in this way is the first row of the data section, and the rows before it constitute the header section.
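As a non-limiting illustration, the boundary search described above can be sketched in Python as follows. The function name `split_header_and_data` and the sample property IDs and values are hypothetical and do not appear in the embodiment; the sketch assumes the parcel data is given as CSV text.

```python
import csv
import io


def split_header_and_data(csv_text):
    """Split parcel-style CSV text into (header_rows, data_rows).

    The instruction column (column 0) is scanned from the first row.
    The first row whose cell does not start with '#' is treated as
    the first row of the data section.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    for index, row in enumerate(rows):
        first_cell = row[0] if row else ""
        if not first_cell.startswith("#"):
            return rows[:index], rows[index:]
    # Every row starts with '#': no data section was found.
    return rows, []


# Hypothetical sample; the property IDs and values are made up.
sample = (
    "#CLASS_ID:=AAX001\n"
    "#PROPERTY_ID,P00001,P00002\n"
    ",10,20\n"
)
header_rows, data_rows = split_header_and_data(sample)
```

Rows after the boundary whose instruction column starts with # remain in `data_rows`; under the rule stated earlier they would be handled as comment lines by later processing.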

Next, the text D2 in the lower part of FIG. 2 is the CSV (Comma-Separated Values) representation of the table D1 in the upper part of FIG. 2. The specification of the CSV format is disclosed in the non-patent document RFC 4180, and the format is implemented as one of the standard methods for exchanging spreadsheet data in many applications that can interpret spreadsheets. In the present embodiment, the CSV format is taken as an example, but other formats that express spreadsheet data as text may be used. For example, TSV (Tab-Separated Values), which uses tabs instead of commas as cell delimiters, may be used.

Subsequently, the configuration of the compressed data generation apparatus 300 according to the present embodiment will be described with reference to FIG. FIG. 3 is a diagram illustrating a configuration of the compressed data generation device 300 according to the present embodiment. As illustrated in FIG. 3, the compressed data generation device 300 includes a CPU (Central Processing Unit) 101, a ROM 102, a RAM 103, a storage device 104, a medium reading device 106, a bus controller 107, a display device 108, an input device 109, and a communication unit 304. As shown in FIG. 3, the components of the compressed data generation device 300 are connected via the bus controller 107 and can exchange data with each other.

CPU 101 controls the entire compressed data generating apparatus 300.

The ROM 102 stores various data and various programs read and executed by the CPU 101.

The RAM 103 is a primary storage device that temporarily stores various programs read by the CPU 101.

The storage device 104 stores various data and various programs that the CPU 101 reads and executes. The storage device 104 is, for example, a hard disk drive (Hard Disk Drive: HDD).

The medium reading device 106 is a drive device for reading data recorded on a computer-readable storage medium (for example, a CD (Compact Disc)). The programs for executing each process of the CPU 101 according to the present embodiment may be recorded on a computer-readable recording medium.

The display device 108 displays information according to control by the CPU 101.

The input device 109 receives an instruction input or operation by the user. The input device 109 is, for example, a keyboard or a mouse.

The communication unit 304 communicates with the data management device 330 having the storage device 114 described later via the network 150. This communication may be wired or wireless.

Subsequently, a functional configuration of the compressed data generation apparatus 300 according to the present embodiment will be described with reference to FIG. FIG. 4 is a functional block diagram of the compressed data generation apparatus 300 according to this embodiment. The CPU 101 reads out a program from the ROM 102 or the storage device 104 to the RAM 103 and executes the program, or executes a program read from the computer-readable storage medium into the RAM 103 by the medium reading device 106. By executing this program, a header acquisition unit 301, a data configuration encoding unit 302, a header configuration encoding unit 303, and a compressed data generation unit 305 are generated on the RAM 103.

The compressed data generation process in the compressed data generation device 300 is started when the user performs a compressed data generation operation for selecting the parcel data 308 via the input device 109 while viewing the screen displayed on the display device 108.

Upon receiving a compressed data generation operation by the user from the input device 109, the header acquisition unit 301 reads the parcel data 308 selected by the user into the RAM 103 and extracts the header section information (hereinafter referred to as a header) included in the parcel data. In this way, the header acquisition unit 301 acquires the header included in the parcel data 308. The header acquisition unit 301 then converts the acquired header into text format data and passes the text format data and the parcel data 308 to the data configuration encoding unit 302.

Using the header included in the text format data, the data configuration encoding unit 302 assigns data configuration identification information (for example, a structure ID described later) to a set consisting of class identification information (for example, a class ID described later), which identifies the class into which the article or service described by the data is classified, and attribute defining information (for example, cell column text described later), which defines the combination of properties (attributes) that characterize this class and their order. As a result, different data configuration identification information is assigned to each set of class identification information and attribute defining information, so the data configuration identification information functions as information for identifying the data configuration.

Specifically, for example, the data configuration encoding unit 302 extracts the class ID, which is the value of the instruction #CLASS_ID, and the values of the cell columns in the row of the instruction #PROPERTY_ID (hereinafter, cell column text). The data configuration encoding unit 302 then generates a structure ID by text-encoding the set of the extracted class ID and cell column text. Thus, as an example, the data configuration encoding unit 302 generates the data configuration identification information by text-encoding the set of class identification information and attribute defining information. This text encoding is, for example, the calculation of a hash value using a hash function.
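As a non-limiting illustration, the extraction and text encoding described above might be sketched in Python as follows. The embodiment only states that a hash function may be used; the choice of SHA-256, the truncation to 8 hexadecimal characters, the newline used to join the two inputs, and the function names are all assumptions made for this sketch.

```python
import hashlib


def extract_class_id_and_cell_column_text(header_rows, delimiter=","):
    """Return (class_id, cell_column_text) from parsed header rows.

    header_rows is a list of rows, each a list of cell strings.
    """
    class_id = None
    cell_column_text = None
    for row in header_rows:
        instruction_cell = row[0]
        if instruction_cell.startswith("#CLASS_ID:="):
            # Class header rows hold "instruction:=value" in one cell.
            class_id = instruction_cell.split(":=", 1)[1]
        elif instruction_cell == "#PROPERTY_ID":
            # Concatenate the cell columns in their original order.
            cell_column_text = delimiter.join(row[1:])
    return class_id, cell_column_text


def make_structure_id(class_id, cell_column_text, length=8):
    """Text-encode the pair into a short ID (assumed: truncated SHA-256)."""
    key = class_id + "\n" + cell_column_text
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:length]


# Hypothetical header; the property IDs are made up.
header = [["#CLASS_ID:=AAX001"], ["#PROPERTY_ID", "P00001", "P00002"]]
class_id, cell_column_text = extract_class_id_and_cell_column_text(header)
structure_id = make_structure_id(class_id, cell_column_text)
```

Because the cell column text preserves column order, the same properties in a different order yield a different input to the hash. A shorter ID saves more space but raises the chance of a collision between different structures, so the length would need to be chosen with the expected number of structures in mind.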

Then, to register the set of the class ID, cell column text, and structure ID in the remote DB 332 of the data management device 330, the data configuration encoding unit 302 transmits this set, together with a header registration request, from the communication unit 304 to the data management device 330. That is, the data configuration encoding unit 302 transmits the data to be stored in the storage device 114 from the communication unit 304 to the data management device 330 and causes the data management device 330 to store it. Further, the data configuration encoding unit 302 passes the set of the class ID, cell column text, and structure ID to the header configuration encoding unit 303. Details of the operation of the data configuration encoding unit 302 will be described later.

Using the parcel data 308, the header configuration encoding unit 303 assigns header configuration identification information (for example, a header ID described later) to header configuration information (for example, header text described later), which defines the combination and order of the instructions that are the headings described in the individual rows of the header. The header configuration identification information identifies this header configuration information.

Specifically, for example, the header configuration encoding unit 303 processes the class header section and the schema header section of the header section of the parcel data 308 and generates header text, which is text data representing the configuration of the header section of the parcel data. The header configuration encoding unit 303 then text-encodes the header text to generate a header ID. Thus, as an example, the header configuration encoding unit 303 generates the header configuration identification information by text-encoding the header configuration information. This text encoding is, for example, the calculation of a hash value using a hash function.

Then, to register the pair of the header text and header ID in the remote DB 332 of the data management device 330, the header configuration encoding unit 303 transmits this pair, together with a header registration request, to the data management device 330 via the communication unit 304. That is, the header configuration encoding unit 303 transmits the data to be stored in the storage device 114 from the communication unit 304 to the data management device 330 and causes the data management device 330 to store it. Further, the header configuration encoding unit 303 passes the pair of the header text and header ID to the compressed data generation unit 305. Details of the operation of the header configuration encoding unit 303 will be described later.

As described above, the communication unit 304 transmits the data passed from the data configuration encoding unit 302 and the header configuration encoding unit 303 to the data management device 330 via the network 150.

The compressed data generation unit 305 generates compressed data including the data configuration identification information (for example, the structure ID), the header configuration identification information (for example, the header ID), and an instance included in the parcel data 308. Specifically, for example, the compressed data generation unit 305 generates a header in which the structure ID generated by the data configuration encoding unit 302 is the value of the instruction #CLASS_ID and the header ID generated by the header configuration encoding unit 303 is the value of a predetermined instruction (for example, #HEADER). The compressed data generation unit 305 then generates, as an example of compressed data, compressed parcel data 309 by combining the generated header with the data of the data section of the parcel data 308, and outputs the compressed parcel data 309 to the outside of the compressed data generation apparatus 300. Details of the operation of the compressed data generation unit 305 will be described later.
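As a non-limiting illustration, the assembly of compressed parcel data might be sketched in Python as follows. The function name and the sample IDs are hypothetical, and the exact layout (one #CLASS_ID row, one #HEADER row, each written as instruction, “:=”, and value, followed by the data rows in CSV) is an assumption based on the description above rather than a layout confirmed by the embodiment.

```python
import csv
import io


def build_compressed_parcel(structure_id, header_id, data_rows):
    """Combine a two-row header with the original data section rows.

    data_rows is a list of rows, each a list of cell strings, taken
    unchanged from the data section of the original parcel data.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Assumed layout: the structure ID replaces the class ID value.
    writer.writerow(["#CLASS_ID:=" + structure_id])
    # Assumed layout: the header ID is the value of #HEADER.
    writer.writerow(["#HEADER:=" + header_id])
    writer.writerows(data_rows)
    return buffer.getvalue()


# Hypothetical IDs and instance values.
compressed = build_compressed_parcel("s1a2b3c4", "h9f8e7d6", [["", "10", "20"]])
```

In this sketch the header shrinks to two short rows regardless of how many schema header rows and properties the original parcel data had, which is where the size reduction comes from.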

Subsequently, the configuration of the data management apparatus 330 according to the present embodiment will be described with reference to FIG. FIG. 5 is a diagram illustrating a configuration of the data management apparatus 330 according to the present embodiment. As shown in FIG. 5, the data management device 330 includes a CPU (Central Processing Unit) 111, a ROM 112, a RAM 113, a storage device 114, a medium reading device 116, a bus controller 117, a display device 118, an input device 119, and a communication unit 333. As shown in FIG. 5, the components of the data management device 330 are connected via the bus controller 117 and can exchange data with each other.

CPU 111 controls the entire data management device 330.

The ROM 112 stores various data and various programs that the CPU 111 reads and executes.

The RAM 113 is a primary storage device that temporarily stores various programs read by the CPU 111.

The storage device 114 stores various data. The storage device 114 is, for example, a hard disk drive (Hard Disk Drive: HDD).

The medium reading device 116 is a drive device for reading data recorded on a computer-readable storage medium (for example, a CD (Compact Disc)). The programs for executing each process of the CPU 111 according to the present embodiment may be recorded on a computer-readable recording medium.

Display device 118 displays information in accordance with control by CPU 111.

The input device 119 accepts an instruction input or operation by a user using the data management device 330. The input device 119 is, for example, a keyboard or a mouse.

The communication unit 333 communicates with the compressed data generation apparatus 300 via the network 150. This communication may be wired or wireless.

Subsequently, a functional configuration of the data management apparatus 330 according to the present embodiment will be described with reference to FIG. FIG. 6 is a functional block diagram of the data management apparatus 330 according to the present embodiment. The CPU 111 reads the program from the ROM 112 to the RAM 113 and executes the program, or executes the program read by the medium reading device 116 from the computer-readable storage medium to the RAM 113. By executing this program, a data management unit 331 is generated on the RAM 113. Further, the storage device 114 stores a remote DB 332.

The data management unit 331 interprets the header registration request received from the compressed data generation device 300 via the communication unit 333 and registers the data received from the compressed data generation device 300 via the communication unit 333 in the remote DB 332 as header information. In this way, the remote DB 332 stores header information generated as a result of the processing of the compressed data generation device 300. The data management unit 331 also interprets header information inquiry requests received from the compressed data generation device 300 and the data restoration device 360 via the communication unit 333, extracts the header information from the remote DB 332, and transmits the header information via the communication unit 333.

The remote DB 332 includes a structure table T1, an alias table T2, a cell column table T3, and a header table T4. These tables store data generated by the processing of the data configuration encoding unit 302 and the header configuration encoding unit 303.

FIG. 7A is a diagram illustrating the structure of the structure table T1 stored in the remote DB 332. The structure table T1 has a structure_id field for storing the structure ID generated by the processing of the data configuration encoding unit 302, a class_id field for storing, in association with the structure ID, the class ID that is the value of the instruction #CLASS_ID, and a cell_column_text field for storing text obtained by concatenating, with a delimiter, the contents described in the cell columns of the row of the instruction #PROPERTY_ID. The structure table T1 stores sets of these as records.

FIG. 7B is a diagram illustrating the structure of the alias table T2 stored in the remote DB 332. The alias table T2 is used when an alias for an instruction (hereinafter referred to as an alias) is created in the processing of the header configuration encoding unit 303. It has a header_id field for storing the header ID that uses the alias, an alias field for storing the alias of the instruction, and an original field for storing the original instruction, and stores sets of these as records.

FIG. 7C is a diagram illustrating the structure of the cell column table T3 stored in the remote DB 332. The cell column table T3 has a structure_id field that references the structure_id field of structure_table as a foreign key, an instruction field for storing an instruction associated with the structure ID, and an instruction_value field for storing the value associated with the instruction, and stores sets of these as records.

FIG. 7D is a diagram illustrating the structure of the header table T4 stored in the remote DB 332. The header table T4 has a header_id field for storing the header ID generated by the processing of the header configuration encoding unit 303 and a header_text field for storing the header text associated with the header ID, and stores sets of these as records.

Here, when the instruction is in the class header section, that is, when the header row consists of a single column and the instruction and its value are separated by the string “:=”, the instruction together with the string “:=” is stored in the instruction field, and the value is stored in the instruction_value field.

On the other hand, if the instruction is in the schema header section, the instruction itself is stored in the instruction field, and the text created by combining the cell columns with a delimiter is stored in the instruction_value field.

In the above description, it is assumed that data is stored in a relational database, but other types of databases such as an XML database may be used as long as similar information can be stored. Further, the table name, field name, and table configuration are not limited to the above as long as similar information can be stored. For example, a column may be added for other processing, or another table may be added.
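As a non-limiting illustration, the four tables might be declared in a relational database as in the following Python sketch using SQLite. The field names and the table name structure_table come from the description above; the other table names, the column types, and the key constraints are assumptions made for this sketch.

```python
import sqlite3

# Assumed: all fields are TEXT; alias_table, cell_column_table, and
# header_table are hypothetical table names.
SCHEMA = """
CREATE TABLE structure_table (
    structure_id     TEXT PRIMARY KEY,
    class_id         TEXT NOT NULL,
    cell_column_text TEXT NOT NULL
);
CREATE TABLE alias_table (
    header_id TEXT NOT NULL,
    alias     TEXT NOT NULL,
    original  TEXT NOT NULL
);
CREATE TABLE cell_column_table (
    structure_id      TEXT NOT NULL REFERENCES structure_table(structure_id),
    instruction       TEXT NOT NULL,
    instruction_value TEXT
);
CREATE TABLE header_table (
    header_id   TEXT PRIMARY KEY,
    header_text TEXT NOT NULL
);
"""


def create_remote_db(path=":memory:"):
    """Create the tables and return the open connection."""
    connection = sqlite3.connect(path)
    connection.executescript(SCHEMA)
    return connection


connection = create_remote_db()
# Hypothetical record: a structure ID for class AAX001 with two properties.
connection.execute(
    "INSERT INTO structure_table VALUES (?, ?, ?)",
    ("s1a2b3c4", "AAX001", "P00001,P00002"),
)
```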

Subsequently, details of each process of the compressed data generation apparatus 300 will be described using respective flowcharts. First, the operation of the data configuration encoding unit 302 of the compressed data generation apparatus 300 will be described with reference to FIG. FIG. 8 is a flowchart illustrating an example of a process flow of the data configuration encoding unit 302 of the compressed data generation device 300.

(Step S501) First, the data configuration encoding unit 302 acquires the text data of the header section from the header acquisition unit 301.

(Step S502) Next, the data configuration encoding unit 302 extracts the class ID from the value of the instruction #CLASS_ID of this text data.

(Step S503) Next, the data configuration encoding unit 302 extracts cell column data, that is, cell column text, from the row of instruction #PROPERTY_ID of this text data.

(Step S504) Next, using the combination of the extracted class ID and cell column text as a key, the data configuration encoding unit 302 inquires of the data management apparatus 330 whether a record including that combination has already been registered in the structure table T1.

(Step S505) Next, the data configuration encoding unit 302 uses the inquiry result of step S504 to determine whether a record including the set of class ID and cell column text has already been registered in the structure table T1 of the remote DB 332.

(Step S506) When a record including the set of class ID and cell column text is already registered in the structure table T1 of the remote DB 332 (YES in step S505), the data configuration encoding unit 302 acquires the structure ID from the structure table T1 of the remote DB 332. Thereafter, the process proceeds to step S509.

(Step S507) On the other hand, when a record including the set of the class ID and the cell column text is not yet registered in the structure table T1 of the remote DB 332 (NO in step S505), the data configuration encoding unit 302 text-encodes the set of the extracted class ID and cell column text to generate a structure ID.

(Step S508) The data configuration encoding unit 302 registers the set of the structure ID, the class ID, and the cell column text in the structure table T1 of the remote DB 332 of the data management device 330 via the communication unit 304. As described above, the data configuration encoding unit 302 extracts the class identification information from the parcel data 308, generates the attribute definition information from the parcel data 308, and determines the combination of the class identification information and the attribute definition information. Data configuration identification information is assigned, and the data configuration identification information, class identification information, and attribute defining information are associated with each other and stored in the storage device 114. Then, the process proceeds to step S509.

(Step S509) Finally, the data configuration encoding unit 302 passes the structure ID, the header section text data, and the parcel data 308 to the header configuration encoding unit 303, and ends the processing.

FIG. 9 is a diagram illustrating a case where the processing of the data configuration encoding unit 302 is applied to the parcel data in FIG. As shown in FIG. 9, the value of the instruction #CLASS_ID and the cell column text of #PROPERTY_ID are extracted by the processing of the data configuration encoding unit 302, and the structure ID CCL001 is generated by text encoding. As described above, the cell column text which is an example of the attribute defining information is information in which values associated with attribute identification information #PROPERTY_ID for identifying an attribute are arranged in the order of appearance in the header.

Note that the structure ID may be generated by the data configuration encoding unit 302 using a text encoding function such as a hash function with a text obtained by concatenating the class ID and the cell column text as an input.

Alternatively, without using such a function, the data configuration encoding unit 302 may use a character string designated by the user as the structure ID. As described above, the data configuration encoding unit 302 may assign the character string received from the user by the input device 109 to the data configuration information (for example, the structure ID).

Alternatively, the data configuration encoding unit 302 may manage sequential alphanumeric characters, and the data configuration encoding unit 302 may acquire the next sequential alphanumeric characters as the structure ID from the managed alphanumeric characters.
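The lookup-or-generate flow of steps S504 to S508, combined with the hash-function option mentioned above, can be sketched as follows. This is only an illustration under stated assumptions: a Python dictionary stands in for the structure table T1, SHA-256 is one possible text encoding function, and the "CCL-" prefix, the eight-character truncation, and the function name are hypothetical.

```python
import hashlib


def get_or_create_structure_id(class_id, cell_column_text, structure_table):
    """Return the structure ID for a (class ID, cell column text) pair.

    structure_table is a dict standing in for table T1; it maps
    (class_id, cell_column_text) to a structure ID.
    """
    key = (class_id, cell_column_text)
    if key in structure_table:
        # YES in step S505: reuse the registered structure ID (step S506).
        return structure_table[key]
    # Step S507: text-encode the concatenated pair. SHA-256 and the
    # unit-separator character are assumptions made for this sketch.
    digest = hashlib.sha256("\x1f".join(key).encode("utf-8")).hexdigest()
    structure_id = "CCL-" + digest[:8]
    # Step S508: register the new record.
    structure_table[key] = structure_id
    return structure_id
```

A truncated digest keeps the identifier short, which is the point of the compression, at the cost of a small collision risk; a real system would need to check for collisions before registering the record.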

Subsequently, the operation of the header configuration encoding unit 303 of the compressed data generation device 300 will be described with reference to FIG. FIG. 10 is a flowchart illustrating an example of a processing flow of the header configuration encoding unit 303 of the compressed data generation device 300.

(Step S701) First, the header configuration encoding unit 303 acquires the structure ID, the text data of the header section, and the parcel data 308 from the data configuration encoding unit 302.

(Step S702) Next, the header configuration encoding unit 303 performs processing on each line of the class header section included in the text data of the header section for header configuration encoding. Details of this processing will be described later with reference to FIG.

(Step S703) Next, when the header configuration encoding unit 303 finishes the processing of the class header section, the header configuration encoding unit 303 processes each row of the schema header section included in the text data of the header section. Details of this processing will be described later with reference to FIG.

(Step S704) Next, after finishing the instruction processing of the class header section and the schema header section, the header configuration encoding unit 303 reads the instructions in order from the head of the instruction column of the header section, and generates header text by concatenating the read instructions with a delimiter. Thus, the header text as an example of the header configuration information is information in which instructions included in the header are arranged in the order of appearance in the header.

(Step S705) Next, the header configuration encoding unit 303 inquires of the data management apparatus 330 whether the record including the header text generated in step S704 is registered in the header table T4 of the remote DB 332 of the data management apparatus 330.

(Step S706) Next, the header configuration encoding unit 303 determines whether or not a record including the header text generated in step S704 is registered in the header table T4 using the inquiry result.

(Step S707) When a record including the header text generated in step S704 is registered in the header table T4 (YES in step S706), the header configuration encoding unit 303 acquires the header ID corresponding to this header text from the header table T4 of the remote DB 332 via the communication unit 304. Thereafter, the process proceeds to step S712.

(Step S708) On the other hand, when a record including the header text generated in step S704 is not registered in the header table T4 (NO in step S706), the header configuration encoding unit 303 text-encodes the header text generated in step S704 to generate a header ID.

(Step S709) Next, the header configuration encoding unit 303 registers the set of the header ID generated in step S708 and the header text generated in step S704 in the header table T4 of the remote DB 332 via the communication unit 304.

In this way, the header configuration encoding unit 303 generates header configuration information (for example, header text) from the data, assigns header configuration identification information (for example, header ID) to the generated header configuration information, and stores the header configuration identification information and the header configuration information in the storage device 114 in association with each other.

(Step S710) Next, the header configuration encoding unit 303 determines whether or not an instruction alias has been created in the processing of the class header section in step S702 or the processing of the schema header section in step S703.

(Step S711) If it is determined in step S710 that an instruction alias has been created (YES in step S710), the header configuration encoding unit 303 registers the set of the header ID generated in step S708, the created alias, and the instruction for which the alias was created in the alias table T2 of the remote DB 332 via the communication unit 304. Thereafter, the process proceeds to step S712. On the other hand, if it is determined in step S710 that an instruction alias has not been created (NO in step S710), the process proceeds to step S712.

(Step S712) When the structure ID and header ID are obtained by the above processing, the header configuration encoding unit 303 passes the structure ID, the header ID, and the parcel data 308 to the compressed data generation unit 305, and ends the processing.
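Steps S704 to S709 can be sketched as follows. This is a simplified illustration, not the embodiment's implementation: a dictionary stands in for the header table T4, the tab delimiter is an assumption, and the header ID is produced by sequential numbering in the style of HDR001, which is only one of the generation options the description allows. The function names are hypothetical.

```python
def build_header_text(instructions, delimiter="\t"):
    """Step S704: concatenate the instructions in order of appearance."""
    return delimiter.join(instructions)


def get_or_create_header_id(header_text, header_table):
    """Steps S705 to S709 against a dict standing in for table T4.

    header_table maps header text to header ID.
    """
    if header_text in header_table:
        # YES in step S706: reuse the registered header ID (step S707).
        return header_table[header_text]
    # Step S708: sequential numbering is an assumption made for this sketch.
    header_id = "HDR{:03d}".format(len(header_table) + 1)
    # Step S709: register the pair.
    header_table[header_text] = header_id
    return header_id
```

Because the header text preserves the order of the instructions, two headers with the same instructions in a different order receive different header IDs.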

Subsequently, details of the processing of the class header section in step S702 of FIG. 10 will be described with reference to FIG. FIG. 11 is a flowchart showing an example of the processing flow of the class header section in step S702 of FIG.

(Step S801) First, the header configuration encoding unit 303 performs the following steps S802 and S900 on each row in order from the first row of the class header section.

(Step S802) Next, the header configuration encoding unit 303 obtains a set of the target instruction and its value (hereinafter referred to as an instruction value) from the instruction column for the processing target row.

(Step S900) The header configuration encoding unit 303 executes an instruction process of FIG. 12 to be described later on the group acquired in Step S802.

(Step S803) When the next line exists in the class header section, the process returns to Step S801. If there is no next line in the class header section, the processing of the class header section is terminated.

Next, details of the instruction processing in step S900 of FIG. 11 will be described with reference to FIG. FIG. 12 is a flowchart showing an example of the flow of instruction processing in step S900 of FIG. This process is performed not only in the process of the class header section but also in the process of the schema header section in FIG. 13 described later.

(Step S901) First, using the combination of the structure ID acquired by the data configuration encoding unit 302 and the instruction as a key, the header configuration encoding unit 303 acquires the instruction value associated with this combination from the cell column table T3 of the remote DB 332 of the data management device 330.

(Step S902) The header configuration encoding unit 303 determines whether the target instruction value is the same as or different from the instruction value acquired in step S901, or whether a record including the target instruction value is not registered in the cell column table T3 of the remote DB 332. When the target instruction value and the instruction value acquired in step S901 are the same (SAME in step S902), the header configuration encoding unit 303 ends the instruction processing.

(Step S903) On the other hand, when the target instruction value is different from the instruction value acquired in step S901 (DIFFERENT in step S902), the header configuration encoding unit 303 generates an alias of the target instruction.

(Step S904) The header configuration encoding unit 303 replaces the target instruction of the header section with an alias. Then, the process proceeds to step S905.

(Step S905) When a record including the target instruction value is not registered in the cell column table T3 of the remote DB 332 (NO RECORD in step S902), the header configuration encoding unit 303 registers the set of the structure ID generated by the data configuration encoding unit 302, the target instruction, and the target instruction value in the cell column table T3 of the remote DB 332 of the data management device 330 via the communication unit 304, and ends the instruction processing.

As described above, the header configuration encoding unit 303 reads out a combination of an instruction and an instruction value from the data, and stores the data configuration identification information, the instruction, and the instruction value in the storage device 114 in association with each other.

On the other hand, when the process proceeds from step S904 to step S905, the header configuration encoding unit 303 registers the set of the structure ID generated by the data configuration encoding unit 302, the alias substituted in step S904, and the target instruction value in the cell column table T3 of the remote DB 332 of the data management device 330 via the communication unit 304, and ends the instruction processing.

As described above, the header configuration encoding unit 303 acquires a combination of an instruction and an instruction value included in the header from the parcel data 308 (step S802), and compares the acquired instruction value with the instruction value that is associated, in the storage device 114, with the data configuration identification information and the instruction (step S902). If the values are different as a result of the comparison, the header configuration encoding unit 303 generates an alias for the instruction (step S903).

Then, the header configuration encoding unit 303 stores the data configuration identification information, the alias, and the read instruction value in the storage device 114 in association with each other (step S905). The header configuration encoding unit 303 stores, in the storage device 114, the header configuration identification information (for example, header ID) in association with the header configuration information (for example, header text) in which the instruction has been replaced with the alias (step S709). In addition, the header configuration encoding unit 303 stores the header configuration identification information (for example, header ID), the alias, and the instruction for which the alias was generated in association with each other in the storage device 114 (step S711).
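The three-way branch of the instruction processing (steps S901 to S905) can be sketched as follows. Dictionaries stand in for the cell column table T3 and for the aliases collected for later registration in the alias table T2. The alias naming scheme (the instruction followed by a three-digit suffix) and the rule of taking the first unused suffix are assumptions made for this sketch, as is the function name.

```python
def process_instruction(structure_id, instruction, value, cell_columns, aliases):
    """Return the name to be written into the header text for one row.

    cell_columns: dict mapping (structure_id, instruction or alias) to the
                  registered instruction value (stand-in for table T3).
    aliases:      dict mapping alias to original instruction, collected for
                  later registration in the alias table T2 (step S711).
    """
    key = (structure_id, instruction)
    if key not in cell_columns:
        # NO RECORD in step S902: register the value as-is (step S905).
        cell_columns[key] = value
        return instruction
    if cell_columns[key] == value:
        # SAME in step S902: nothing to register.
        return instruction
    # DIFFERENT in step S902: generate an alias (step S903). Choosing the
    # first unused numeric suffix is an assumption of this sketch.
    n = 1
    while (structure_id, "{}-{:03d}".format(instruction, n)) in cell_columns:
        n += 1
    alias = "{}-{:03d}".format(instruction, n)
    # Steps S904 and S905: use the alias in place of the instruction and
    # register the value under the alias.
    cell_columns[(structure_id, alias)] = value
    aliases[alias] = instruction
    return alias
```

The alias lets two parcel files that share a structure ID carry different values for the same instruction without overwriting each other's record.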

Next, details of the processing of the schema header section in step S703 in FIG. 10 will be described with reference to FIG. FIG. 13 is a flowchart showing an example of the processing flow of the schema header section in step S703 of FIG.

(Step S1011) First, the header configuration coding unit 303 performs the following steps S1012 and S900 on each row in order from the first row of the schema header section.

(Step S1012) Next, the header configuration encoding unit 303 acquires the target instruction and the target instruction value for the processing target row.

(Step S900) Next, the header configuration encoding unit 303 performs the processing of the instructions in FIG.

(Step S1013) When the instruction processing in step S900 is completed and the schema header section has the next line, the header configuration encoding unit 303 performs the processes in steps S1012 and S900 on the next line. If there is no next line in the schema header section, the header configuration encoding unit 303 ends the processing of the schema header section.

(Example of header ID generated by the process of the header configuration encoding unit 303)
FIG. 14 is a diagram illustrating a case where the processing of the header configuration encoding unit 303 is applied to the parcel data of FIG. The header configuration encoding unit 303 concatenates the instructions acquired in order from the first row of the instruction string using a delimiter, and generates a header text. The header configuration encoding unit 303 text-encodes the header text to generate HDR001 that is the header ID of the header text.

Note that the header ID may be generated by the header configuration encoding unit 303 using a text encoding function such as a hash function with the header text as an input.

Alternatively, a character string specified by the user may be used as the header ID without using such a function. As described above, the header configuration encoding unit 303 may assign the character string received from the user by the input device 109 to the header configuration information (for example, header text).

Alternatively, the header configuration encoding unit 303 may manage sequential alphanumeric characters and automatically generate alphanumeric characters as header IDs.

<Example of Instruction Alias Generated by Processing of Header Configuration Encoding Unit 303>
FIG. 15 is a diagram illustrating an example in which processing of another parcel data is performed in a state where the header information of the parcel data in FIG. 2 is registered in the remote DB 332 of the data management device 330. The upper table of FIG. 15 shows parcel data, and the text data below the parcel data represents the parcel data in a text format.

Compared with the parcel data in FIG. 2, the class IDs specified by #CLASS_ID are equal and the cell column configuration of the row of #PROPERTY_ID is equal. Therefore, no new structure ID is generated by the processing of the data configuration encoding unit 302, and the structure ID CCL001 already registered in the remote DB 332 of the data management apparatus 330 is applied as the structure ID.

On the other hand, the parcel data in FIG. 15 does not have instruction #DATABASE: =, the information described in the cell column of instruction #MEMO is different, and the order of instructions in the instruction row is different. Accordingly, header text is generated by the header configuration encoding unit 303.

At that time, an alias #MEMO-001 for instruction #MEMO is created by the header configuration encoding unit 303. Since the instruction is replaced by this alias, #MEMO-001 appears instead of #MEMO in the header text generated by the header configuration encoding unit 303. By the processing of the header configuration encoding unit 303, HDR002 is finally assigned as the header ID corresponding to the header text of the header section of this parcel data, and is registered in the remote DB 332.

<Flowchart of Processing of Compressed Data Generation Unit 305>
Next, processing of the compressed data generation unit 305 will be described with reference to FIG. FIG. 16 is a flowchart illustrating an example of a process flow of the compressed data generation unit 305 of the compressed data generation apparatus 300.

(Step S1301) First, the compressed data generation unit 305 acquires the structure ID, header ID, and parcel data 308 from the header configuration encoding unit 303.

(Step S1302) Next, the compressed data generation unit 305 creates an empty header section.

(Step S1303) Next, the compressed data generation unit 305 outputs information that the value of the instruction #CLASS_ID is the structure ID acquired in Step S1301 to the header section created in Step S1302. For example, when the structure ID is CCL001 as shown in FIG. 17, the compressed data generation unit 305 adds #CLASS_ID: = CCL001 to the header section created in step S1302.

(Step S1304) Next, the compressed data generation unit 305 adds information that the value of the instruction #HEADER is the header ID acquired in Step S1301 to the header section created in Step S1302. For example, when the header ID is HDR001 as shown in FIG. 17, the compressed data generation unit 305 adds #HEADER: = HDR001 to the header section created in step S1302.

(Step S1305) Next, the compressed data generation unit 305 combines the data section of the parcel data 308 with the header section generated in this way.

<Example of compressed parcel data generated by the compressed data generation unit 305>
FIG. 17 is a diagram illustrating an example of compressed parcel data output by a series of processes of the compressed data generation apparatus 300 according to the present embodiment using the parcel data of FIG. 2 as an input. The header section of FIG. 17 includes the structure ID generated by the data configuration encoding unit 302 as the value of instruction #CLASS_ID, and includes the header ID generated by the header configuration encoding unit 303 as the value of instruction #HEADER. It can thus be seen that compressed parcel data is generated in which the data capacity of the header section is reduced as compared with the parcel data of FIG. 2.

In the example of FIG. 17, the predetermined instruction describing the header ID is #HEADER, but other instructions may be set and used as long as the system can interpret them.

(Step S1306) Next, the compressed data generation unit 305 outputs the data generated in step S1305 as the compressed parcel data 309, and ends the processing.
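Steps S1302 to S1306 amount to writing a two-line header and appending the unchanged data section. The following is a minimal sketch under stated assumptions: the data section is passed in as a list of text lines, lines are joined with a newline, and the separator between instruction and value is written as ":=" without spaces. The function name and the sample data rows are hypothetical.

```python
def generate_compressed_parcel(structure_id, header_id, data_section_lines):
    """Build compressed parcel text from the two IDs and the data section."""
    header = []                                   # step S1302: empty header
    header.append("#CLASS_ID:=" + structure_id)   # step S1303
    header.append("#HEADER:=" + header_id)        # step S1304
    # Step S1305: combine the new header with the original data section,
    # then return it as the compressed parcel data (step S1306).
    return "\n".join(header + list(data_section_lines))


# Example with made-up data rows.
compressed = generate_compressed_parcel(
    "CCL001", "HDR001", ["valve-1\t25", "valve-2\t40"]
)
```

The data section passes through untouched, so the size reduction comes entirely from replacing the original multi-line header with these two lines.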

As described above, in the compressed data generation apparatus 300 according to the present embodiment, the data configuration encoding unit 302 uses the header included in the text format data to assign data configuration identification information (for example, structure ID) to a combination of class identification information, which identifies the class into which the article or service described by the data is classified, and attribute definition information, which defines the combination of properties (attributes) that characterize the class and their order. The data configuration identification information identifies that combination.

Then, the header configuration encoding unit 303 uses the data to assign header configuration identification information (for example, header ID) to header configuration information (for example, header text) that defines a combination of instructions, which are the headings described in the individual rows of the header, and their order. The header configuration identification information identifies that header configuration information. Then, the compressed data generation unit 305 generates compressed data including the data configuration identification information (for example, structure ID), the header configuration identification information (for example, header ID), and the instance included in the data.

Thereby, the capacity of the compressed data can be reduced compared to the original data because the capacity of the header is reduced by replacing the header with data including the data structure identification information and the header structure identification information.

In this embodiment, as an example, it has been described that a reversible compressed header with reduced data capacity is generated based on the configuration of the parcel data header expressed in text format, and that the original parcel data header is replaced with the compressed header. Thereby, compressed parcel data with a reduced data capacity can be generated. Further, compressed parcel data with a reduced header capacity can be created in the text format, and compressed parcel data can be stored even in an environment where the data capacity that can be stored is limited.

In addition, since the compressed parcel data created in this embodiment is expressed in a format compliant with the parcel standard, applications that handle parcel data can read and write the compressed parcel data without any special functions being added to them.

Further, according to the present embodiment, these compressed parcel data can be efficiently generated for parcel data that is a sheet for describing data of the same class and has the same property order.

In this embodiment, the compressed data generation apparatus 300 stores the remote DB 332 in the storage device 114 included in the data management apparatus 330, but the present invention is not limited to this. The compressed data generation device 300 may store the remote DB 332 in the storage device 104 within the device itself. Further, the compressed data generation apparatus 300 and the data management apparatus 330 may be configured as an integrated information processing apparatus.

Subsequently, the configuration of the data restoration apparatus 360 according to the present embodiment will be described with reference to FIG. FIG. 18 is a diagram showing the configuration of the data restoration device 360 according to the present embodiment. As illustrated in FIG. 18, the data restoration device 360 includes a CPU (Central Processing Unit) 121, a ROM 122, a RAM 123, a storage device 124, a medium reading device 126, a bus controller 127, a display device 128, an input device 129, and a communication unit 364. As shown in FIG. 18, each component of the data restoration device 360 is connected via the bus controller 127 and can exchange data with the others.

The CPU 121 controls the entire data restoration device 360.

The ROM 122 stores various data and various programs that the CPU 121 reads and executes.

The RAM 123 is a primary storage device that temporarily stores various programs read by the CPU 121.

The storage device 124 stores various data and various programs that the CPU 121 reads and executes. The storage device 124 is, for example, a hard disk drive (Hard Disk Drive: HDD).

The medium reading device 126 is a drive device for reading data recorded on a computer-readable storage medium (for example, a CD (Compact Disk)). The programs for executing each process of the CPU 121 according to the present embodiment may be recorded on a computer-readable recording medium.

The display device 128 displays information according to control by the CPU 121.

The input device 129 receives an instruction input or operation by the user. The input device 129 is, for example, a keyboard or a mouse.

The communication unit 364 communicates with the data management device 330 via the network 150. This communication may be wired or wireless.

Subsequently, a functional configuration of the data restoration apparatus 360 according to the present embodiment will be described with reference to FIG. FIG. 19 is a functional block diagram of the data restoration device 360 according to the present embodiment. The CPU 121 reads out the program from the ROM 122 or the storage device 124 to the RAM 123 and executes the program, or executes the program read from the computer-readable storage medium into the RAM 123 by the medium reading device 126. By executing this program, an acquisition unit 361, a determination unit 362, a header information acquisition unit 363, a data registration unit 365, a restoration unit 367, a parcel data processing unit 368, and a prior acquisition unit 369 are generated on the RAM 123. The storage device 124 stores a local DB 366.

The parcel data restoration process in the data restoration device 360 is started when the user, while viewing the screen displayed on the display device 128, performs a parcel data acquisition operation instructing the input device 129 to read the parcel data 372.

The acquisition unit 361 acquires data 372 when the input device 129 receives a parcel data acquisition operation. Here, the data 372 is either the parcel data 308 or the compressed parcel data 309 in which the header section is compressed. The acquisition unit 361 loads the parcel data 372 into the memory and passes it to the determination unit 362.

The determination unit 362 determines whether or not the header of the data 372 is compressed based on the header included in the text format data 372. Specifically, for example, the determination unit 362 analyzes the header information of the data 372 and passes the data 372 to the header information acquisition unit 363 when the header needs to be restored. On the other hand, when restoration of the header is unnecessary or impossible, the data 372 is passed to the parcel data processing unit 368. Details of the operation of the determination unit 362 will be described later.

Here, as described above, the storage device 114 of the data management device 330 stores, in association with each other, the data configuration identification information, the class identification information that identifies the class into which the goods or services described by the original data are classified, and the attribute defining information that defines the combination of properties (attributes) characterizing the class and their order. Further, the storage device 114 stores header configuration identification information and header configuration information that defines the configuration of the original header in association with each other.

When the determination unit 362 determines that the header of the data 372 is compressed, the header information acquisition unit 363 extracts the data configuration identification information (for example, structure ID) and the header configuration identification information (for example, header ID) from the header of the data 372, and acquires, from the storage device 124, header information including information associated with either the extracted header configuration identification information or data configuration identification information.

Specifically, for example, the header information acquisition unit 363 acquires the structure ID from the value of the instruction #CLASS_ID of the header section of the parcel data 372 received from the determination unit 362, and further extracts the header ID specified by the instruction #HEADER_ID. Then, the header information acquisition unit 363 acquires the header information associated with these from the local DB 366, or from the data management device 330 via the communication unit 364. Here, when the header information is acquired from the data management device 330, the header information is passed to the data registration unit 365 for storage in the local DB 366. Thereafter, the header information and the parcel data 372 are sent to the restoration unit 367. Details of the operation of the header information acquisition unit 363 will be described later.

The communication unit 364 communicates with the data management device 330 having the storage device 114. Specifically, the communication unit 364 transmits an inquiry from the header information acquisition unit 363 to the data management device 330 via the network 150. Then, the communication unit 364 receives the header information transmitted from the data management device 330 in response to this inquiry via the network 150 and passes the received header information to the header information acquisition unit 363.

The data registration unit 365 stores the header information acquired by the header information acquisition unit 363 in the storage device 124. Specifically, the data registration unit 365 stores the header information passed from the header information acquisition unit 363 in the local DB 366 in the storage device 124. In addition, the data registration unit 365 stores the header information acquired by the advance acquisition unit 369 in the local DB 366.

The storage device (second storage device) 124 stores data including the local DB 366. The local DB 366 stores the header information acquired by the header information acquisition unit 363 or the advance acquisition unit 369. The configuration of information stored in the local DB 366 is the same as that of the remote DB 332 of the data management device 330, as shown in FIGS. 7A to 7D. That is, the local DB 366 stores part or all of the information of the remote DB 332 of the data management device 330. By storing the header information in the data restoration device 360 in this way, it is possible to reduce the amount of communication when restoring the original parcel data from the compressed parcel data, and to improve the processing speed.
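The local-first lookup described above can be sketched as a small cache function. This is an illustration only: a dictionary stands in for the local DB 366, the callable fetch_remote stands in for an inquiry to the data management device 330 through the communication unit 364, and both names, as well as the shape of the returned header information, are hypothetical.

```python
def get_header_info(structure_id, header_id, local_db, fetch_remote):
    """Return header information, consulting the local DB before the remote.

    local_db:     dict keyed by (structure_id, header_id), standing in for
                  the local DB 366.
    fetch_remote: callable taking (structure_id, header_id) and returning
                  the header information held by the remote side.
    """
    key = (structure_id, header_id)
    if key in local_db:
        # Already held locally: no communication is needed.
        return local_db[key]
    info = fetch_remote(structure_id, header_id)
    # Corresponds to the data registration unit 365 storing the result.
    local_db[key] = info
    return info
```

Only the first request for a given pair of IDs reaches the remote side; later requests are served locally, which is where the reduction in communication comes from.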

The restoration unit 367 restores original data (for example, original parcel data) using the header information acquired by the header information acquisition unit 363. Alternatively, the restoration unit 367 restores the original data using the header information stored in the storage device 124.

More specifically, the restoration unit 367 extracts, from the header configuration information (for example, header text) associated with the header configuration identification information (for example, header ID) included in the header information, a list of the instructions that are the headings described in each line of the header. Then, for each extracted instruction, the restoration unit 367 acquires the instruction value associated with the instruction and the data configuration identification information (for example, structure ID), and generates, as the original data, data including the combinations of instruction and instruction value and the instances included in the compressed data.

Specifically, for example, the restoration unit 367 uses the header information passed from the header information acquisition unit 363 to restore the header section of the data 372, also passed from the header information acquisition unit 363, to the header section of the original parcel data, thereby restoring the original parcel data. The restoration unit 367 passes the restored original parcel data to the parcel data processing unit 368. Details of the operation of the restoration unit 367 will be described later.

The parcel data processing unit 368 performs various processes on the data 372 passed from the determination unit 362 or the original parcel data passed from the restoration unit 367.

The process in the pre-acquisition unit 369 is started when the user performs a pre-acquisition operation of header information via the input device 129 while viewing the screen displayed on the display device 128.

The pre-acquisition unit 369 acquires, in advance, header information for restoring the header in the data 372 to the header before compression from the storage device 114 via the communication unit 364. Specifically, for example, the pre-acquisition unit 369 acquires the necessary header information from the data management device 330 in advance and registers it in the local DB 366. This is effective when the compressed parcel data to be processed by the data restoration device 360 is known in advance. By registering the header information in advance, the data 372 can be restored to the original parcel data even in an offline environment.

Although the processing in the pre-acquisition unit 369 has been described as starting in response to a pre-acquisition operation of header information, the present embodiment is not limited to this, and the pre-acquisition unit 369 may perform the processing automatically. For example, based on information from a sensor arranged at the entrance of a building, the pre-acquisition unit 369 may acquire a list of the header IDs used by the compressed parcel data assigned to the equipment in the building, and automatically acquire the corresponding header information.
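As a non-limiting illustration, the pre-acquisition described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the embodiment's implementation: the local DB 366 is modeled as an in-memory dictionary, the lookup key is assumed to be a (structure ID, header ID) pair, and the names `prefetch_header_info`, `lookup_offline`, and `fetch_remote` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

# Assumption: header information is keyed by a (structure ID, header ID) pair.
Key = Tuple[str, str]
HeaderInfo = Dict[str, object]


def prefetch_header_info(
    keys: Iterable[Key],
    local_db: Dict[Key, HeaderInfo],
    fetch_remote: Callable[[Key], HeaderInfo],
) -> int:
    """Copy header information for the given keys into the local DB.

    fetch_remote is a hypothetical stand-in for a request sent to the
    data management device. Returns the number of records fetched.
    """
    fetched = 0
    for key in keys:
        if key not in local_db:  # skip records that are already registered
            local_db[key] = fetch_remote(key)
            fetched += 1
    return fetched


def lookup_offline(key: Key, local_db: Dict[Key, HeaderInfo]) -> HeaderInfo:
    """Return pre-acquired header information without any communication."""
    try:
        return local_db[key]
    except KeyError:
        raise LookupError(f"header information for {key} was not pre-acquired") from None
```

Once `prefetch_header_info` has run while a connection is available, `lookup_offline` can serve the same keys with no network access, which corresponds to restoration in an offline environment.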

Subsequently, details of each process of the data restoration apparatus 360 having the above-described configuration will be described using flowcharts with reference to FIGS.

<Flowchart of processing of determination unit 362>
FIG. 20 is a flowchart illustrating an example of a process flow of the determination unit 362 of the data restoration device 360.

(Step S1501) First, the determination unit 362 acquires data 372 from the acquisition unit 361.

(Step S1502) Next, the determination unit 362 analyzes the header section of the data 372.

(Step S1503) Next, the determination unit 362 determines whether the header section of the data 372 is compressed. Specifically, for example, the determination unit 362 inquires of the data management device 330 about the value of the instruction #CLASS_ID of the data 372, and if this value is registered as header information in the local DB 366 or in the remote DB 332 of the data management device 330, the determination unit 362 determines that the header section of the data 372 is compressed. Otherwise, the determination unit 362 determines that the header section of the data 372 is not compressed.

(Step S1504) When it is determined in Step S1503 that the header section of the data 372 is compressed (YES in Step S1503), the determination unit 362 passes the data 372 to the header information acquisition unit 363 and ends the processing.

(Step S1505) On the other hand, when it is determined in Step S1503 that the header section of the data 372 is not compressed (NO in Step S1503), the restoration process is unnecessary, so the determination unit 362 passes the data 372 to the parcel data processing unit 368 and ends the processing.
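As a non-limiting illustration, Steps S1501 to S1505 can be sketched as follows. The sketch assumes that class header lines take the form `#INSTRUCTION:=value` and that the set of registered structure IDs is already available as a Python set; both are assumptions for illustration, and the function names are hypothetical.

```python
from typing import Dict, Set


def parse_class_header(text: str) -> Dict[str, str]:
    """Collect instruction/value pairs from lines such as '#CLASS_ID:=S001'."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        if line.startswith("#") and ":=" in line:
            name, value = line.split(":=", 1)
            values[name.strip()] = value.strip()
    return values


def is_header_compressed(text: str, registered_structure_ids: Set[str]) -> bool:
    """S1502-S1503: the header counts as compressed when the #CLASS_ID value
    is registered as a structure ID (assumed lookup in an in-memory set)."""
    class_id = parse_class_header(text).get("#CLASS_ID")
    return class_id is not None and class_id in registered_structure_ids


def route(text: str, registered_structure_ids: Set[str]) -> str:
    """S1504-S1505: name the unit that receives the data next."""
    if is_header_compressed(text, registered_structure_ids):
        return "header_information_acquisition_unit"
    return "parcel_data_processing_unit"
```

In this sketch, data whose `#CLASS_ID` value is not a registered structure ID is treated as uncompressed parcel data and bypasses restoration entirely.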

<Flowchart of Processing of Header Information Acquisition Unit 363>
Next, processing of the header information acquisition unit 363 will be described with reference to FIG. 21, which is a flowchart illustrating an example of the processing flow of the header information acquisition unit 363 of the data restoration device 360.

(Step S1601) First, the header information acquisition unit 363 acquires data 372 from the determination unit 362.

(Step S1602) Next, the header information acquisition unit 363 extracts the structure ID associated with the instruction #CLASS_ID and the header ID associated with the instruction #HEADER from the header section of the acquired data 372.

(Step S1603) Next, the header information acquisition unit 363 queries the local DB 366, using the combination of the structure ID and the header ID extracted in Step S1602 as a key, as to whether a record including that structure ID and header ID exists.

(Step S1604) Next, the header information acquisition unit 363 determines whether or not there is a corresponding record in the local DB 366 using the result inquired in Step S1603.

(Step S1605) When it is determined in Step S1604 that there is a corresponding record in the local DB 366 (YES in Step S1604), the header information acquisition unit 363 acquires the header information from the local DB 366.

(Step S1606) On the other hand, if it is determined in Step S1604 that there is no corresponding record in the local DB 366 (NO in Step S1604), the header information acquisition unit 363 acquires the header information from the remote DB 332 of the data management device 330 via the communication unit 364, using the pair of the structure ID and the header ID extracted in Step S1602 as a key.

(Step S1607) Next, the header information acquisition unit 363 registers the header information acquired in Step S1606 in the local DB 366.

(Step S1608) When the header information is acquired by the processing of step S1605 or S1607, the header information acquisition unit 363 passes the acquired header information, data 372, structure ID, and header ID to the restoration unit 367, and ends the processing.
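As a non-limiting illustration, Steps S1601 to S1608 amount to a local lookup with a remote fallback and can be sketched as follows. The line format `#CLASS_ID:=...` / `#HEADER:=...`, the dictionary standing in for the local DB 366, and the callable `fetch_remote` standing in for access to the remote DB 332 are all assumptions made for this sketch.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # (structure ID, header ID)
HeaderInfo = Dict[str, object]


def extract_ids(text: str) -> Key:
    """S1602: read the structure ID (#CLASS_ID) and the header ID (#HEADER)."""
    structure_id: Optional[str] = None
    header_id: Optional[str] = None
    for line in text.splitlines():
        if line.startswith("#CLASS_ID:="):
            structure_id = line.split(":=", 1)[1].strip()
        elif line.startswith("#HEADER:="):
            header_id = line.split(":=", 1)[1].strip()
    if structure_id is None or header_id is None:
        raise ValueError("compressed header must carry #CLASS_ID and #HEADER")
    return structure_id, header_id


def acquire_header_info(
    text: str,
    local_db: Dict[Key, HeaderInfo],
    fetch_remote: Callable[[Key], HeaderInfo],
) -> Tuple[HeaderInfo, Key]:
    key = extract_ids(text)            # S1602
    info = local_db.get(key)           # S1603-S1605: try the local DB first
    if info is None:
        info = fetch_remote(key)       # S1606: fall back to the remote DB
        local_db[key] = info           # S1607: register for later reuse
    return info, key                   # S1608: hand over to the restoration step
```

Because the fetched record is registered locally in S1607, a second call with the same structure ID and header ID does not contact the remote side again.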

<Flowchart of processing of restoration unit 367>
Next, processing of the restoration unit 367 will be described with reference to FIG. 22, which is a flowchart illustrating an example of the processing flow of the restoration unit 367 of the data restoration device 360.

(Step S1701) First, the restoration unit 367 acquires header information, data 372, a structure ID, and a header ID from the header information acquisition unit 363.

(Step S1702) Next, the restoration unit 367 generates an empty header section for restoring the header section before compression.

(Step S1703) Next, the restoration unit 367 acquires the header text associated with the header ID acquired in Step S1701 from the header table included in the header information acquired in Step S1701. Then, the restoration unit 367 acquires a list of instructions by extracting each instruction delimited by a delimiter from the header text. Hereinafter, restoration processing is performed in order from the first instruction in the instruction list thus obtained.

(Step S1704) First, if the instruction to be processed is an instruction in the class header section, that is, if the instruction ends with “:=”, the following processing from Step S1705 to Step S1707 is executed.

(Step S1705) The restoration unit 367 acquires the instruction value associated with the pair of the structure ID and the instruction to be processed from the cell column table included in the header information.

(Step S1706) When a record including the structure ID acquired in Step S1701 and the instruction to be processed is present in the alias table included in the header information, that is, when the instruction to be processed is an alias, the restoration unit 367 acquires, as the original instruction, the original value associated in the alias table with the pair of the structure ID and the instruction to be processed.

(Step S1707) The restoration unit 367 generates text data of the class header section using the instructions and the instruction values obtained through the processing of Step S1705 and Step S1706, and adds the text data to the header section.

In the header information, the data configuration identification information (for example, structure ID), the original instruction, and an alias of the original instruction are associated with one another. When the header information includes an original instruction associated with the combination of the data configuration identification information (for example, the structure ID) and the instruction to be processed, the restoration unit 367 includes, in the original data, the pair of that original instruction and the instruction value associated with the instruction to be processed in the header information.

(Step S1708) When the next instruction in the instruction list is an instruction in the class header section, the processing from step S1705 to step S1707 is performed on the next instruction as a processing target. When exiting the loop from step S1704 to step S1708, the instruction to be processed next in the instruction list is an instruction in the schema header section.

(Step S1709) If the instruction to be processed in the instruction list is an instruction in the schema header section, the following processing from Step S1710 to Step S1712 is executed.

(Step S1710) The restoration unit 367 acquires the instruction value associated with the combination of the structure ID acquired in Step S1701 and the instruction to be processed from the cell column table included in the header information.

(Step S1711) When a record including the structure ID acquired in Step S1701 and the instruction to be processed is present in the alias table included in the header information, that is, when the instruction to be processed is an alias, the restoration unit 367 restores the instruction by acquiring, as the instruction, the original value associated in the alias table with the pair of the structure ID and the instruction to be processed.

(Step S1712) The restoration unit 367 generates text data of the schema header section from the instructions and instruction values obtained through the processing of steps S1710 and S1711, and adds the text data to the header section.

(Step S1713) If there is a next instruction in the instruction list, the processing from step S1710 to step S1712 is performed on the next instruction as a processing target.

(Step S1714) On exiting the loop from Step S1709 to Step S1713, the restoration unit 367 combines the data section of the data 372 with the header section obtained by the above processing, thereby restoring the parcel data as it was before the header section was compressed.

(Step S1715) Next, the restoration unit 367 passes the restored parcel data to the parcel data processing unit 368 and ends the processing.
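As a non-limiting illustration, Steps S1701 to S1715 can be sketched as follows. The sketch rests on several assumptions that are not taken from the embodiment: the header information is modeled as three dictionaries (`header_table`, `cell_column_table`, `alias_table`), the header text uses `|` as its delimiter, a class header line is written as the instruction immediately followed by its value, a schema header line joins the instruction and its value with a comma, and every line of the compressed data that starts with `#` belongs to the header section. The two loops of the flowchart are folded into one loop with a branch.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (structure ID, instruction or alias)


def restore_header(
    header_info: Dict[str, dict],
    structure_id: str,
    header_id: str,
    delimiter: str = "|",   # assumed delimiter of the header text
) -> List[str]:
    header_text: str = header_info["header_table"][header_id]    # S1703
    cells: Dict[Pair, str] = header_info["cell_column_table"]
    aliases: Dict[Pair, str] = header_info["alias_table"]
    lines: List[str] = []                                         # S1702
    for instruction in header_text.split(delimiter):
        value = cells[(structure_id, instruction)]                # S1705 / S1710
        # S1706 / S1711: replace an alias by its original instruction, if any.
        original = aliases.get((structure_id, instruction), instruction)
        if instruction.endswith(":="):                            # S1704: class header
            lines.append(original + value)                        # S1707
        else:                                                     # schema header
            lines.append(original + "," + value)                  # S1712
    return lines


def restore_parcel(
    compressed_text: str,
    header_info: Dict[str, dict],
    structure_id: str,
    header_id: str,
) -> str:
    # S1714: keep the data section and prepend the restored header section.
    data_lines = [l for l in compressed_text.splitlines() if not l.startswith("#")]
    header_lines = restore_header(header_info, structure_id, header_id)
    return "\n".join(header_lines + data_lines) + "\n"
```

Because the instruction order comes from the header text stored under the header ID, the restored header reproduces the original line order even when that order differs from one data set to another.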

As described above, in the data restoration device 360 according to the present embodiment, the determination unit 362 determines whether the header of the data 372 is compressed based on the header included in the text format data 372. When the determination unit 362 determines that the header of the data 372 is compressed, the header information acquisition unit 363 extracts the data configuration identification information (for example, structure ID) and the header configuration identification information (for example, header ID) from the header of the data 372, and acquires, from the storage device 114, header information including information associated with either the extracted header configuration identification information or the extracted data configuration identification information. The restoration unit 367 restores the original data using the header information acquired by the header information acquisition unit 363.

Thus, the data restoration device 360 according to the present embodiment can restore the original data by restoring the header included in the data to the header before compression. Further, according to the present embodiment, the original parcel data can be completely restored even when the display order of the instructions is individually different.

As described above, the present invention is not limited to the above-described embodiment as it is, and can be embodied by modifying the constituent elements without departing from the scope of the invention in the implementation stage. In addition, various inventions can be formed by appropriately combining a plurality of components disclosed in the embodiment. For example, some components may be deleted from all the components shown in the embodiment. Furthermore, constituent elements over different embodiments may be appropriately combined.

1 Information processing system
150 Network
300 Compressed data generation device (information processing device)
330 Data management device
360 Data restoration device (information processing device)
101, 111, 121 CPU (Central Processing Unit)
102, 112, 122 ROM
103, 113, 122 RAM
104, 114, 124 Storage device
106, 116, 126 Medium reader
107, 117, 127 Bus controller
108, 118, 128 Display device
109, 119, 129 Input device
304, 333, 364 Communication unit
301 Header acquisition unit
302 Data configuration encoding unit
303 Header configuration encoding unit
305 Compressed data generation unit
331 Data management unit
361 Acquisition unit
362 Determination unit
363 Header information acquisition unit
365 Data registration unit
367 Restoration unit
368 Parcel data processing unit
369 Pre-acquisition unit

Claims

Based on the header included in the text data, the class identification information for identifying the class into which the object described by the data is classified, and the attribute defining information for defining the combination and order of the attributes characterizing the class A data configuration encoding unit that assigns data configuration identification information for identifying the set to the set;
A header configuration in which header configuration identification information for identifying the header configuration information is assigned to header configuration information that defines a combination of instructions that are headings described in individual rows of the header and their order based on the data An encoding unit;
A compressed data generation unit that generates compressed data including the data configuration identification information, the header configuration identification information, and an instance included in the data;
An information processing apparatus comprising:
The data configuration encoding unit extracts the class identification information from the data, generates the attribute definition information from the data, and the data configuration identification information for a set of the class identification information and the attribute definition information And storing the data configuration identification information, the class identification information, and the attribute definition information in a storage device in association with each other,
The header configuration encoding unit generates the header configuration information from the data, assigns the header configuration identification information to the generated header configuration information, stores the header configuration identification information and the header configuration information in the storage device in association with each other, reads pairs of an instruction and an instruction value from the data, and stores the data configuration identification information, the instruction, and the instruction value in the storage device in association with one another. Information processing device.
The header configuration encoding unit acquires a set of an instruction included in the header and a value of the instruction from the data, and a data configuration for identifying the acquired instruction value and the configuration of the data in the storage device Compare the identification information and the value of the instruction associated with the instruction, and if the comparison results in different values, generate an alias for the instruction,
Associating the data configuration identification information, the alias, and the read instruction value in the storage device,
The header configuration identification information and the header configuration information in which the instruction included in the header configuration information is replaced with the alias are associated with each other and stored in the storage device,
The information processing apparatus according to claim 1, wherein the instruction configuration identification information, the alias, and the instruction are associated with each other and stored in the storage device.
A communication unit that communicates with a data management device having the storage device;
The data configuration encoding unit transmits data to be stored in the storage device from the communication unit to the data management device, and stores the data to be stored in the data management device.
The header configuration encoding unit causes the data to be stored in the storage device to be transmitted from the communication unit to the data management device, and the data to be stored is stored in the data management device. Information processing device.
The data configuration encoding unit generates the data configuration identification information by performing text encoding on the set of the class identification information and the attribute definition information. The information processing apparatus described.
The information processing apparatus according to claim 1, wherein the header configuration encoding unit generates the header configuration identification information by performing text encoding on the header configuration information.
The information processing apparatus according to claim 5, wherein the text encoding is calculation of a hash value using a hash function.
An input device for receiving user input;
The data configuration encoding unit assigns a character string received from a user by the input device to the data configuration information,
The information processing apparatus according to any one of claims 1 to 4, wherein the header configuration encoding unit allocates a character string received from a user by the input device to the header configuration information.
The attribute definition information is information in which the values associated, in the header, with the attribute identification information identifying the attributes are arranged in the order of appearance in the header. Information processing device.
The information processing apparatus according to any one of claims 1 to 9, wherein the header configuration information is information in which instructions included in the header are arranged in the order of appearance in the header.
A determination unit for determining whether or not the header of the data is compressed based on a header included in the data in the text format;
When it is determined by the determination unit that the header of the data is compressed, the data configuration identification information and the header configuration identification information are extracted from the header of the data, and the extracted header configuration identification information and the data configuration identification information A header information acquisition unit for acquiring header information including information associated with any one from a storage device;
Based on the header information acquired by the header information acquisition unit, a restoration unit that restores the original data;
With
In the storage device, the data configuration identification information, class identification information for identifying a class in which a target described by the original data is classified, and attribute definition information for defining a combination and order of attributes characterizing the class, Is stored in association with each other, and the header configuration identification information and the header configuration information that defines the configuration of the original header are stored in association with each other.
The restoration unit extracts a list of instructions that are headings described in individual rows of the header from the header configuration information associated with the header configuration identification information included in the header information, acquires, for each extracted instruction, an instruction value associated with the instruction and the data configuration identification information, and generates, as the original data, data including the pairs of the instruction and the instruction value and an instance included in the compressed data. The information processing apparatus described.
The information processing apparatus according to claim 12, wherein, in the header information, the data configuration identification information, the original instruction, and an alias of the original instruction are associated with one another, and
when the original instruction associated with the set of the data configuration identification information and the instruction to be processed is included in the header information, the restoration unit includes, in the original data, a pair of the original instruction and the instruction value associated with the instruction to be processed in the header information.
A second storage device for storing data;
A data registration unit for storing the header information acquired by the header information acquisition unit in the storage device;
The information processing apparatus according to any one of claims 11 to 13, further comprising:
The information processing apparatus according to claim 14, wherein the restoration unit restores the original data based on header information stored in the second storage device.
A communication unit that communicates with a data management device having the storage device;
A pre-acquisition unit for acquiring in advance header information for restoring the header in the data into a header before compression from the storage device via the communication unit;
A data registration unit for storing the header information acquired by the pre-acquisition unit in the second storage device;
The information processing apparatus according to claim 14 or 15, further comprising:
Based on the header included in the data in the text format, the data structure encoding unit determines the class identification information for identifying the class into which the object to be described by the data is classified, the combination of attributes characterizing the class, and the order thereof. Assigning data configuration identification information for identifying the set to the set with the attribute specifying information to define;
A header for identifying the header configuration information with respect to the header configuration information that defines a combination of instructions and their order as headings described in individual rows of the header, based on the data, by the header configuration encoding unit Assigning configuration identification information;
A compressed data generation unit generating compressed data including the data configuration identification information, the header configuration identification information, and an instance included in the data;
An information processing method comprising:
Based on the header included in the text data, the class identification information for identifying the class into which the object described by the data is classified, and the attribute defining information for defining the combination and order of the attributes characterizing the class A data configuration encoding unit that assigns data configuration identification information for identifying the set to the set,
A header configuration in which header configuration identification information for identifying the header configuration information is assigned to header configuration information that defines a combination of instructions that are headings described in individual rows of the header and their order based on the data Encoding unit,
A compressed data generation unit that generates compressed data including the data configuration identification information, the header configuration identification information, and an instance included in the data;
Program to function as.
A step of determining whether the header of the data is compressed based on a header included in the text format data;
When the header information acquisition unit determines that the header of the data is compressed by the determination unit, the data configuration identification information and the header configuration identification information are extracted from the header of the data, and the extracted header configuration identification information And obtaining from the storage device header information including information associated with any of the data configuration identification information;
A restoring unit restoring the original data based on the header information acquired by the header information acquiring unit;
Have
In the storage device, the data configuration identification information, class identification information for identifying a class in which a target described by the original data is classified, and attribute definition information for defining a combination and order of attributes characterizing the class, Is stored in association with each other, and the header configuration identification information and the header configuration information that defines the configuration of the original header are stored in association with each other.
A determination unit that determines whether or not the header of the data is compressed based on a header included in the data in the text format;
When it is determined by the determination unit that the header of the data is compressed, the data configuration identification information and the header configuration identification information are extracted from the header of the data, and the extracted header configuration identification information and the data configuration identification information A header information acquisition unit for acquiring header information including information associated with any one from a storage device;
Based on the header information acquired by the header information acquisition unit, a restoration unit that restores the original data,
Is a program for functioning as
In the storage device, the data configuration identification information, class identification information for identifying a class in which a target described by the original data is classified, and attribute definition information for defining a combination and order of attributes characterizing the class, Is stored in association with each other, and the header configuration identification information and the header configuration information that defines the configuration of the original header are stored in association with each other.