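WO2010095457A1 - Analysis preprocessing system, analysis preprocessing method, and analysis preprocessing program - Google Patents
Analysis preprocessing system, analysis preprocessing method, and analysis preprocessing program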
- Publication number
- WO2010095457A1 (PCT/JP2010/001106)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- analysis
- unit
- buffer
- sampling
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/05—Digital input using the sampling of an analogue quantity at regular intervals of time, input from a/d converter or output to d/a converter
Definitions
- the present invention relates to an analysis preprocessing system, an analysis preprocessing method, and an analysis preprocessing program for performing preprocessing on data to be analyzed.
- time series analysis device that analyzes data in a time series for multiple sensors and geographically distributed server logs.
- data to be analyzed is temporarily stored as a database or a file and analyzed by batch processing or the like.
- Non-patent document 1 describes a database for storing such data.
- sensor data observed by a sensor network is stored in a single database on the network.
- the data is referred to by making an inquiry using SQL.
- FIG. 28 shows an example of a general configuration for collecting data to be analyzed by a data analysis device.
- Each Web server 202 serving as a data generation source is accessed by the client 201 to generate data (log).
- Each Web server 202 transmits the log to the log collection unit 203.
- upon receipt of the data, the log collection unit 203 stores the data in a storage unit as a database or a file. The log collection unit 203 then converts the data into a format for data analysis and passes it to the data analysis device 204, which performs the data analysis.
- the generated data is saved as a database or file, and data analysis is performed.
- a configuration in which the apparatus analyzes the data is mentioned. Further, in the configuration in which the data generation source and the data analysis apparatus advance the processing asynchronously while communicating with each other, it is necessary for both parties to determine whether or not there is a request for communication from the other party, resulting in a complicated system. In order to avoid such a complicated operation, a configuration in which generated data is stored as a database or a file is employed.
- license-free libraries that can be used for the process of transmitting data from the data generation source, the process of receiving the data, and the process of temporarily storing the received data.
- an FTP server may be used.
- an ODBC (Open Database Connectivity) driver may be used in the database. Since such a library can be used, a configuration in which generated data is stored as a database or a file is employed.
- Patent Document 1 describes a configuration in which a microcomputer collects data measured by a plurality of sensors such as a vibration sensor and a pulse sensor, and the microcomputer outputs data to a PDA or the like.
- the microcomputer performs processing for removing the disturbance signal on the original data of the biological signal, totaling processing in units of seconds and minutes, and the like, and generates processed data.
- the microcomputer transmits the processed data to the PDA. Patent Document 1 also describes that, when it is determined that the measurement data have not changed and the subject is not yet in a state in which a biological signal should be measured, measurement of the biological signal is put on hold until a predetermined time elapses.
- Patent Document 2 describes a process for suppressing the amount of data per unit time output by a sensor in a sensor network. Specifically, it describes reducing the amount of data transmitted per unit time by increasing the measurement interval of sensor nodes, transmitting observation information in batches, or performing communication between sensor nodes and router nodes.
- Patent Document 3 describes that, when data that has already been received is received again in a subsequent stream, the subsequent data stream is interrupted. It also describes filtering a data stream by customer organization or user organization.
- Patent Document 4 describes a charged beam length measuring device that deletes measurement data when the absolute value of the difference between the first measurement data and the second measurement data exceeds a predetermined value.
- JP 2008-42458 A (paragraph 0051); JP 2002-77277 A (paragraphs 0033, 0035); JP 2002-62123 A (paragraph 0021)
- if the number of data generation sources increases, access concentrates on the data collecting means (for example, the log collection unit 203 shown in FIG. 28), and processing by the data collecting means may not keep up.
- I/O for data storage is slow, so the process of storing the data may not keep up.
- Patent Document 2 describes that a sensor node increases a measurement interval, performs communication between a sensor node and a router node, and the like.
- Japanese Patent Application Laid-Open No. H10-228707 describes waiting for measurement by a sensor.
- when the number of data generation sources such as sensor nodes is large, it is difficult to control the data generation sources individually. For example, if probe cars are the data generation sources, individually instructing tens of thousands of probe cars to hold off on data transmission and the like is difficult from the viewpoint of processing load.
- the present invention provides an analysis preprocessing system capable of passing data to a means for analyzing data at high speed while preventing data from overflowing even when a large amount of data is transmitted from a large number of data generation sources.
- An object of the present invention is to provide an analysis preprocessing method and an analysis preprocessing program.
- the analysis preprocessing system comprises: data acquisition means for acquiring a data group generated by a plurality of data generation sources; data cutout means for cutting out individual data from the data group acquired by the data acquisition means; a buffer for storing data used for analysis; sampling means for sampling a part of the cut-out data and storing it in the buffer; analysis data determining means for determining, from the data stored in the buffer, an analysis data group that is a set of data used for analysis; and analysis data output means for sending the analysis data group to data analysis means for analyzing data.
- the analysis preprocessing method acquires a data group generated by a plurality of data generation sources, cuts out individual data from the acquired data group, samples a part of the cut out data, and stores it in a buffer Then, an analysis data group that is a set of data used for analysis is determined from the data stored in the buffer, and the analysis data group is sent to a data analysis unit that analyzes the data.
- the analysis preprocessing program causes a computer to execute: a data acquisition process for acquiring a data group generated by a plurality of data generation sources; a data extraction process for extracting individual data from the data group acquired by the data acquisition process; a sampling process for sampling a part of the extracted data and storing it in a buffer; an analysis data determination process for determining, from the data stored in the buffer, an analysis data group that is a set of data used for analysis; and an analysis data output process for sending the analysis data group to data analysis means for analyzing data.
- the data can be transferred at high speed to the means for analyzing the data while preventing the data from overflowing.
- FIG. 1 is a block diagram illustrating an example of a pre-analysis processing system according to the first embodiment of this invention.
- the analysis preprocessing system 7 of the present invention includes a data receiving means 3 for receiving data generated by the time series data generation source 1 and a data stream generating means 4 for processing the received data and sending it to the time series data analyzing means 5.
- the time-series data generation source 1 is a data generation source that sequentially generates data with the passage of time.
- the data transmission means 2 transmits the data generated by the time series data generation source 1 to the analysis preprocessing system 7.
- the time-series data analysis unit 5 performs analysis processing on the data input from the data stream generation unit 4. As shown in FIG. 1, a plurality of time-series data generation sources 1 and data transmission means 2 may be provided.
- the data receiving means 3 receives the data generated by the time series data generating source 1 from each data transmitting means 2.
- the data stream generation means 4 samples the received data. That is, some data is extracted from the received data. Then, for each analysis performed by the time series data analysis means 5, the data stream generation means 4 determines, from the extracted data, the set of data to be analyzed in one analysis, and sends that set to the time series data analysis means 5.
- the time series data analysis means 5 performs analysis using this data.
- the operation of the data stream generation unit 4 corresponds to preprocessing for analysis.
- the time-series data generation source 1 and the data transmission means 2 may be included in the analysis preprocessing system.
- the time series data analysis means 5 may be included in the analysis preprocessing system.
- FIG. 2 is a block diagram showing a configuration example of the data stream generation means 4.
- the same elements as those shown in FIG. 1 are denoted by the same reference numerals as those in FIG.
- the data stream generation unit 4 includes a stream data generation unit 401, a sampling unit 406, a transmission data buffer 402, an analysis window generation unit 403, and a stream data transmission unit 404.
- the stream data generating unit 401 converts the data received by the data receiving unit 3 into a data format for analysis.
- the sampling means 406 samples (extracts) data and stores the extracted data in the transmission data buffer 402.
- the transmission data buffer 402 is a memory that temporarily stores data.
- when notified that data has been registered in the transmission data buffer, the analysis window generation means 403 generates a set of data that the time-series data analysis means 5 analyzes at one time.
- the stream data transmission unit 404 transmits data from the transmission data buffer 402 to the time-series data analysis unit 5 in response to a command from the analysis window generation unit 403.
- FIG. 3 is an explanatory diagram showing an example of a physical configuration of the analysis preprocessing system.
- the time-series data generation source 1 exists at physically dispersed positions, and the server collects and analyzes the data.
- each of the n clients PC1, PC2,..., PCn includes a time-series data generation source 1 and a data transmission unit 2.
- Each client is an information processing apparatus such as a PC (personal computer).
- the data receiving means 3, the data stream generating means 4, and the time series data analyzing means 5 are provided in the server PC 8 that performs data analysis.
- the physical configuration shown in FIG. 3 is only an example, and the configuration is not limited to it.
- a plurality of time-series data generation sources may be realized by a single computer.
- the data receiving means 3, the data stream generating means 4, and the time series data analyzing means 5 may be realized by different computers. What kind of apparatus implements each unit shown in FIG. 3 may be determined as appropriate according to the number of data to be generated, the processing capability of the computer, and the physical distribution of the time-series data generation source 1.
- the time series data generation source 1, the data transmission means 2, the data reception means 3, the data stream generation means 4, and the time series data analysis means 5 may be provided in one computer.
- the time series data generation source 1 continuously generates data to be analyzed.
- the time-series data generation source 1 may be a sensor, and sensor data to be analyzed may be continuously generated. Further, the time-series data generation source 1 may be a server device such as a Web server, and a log to be analyzed may be continuously generated.
- a case where the time series data generation source 1 is a sensor mounted on a vehicle (probe car) that measures, for example, speed, position, and traveling direction will be described as an example. Traffic information can be generated by running tens of thousands of probe cars and collecting and analyzing data from the sensors of each probe car. However, the present invention is also applicable to data other than probe car data.
- FIG. 3 shows a case where each PC operates as the time-series data generation source 1 and the data transmission unit 2. In this example, a base station provided separately from the probe car corresponds to the data transmission unit 2.
- FIG. 4 is an explanatory diagram showing an example of data generated by a sensor (time-series data generation source 1) provided in each probe car.
- the time-series data generation source 1 provided in each probe car generates data including date and time, vehicle ID, latitude, longitude, and speed.
- the date and time is the date and time when the data occurred.
- the vehicle ID is an ID (identification information) of a probe car on which the time-series data generation source 1 is mounted.
- Each probe car is assigned a unique vehicle ID.
- the latitude is the latitude of the probe car position, and the longitude is the longitude of the probe car position.
- the speed is the speed of the probe car, and is the speed in the example shown in FIG. Therefore, the data shown in FIG.
- the data transmission means 2 transmits the data generated by the time series data generation source 1 to the analysis preprocessing system (server PC).
- a base station provided separately from the probe car corresponds to the data transmission means 2.
- the probe car is also provided with transmission means (not shown) for transmitting data to the base station.
- Transmitting means (not shown) provided in the probe car transmits data to the base station (data transmitting means 2) via the wireless LAN, and the base station (data transmitting means 2) transmits the data to the server PC.
- the base station (data transmission means 2) is connected to the server PC via a wired LAN, for example.
- the present invention is also applicable to cases other than data collected from a probe car, and the data transmission method of the data transmission means 2 is not particularly limited.
- data may be transmitted using FTP (File Transfer Protocol)
- FIG. 5 is an explanatory diagram showing an example of data transmitted by the data transmission means 2. It is preferable that the data transmission means 2 does not transmit each piece of data individually to the server PC, but transmits a certain number of data collectively. Transmitting a plurality of data collectively in this way reduces communication cost. As illustrated in FIG. 5, the data transmission unit 2 concatenates the data, separating them with delimiters 107, adds a header 106, and transmits the result to the server PC.
- the header 106 is a header defined by a communication protocol, and includes parameters such as the size of transmission data, for example.
- the delimiter 107 is information indicating the boundaries of individual data.
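The batching described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the newline delimiter, the 4-byte big-endian length header, the comma-separated record layout, the sample values, and the names `DELIMITER` and `encode_batch` are all hypothetical, since the text does not fix a wire format.

```python
import struct

DELIMITER = b"\n"  # assumed stand-in for the delimiter 107


def encode_batch(records: list[str]) -> bytes:
    """Concatenate records with delimiters and prepend a size header."""
    payload = DELIMITER.join(r.encode("utf-8") for r in records)
    # Assumed stand-in for the header 106: payload size as a 4-byte integer.
    header = struct.pack(">I", len(payload))
    return header + payload


# Invented sample records: date and time, vehicle ID, latitude, longitude, speed.
batch = encode_batch([
    "2010-01-01 10:00:00,car001,35.68,139.76,42",
    "2010-01-01 10:00:01,car002,35.69,139.70,38",
])
```

Sending one message per batch rather than one per record is what keeps the per-message protocol overhead low.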
- the data receiving unit 3 receives the data transmitted by the data transmitting unit 2 (for example, data illustrated in FIG. 5).
- the data receiving unit 3 may receive data according to the same communication protocol as the data transmitting unit 2. For example, data may be received by FTP.
- the data stream generating unit 4 divides the data received by the data receiving unit 3 into individual pieces of data and prepares them for analysis by the time-series data analysis means 5. Further, the data stream generation means 4 samples the data and generates an analysis window from the sampled data. Usually, the time-series data analysis means 5 does not analyze data one piece at a time, but repeatedly analyzes sets of data. The analysis window is the set of data analyzed in one such analysis. FIG. 6 is an explanatory diagram schematically showing an analysis window. Each circle shown in FIG. 6 represents data generated over time. The set of data 110 is an analysis window 120, and the time-series data analysis means 5 performs one analysis process using one analysis window. The data stream generation unit 4 determines an analysis window from the sampled data and sends the analysis window to the time-series data analysis unit 5.
- analysis window types include the time base window (Time-Base Window) and the tuple base window (Tuple-Base Window).
- the time base window is an analysis window that collects the data belonging to a certain period of time.
- the tuple base window is an analysis window that collects a specified number of data in time-series order.
- FIG. 6 shows an example of a tuple base window in which an analysis window is generated for every two pieces of data.
- the data stream generation means 4 determines an ID (window ID) for identifying the analysis window for each analysis window, inserts the window ID into the data, and passes it to the time-series data analysis means 5.
- FIG. 7 is an explanatory diagram showing an example of input / output of the data stream generating means 4.
- the data stream generating unit 4 receives, from the data receiving unit 3, data that includes a communication header 106 and in which a plurality of pieces of data are concatenated with delimiters 107.
- the data stream generation means 4 cuts out each piece of data from the input data, assigns a window ID, and passes the data assigned the window ID to the time-series data analysis means 5.
- the data stream generation means 4 assigns a common window ID to each data to be included in one analysis window.
- a set of data to which a common window ID is assigned is analyzed simultaneously in one analysis.
- the individual data to which the window ID is assigned is data generated by the time-series data generation source 1, and in this example includes date and time, vehicle ID, latitude, longitude, and speed.
- the stream data generating means 401 performs format conversion on the data received by the data receiving means 3 from the data transmitting means 2 (not shown in FIG. 2, refer to FIG. 1), and divides the data into individual data.
- the stream data generation unit 401 may identify the header 106 and the delimiters 107 (see FIG. 7), and cut out the data between the header 106 and a delimiter 107 and the data between delimiters 107, respectively.
- when the format of the data is standardized by an RFC (Request for Comments) or the like and the received data conforms to that specification, the boundary between the header and the data and the delimiters between the data can be determined according to the specification, and each piece of data can be cut out.
- FIG. 8 shows an example of data cut out by the stream data generating unit 401. When the data illustrated in FIG. 5 is input, the stream data generation unit 401 cuts out three pieces of data as shown in FIG. 8.
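As a rough sketch of this cutting-out step, the function below strips a header and splits the payload at delimiters. The 4-byte length header, the newline delimiter, and the name `cut_out_records` are assumptions made for illustration; a real receiver would follow whatever protocol specification the data conforms to.

```python
import struct

DELIMITER = b"\n"  # assumed stand-in for the delimiter 107


def cut_out_records(message: bytes) -> list[str]:
    """Strip the assumed 4-byte length header and split the payload at delimiters."""
    (size,) = struct.unpack(">I", message[:4])
    payload = message[4:4 + size]
    if not payload:
        return []
    return [part.decode("utf-8") for part in payload.split(DELIMITER)]


# Three placeholder records concatenated into one message.
payload = b"rec-a\nrec-b\nrec-c"
message = struct.pack(">I", len(payload)) + payload
records = cut_out_records(message)
```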
- Sampling means 406 samples individual data cut out by stream data generation means 401 and stores the sampled data in transmission data buffer 402. The sampling means 406 discards each data that has not been sampled.
- the transmission data buffer 402 is a memory that stores the data sampled by the sampling means 406.
- FIG. 9 is a schematic diagram illustrating an example of a memory image in the transmission data buffer 402.
- FIG. 9 illustrates a case where a list structure is employed.
- One data is stored in the memory area 131 for storing one data.
- a pointer 132 that connects the memory areas is defined.
- the sampling unit 406 notifies the analysis window generation unit 403 of each pointer via the stream data generation unit 401 when each data is stored.
- the pointer may be notified directly to the analysis window generation unit 403. By following the pointer, each data can be accessed in order.
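A list-structured buffer of this kind might look like the sketch below. The class and method names are hypothetical, and a Python object reference stands in for the pointer that is notified when data is stored.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """One memory area 131 holding one record, plus the pointer 132 to the next area."""
    data: str
    next: Optional["Node"] = None


class TransmissionBuffer:
    """Singly linked list standing in for the transmission data buffer 402."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None

    def store(self, data: str) -> Node:
        """Append a record and return its node, which plays the role of the notified pointer."""
        node = Node(data)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        return node

    def __iter__(self) -> Iterator[str]:
        # Following the pointers visits each record in storage order.
        node = self.head
        while node is not None:
            yield node.data
            node = node.next


buf = TransmissionBuffer()
pointers = [buf.store(r) for r in ("rec-a", "rec-b", "rec-c")]
```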
- the manner in which the transmission data buffer 402 stores data is not limited to the example of FIG.
- the transmission data buffer 402 may store data in a table structure instead of a list structure.
- the analysis window generation unit 403 receives a pointer notification to the memory area storing the data at the timing when the sampling unit 406 stores the data in the transmission data buffer, and generates an analysis window based on the pointer.
- the specification of the analysis window is set in advance.
- the analysis window specifications include the type of analysis window and the size of the window. The type specifies whether the analysis is performed with the time base window or the tuple base window. The window size is a length of time in the case of a time base window and a number of data in the case of a tuple base window.
- the analysis window generation means 403 generates an analysis window according to the defined specifications. For example, assume that analysis is performed using a time base window, with a time defined as the window size. In this case, when generating an analysis window, the analysis window generating unit 403 stores the generation date and time of that analysis window and calculates the timing for generating the next analysis window by adding the window size to that date and time. When a pointer is notified from the sampling unit 406 as new data is added, the analysis window generation unit 403 accesses the date/time field of the data in the memory area indicated by the notified pointer and determines whether a date and time later than the generation timing of the next analysis window is stored.
- if such a date and time is stored, the analysis window generation means 403 assigns a new window ID to each data stored in the transmission data buffer, thereby defining one analysis window, and issues a transmission command for that set of data (analysis window) to the stream data transmission unit 404.
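The time base window logic can be sketched as follows. Several points are assumptions rather than statements from the text: the class name `TimeBaseWindowGenerator`, the use of the triggering record's timestamp as the window generation time, the inclusion of the triggering record in the window, and clearing the buffer inside the generator instead of in a separate transmission unit.

```python
from datetime import datetime, timedelta
from typing import Optional


class TimeBaseWindowGenerator:
    """Defines a window when a record's timestamp passes the next generation timing."""

    def __init__(self, start: datetime, window_size: timedelta) -> None:
        self.window_size = window_size
        self.next_boundary = start + window_size
        self.buffered: list[tuple[datetime, str]] = []
        self.next_window_id = 1

    def notify(self, timestamp: datetime, record: str) -> Optional[tuple[int, list[str]]]:
        """Handle one stored record; return (window ID, records) when a window is defined."""
        self.buffered.append((timestamp, record))
        if timestamp <= self.next_boundary:
            return None
        window = (self.next_window_id, [r for _, r in self.buffered])
        self.next_window_id += 1
        self.buffered.clear()
        # Next generation timing = this generation time + window size.
        self.next_boundary = timestamp + self.window_size
        return window


t0 = datetime(2010, 1, 1, 10, 0, 0)
gen = TimeBaseWindowGenerator(t0, timedelta(minutes=5))
first = gen.notify(t0 + timedelta(minutes=1), "rec-a")
second = gen.notify(t0 + timedelta(minutes=4), "rec-b")
third = gen.notify(t0 + timedelta(minutes=6), "rec-c")
```

Here the first two records fall before the 5-minute boundary, so only the third call returns a window.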
- when analysis is performed using a tuple base window, the analysis window generation unit 403 counts the notifications each time a pointer is notified as new data is added. The number of notifications received equals the number of data added to the transmission data buffer 402. Upon receiving the number of notifications determined by the window size, the analysis window generation means 403 assigns a new window ID to each data stored in the transmission data buffer, thereby defining one analysis window, and issues to the stream data transmission means 404 a command to transmit that set of data (analysis window). At this time, the notification count is reset to zero.
- the set of pointers to the memory areas storing the data that belong to the newly defined analysis window is issued as the transmission command for the set of data.
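A minimal sketch of the count-based (tuple base) case is shown below. The class name and the tuple used as a "transmission command" are hypothetical, and plain strings stand in for the pointers to memory areas.

```python
class TupleBaseWindowGenerator:
    """Counts pointer notifications and defines a window every `window_size` records."""

    def __init__(self, window_size: int) -> None:
        self.window_size = window_size
        self.count = 0
        self.pending = []          # pointers notified since the last window
        self.next_window_id = 1

    def notify(self, pointer):
        """Return (window ID, pointers) as the transmission command, or None."""
        self.pending.append(pointer)
        self.count += 1
        if self.count < self.window_size:
            return None
        command = (self.next_window_id, self.pending)
        self.next_window_id += 1
        self.pending = []
        self.count = 0             # the notification count is reset to zero
        return command


gen = TupleBaseWindowGenerator(window_size=2)
commands = [c for c in (gen.notify(p) for p in ["a", "b", "c", "d"]) if c is not None]
```

With a window size of 2, four notifications yield two windows of two records each, which matches the pairing illustrated for FIG. 6.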
- when the stream data transmission unit 404 receives a transmission command for a set of data (that is, pointers to the memory areas storing the data to be transmitted) from the analysis window generation unit 403, it transmits the data stored in the memory area indicated by each pointer to the time series data analysis means 5. After transmitting the data, the stream data transmission unit 404 deletes the data from the transmission data buffer 402.
- the time series data analysis means 5 analyzes the data received from the data stream generation means 4.
- the time series data analysis means 5 includes storage means (not shown) for storing the data received from the data stream generation means 4, and stores the received data in the storage means. Then, the time-series data analysis unit 5 reads the data to which the same window ID is added and analyzes the data. The read data is deleted from the storage means.
- the time-series data analysis means 5, for example, matches the probe car data against a road map and generates traffic congestion information indicating where congestion is occurring, based on the average speed of the probe cars. This process is performed at regular intervals (for example, every 5 minutes). In this case, the analysis may be specified to use the time base window.
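The receive, read by window ID, and delete cycle on the analysis side could be sketched as below. The `WindowStore` class is hypothetical, the speeds are invented sample values, and a plain average stands in for the analysis; road map matching is not attempted here.

```python
from collections import defaultdict
from statistics import mean


class WindowStore:
    """Stand-in for the storage means of the time-series data analysis means 5."""

    def __init__(self) -> None:
        self._by_window = defaultdict(list)

    def receive(self, window_id: int, speed: float) -> None:
        """Store one received record under its window ID."""
        self._by_window[window_id].append(speed)

    def analyze(self, window_id: int) -> float:
        """Read all data with the given window ID, delete it from storage, and average it."""
        speeds = self._by_window.pop(window_id)
        return mean(speeds)


store = WindowStore()
for wid, speed in [(1, 40.0), (1, 20.0), (2, 60.0)]:
    store.receive(wid, speed)
avg = store.analyze(1)
```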
- the processing performed by the time-series data analysis unit 5 may be determined according to the data generated by the data generation source 1 and the analysis purpose, and is not limited to a specific analysis process.
- FIG. 10 is a block diagram showing a configuration example of the sampling means 406. As shown in FIG.
- the sampling unit 406 includes a sampling rate storage unit 40603, a sampling rate setting unit 40602, and a sample extraction unit 40601.
- Sampling rate storage means 40603 is a memory for storing the sampling rate.
- the sampling rate is a rate at which data is sampled from the data group given from the stream data generating unit 401.
- the sampling rate setting unit 40602 stores the sampling rate input from the outside in the sampling rate storage unit 40603.
- the sampling rate setting means 40602, for example, displays a GUI (Graphical User Interface) on a display device (not shown) of the analysis preprocessing system, receives a sampling rate from an administrator of the analysis preprocessing system, and stores the received sampling rate in the sampling rate storage means 40603.
- the sampling rate may be input in other manners.
- the administrator of the analysis preprocessing system may input the sampling rate “0.2”.
- the sampling rate setting unit 40602 stores the sampling rate “0.2” in the sampling rate storage unit 40603.
- the sampling rate “0.2” is an example, and other values may be used.
- a uniform sampling rate that does not depend on the time-series data generation source 1 may be set as the sampling rate.
- the sampling rate may be determined for each time series data generation source 1 (for example, for each vehicle ID of the probe car).
- the sampling rate setting unit 40602 causes the sampling rate storage unit 40603 to store the sampling rate for each time-series data generation source.
- the sample extraction unit 40601 samples the plurality of data divided by the format conversion in the stream data generation unit 401 at the sampling rate set in the sampling rate storage unit 40603, and stores the sampled data in the transmission data buffer 402. In addition, the sample extraction unit 40601 discards data that has not been sampled.
- the sample extraction means 40601 extracts data at random in order to reduce the effect of discarding data on the analysis accuracy of the time series data analysis means 5. For example, if the sampling rate is s, one piece of data is sampled out of every (1 / s) pieces of data on average.
- the sample extraction means 40601 may generate, for each data, a random integer in the range from 0 to n - 1, where n = 1 / s, and store in the transmission data buffer 402 the data for which the random number is divisible by n (that is, the random number is 0).
- for example, when the sampling rate is 0.2, a random number is generated in the range from 0 to 4 for each data, and data for which the random number is divisible by 5 may be stored in the transmission data buffer 402.
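The random extraction rule can be sketched as follows, assuming the sampling rate is such that 1 / s is (close to) an integer. The helper name `make_sampler`, the fixed seed, and the placeholder records are illustrative only.

```python
import random


def make_sampler(rate: float, rng: random.Random):
    """Return a predicate that keeps roughly `rate` of the records it is asked about."""
    n = round(1 / rate)  # e.g. rate 0.2 gives n = 5

    def keep() -> bool:
        # Draw an integer in 0..n-1; 0 is the only value in that range divisible by n.
        return rng.randrange(n) == 0

    return keep


rng = random.Random(0)  # seeded only to make the sketch repeatable
keep = make_sampler(0.2, rng)
records = [f"rec-{i}" for i in range(10_000)]
buffer = [r for r in records if keep()]  # records that are not kept are discarded
```

Because each record is decided independently, the fraction kept is close to the sampling rate on average rather than exactly one in every five. A per-source rate could be handled the same way by keeping one such predicate per vehicle ID.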
- when the sample extraction unit 40601 stores data in the transmission data buffer 402, it notifies the analysis window generation unit 403 of the pointer to the memory area in which the data is stored.
- the data receiving means 3 and, within the data stream generating means 4, the stream data generating means 401, the sampling means 406 (sampling rate setting means 40602 and sample extracting means 40601), the analysis window generating means 403, and the stream data transmitting means 404 are realized, for example, by a CPU of a computer that operates according to an analysis preprocessing program.
- the analysis preprocessing system includes program storage means (not shown) for storing the analysis preprocessing program, and the CPU reads the program and, in accordance with the program, may operate as the data receiving means 3 and as the stream data generating unit 401, the sampling unit 406, the analysis window generation unit 403, and the stream data transmission unit 404 of the data stream generating means 4. Each of these means may instead be realized by a separate dedicated circuit.
- the time-series data generation source 1, the data transmission means 2, and the time-series data analysis means 5 are also realized by a CPU that operates according to a program, for example.
- FIG. 11 is a flowchart illustrating an example of processing progress according to the first embodiment of this invention.
- Sampling rate setting means 40602 is assumed to receive a sampling rate in advance and store the sampling rate in sampling rate storage means 40603.
- the process in which each time-series data generation source 1 generates data and the data transmission means 2 transmits the data to the analysis preprocessing system corresponds to step S1.
- the data stream generation step (step S2) is the process in which the analysis preprocessing system (for example, the server PC) receives the data, samples it, stores it in the transmission data buffer 402, and generates an analysis window.
- the process in which the time-series data analysis unit 5 analyzes the data is referred to as the time-series data reception analysis step (step S3).
- Steps S1, S2, and S3 are independent processes and are executed in parallel. That is, steps S1, S2, and S3 are executed asynchronously.
- each time-series data generation source 1 continuously generates data as time passes (step S101).
- Each time-series data generation source 1 may include the generation time (data generation time) in the data to be generated.
- Each time-series data generation source 1 sends data to the data transmission unit 2, and the data transmission unit 2 stores the data in a buffer (not shown) in order to transmit the data collectively (step S102).
- This buffer is a buffer for buffering data on the data transmission means 2 side. Further, the data transmission means 2 determines whether or not it is time to transmit the data accumulated in the buffer (step S103).
- if it is time to transmit the data (Yes in step S103), the data transmission unit 2 combines the data and transmits it to the analysis preprocessing system 7 (step S104), and deletes the transmitted data from the buffer (step S105). If it is not time to transmit data, steps S101 and S102 are repeated.
- the time series data generation source 1 may execute the processes of steps S101, S102, S103, and S105.
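Steps S101 to S105 on the sending side can be sketched as below. The text does not say how the transmission timing is decided, so a simple record-count threshold is assumed here; the class name `BatchingSender`, the newline join, and the `send` callback are likewise hypothetical.

```python
class BatchingSender:
    """Buffers generated records and sends them in batches."""

    def __init__(self, batch_size: int, send) -> None:
        self.batch_size = batch_size
        self.send = send      # callable that transmits one combined message
        self.buffer = []

    def on_data(self, record: str) -> None:
        self.buffer.append(record)                # S102: accumulate in the buffer
        if len(self.buffer) >= self.batch_size:   # S103: assumed timing condition
            self.send("\n".join(self.buffer))     # S104: combine and transmit
            self.buffer.clear()                   # S105: delete transmitted data


sent = []
sender = BatchingSender(batch_size=3, send=sent.append)
for i in range(7):                                # S101: data generated over time
    sender.on_data(f"rec-{i}")
```

After seven records with a batch size of 3, two batches have been sent and one record is still waiting in the buffer.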
- the data reception means 3 receives the data transmitted by the data transmission means 2 (step S201).
- the data receiving means 3 also includes a buffer (not shown), and temporarily stores the received data in the buffer. Then, the data in the buffer is input to the data stream generation means 4 asynchronously with the data reception timing. For this reason, step S2 can be performed asynchronously with step S1.
- the stream data generating unit 401 converts the format of the data input from the data receiving unit 3 and cuts out each piece of data from the combined data (step S202).
- the stream data generation unit 401 inputs the cut out individual data to the sampling unit 406.
- the sample extraction means 40601 of the sampling means 406 refers to the sampling rate stored in the sampling rate storage means 40603 and samples the given data according to the sampling rate.
- the sample extraction means 40601 stores the sampled data in the transmission data buffer and discards other data (step S203). Further, the sample extraction unit 40601 notifies the analysis window generation unit 403 of a pointer to the memory area in which the data is stored.
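- A minimal sketch of the sample extraction in step S203 follows. The text does not specify how items are chosen for a given sampling rate, so the deterministic "credit" scheme below is an assumption, as are the names `SampleExtractor` and `offer`.

```python
class SampleExtractor:
    """Hypothetical sampler that keeps roughly `rate` of the offered items."""

    def __init__(self, rate):
        self.rate = rate     # sampling rate in [0, 1]
        self._credit = 0.0   # accumulated fraction of an item still owed to the buffer

    def offer(self, item, buffer):
        """Store the item in the buffer if selected; otherwise discard it."""
        self._credit += self.rate
        if self._credit >= 1.0:
            self._credit -= 1.0
            buffer.append(item)  # sampled: kept in the transmission data buffer
            return True
        return False             # not sampled: discarded


transmission_buffer = []
extractor = SampleExtractor(rate=0.5)
for value in range(10):
    extractor.offer(value, transmission_buffer)
```

A random (Bernoulli) choice per item would serve equally well; the deterministic variant is used here only so that the result is reproducible.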
- The analysis window generation unit 403 determines whether or not a condition for generating the analysis window is satisfied (step S204). For example, when the analysis is specified to be performed in a tuple-based window, it determines whether or not the number of notifications determined by the window size has been received. Alternatively, when the analysis is specified to be performed in a time-based window, it determines whether or not the period determined by the window size has elapsed since the last analysis window was generated. If the condition for generating the analysis window is satisfied (Yes in step S204), the analysis window generation unit 403 adds a common window ID to each piece of data to be included in the analysis window and issues an analysis window transmission command (step S205).
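- The two window-generation conditions of step S204 can be illustrated as below. The class names, the integer window IDs, and the explicit `now` argument are assumptions made for the sketch.

```python
import itertools


class CountWindow:
    """Closes a window once `size` notifications have been received."""

    def __init__(self, size):
        self.size = size
        self.pending = []
        self._ids = itertools.count(1)

    def notify(self, item):
        self.pending.append(item)
        if len(self.pending) < self.size:
            return None                       # condition not satisfied (No in S204)
        window_id = next(self._ids)
        window = [(window_id, x) for x in self.pending]  # common window ID (S205)
        self.pending = []
        return window


class TimeWindow:
    """Closes a window once `period_s` has elapsed since the last window."""

    def __init__(self, period_s, start):
        self.period_s = period_s
        self.last_generated = start
        self.pending = []
        self._ids = itertools.count(1)

    def notify(self, item, now):
        self.pending.append(item)
        if now - self.last_generated < self.period_s:
            return None
        window_id = next(self._ids)
        window = [(window_id, x) for x in self.pending]
        self.pending = []
        self.last_generated = now
        return window
```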
- the stream data transmission unit 404 transmits a data group (that is, an analysis window) to which a common window ID is assigned to the time-series data analysis unit 5 (step S206). Then, the stream data transmission unit 404 deletes the data transmitted in step S206 from the transmission data buffer 402 (step S207).
- the process of cutting out each piece of data and using it as an analysis window corresponds to the pre-processing of analysis.
- the time-series data analysis unit 5 receives the data (analysis window) transmitted by the stream data transmission unit 404 (step S301).
- the time-series data analysis unit 5 includes an analysis buffer (not shown), and temporarily stores the data transmitted by the stream data transmission unit 404 in the analysis buffer.
- the time-series data analysis means 5 analyzes the data stored in the analysis buffer asynchronously with the data reception timing (step S302). For this reason, step S2 and step S3 can also be performed asynchronously. Specifically, data analysis can be performed asynchronously with the operation in which the stream data transmission unit 404 transmits the analysis window.
- the time-series data analyzing unit 5 deletes the data that has been analyzed in step S302 from the buffer of the time-series data analyzing unit 5 (step S303).
- When the data receiving means 3 receives the data generated by each time-series data generation source 1, the data is stored in memory (the transmission data buffer 402), not in a database or a file.
- The data can therefore be sent to the time-series data analysis means 5 quickly.
- Not all of the data received by the data receiving means 3 is stored in the transmission data buffer 402; only the sampled data is stored. Therefore, even if there are a large number of time-series data generation sources 1 and a large amount of data is received, data overflow in the analysis preprocessing system can be prevented, and the preprocessed data can be sent to the time-series data analysis means 5.
- The sampling means 406 (sample extraction means 40601) provided in the analysis preprocessing system does not cause the individual data transmission means 2 or time-series data generation sources 1 to perform sampling; it performs sampling asynchronously with the data transmission means 2 and the time-series data generation sources 1. Therefore, it is not necessary to control each data transmission means 2 or time-series data generation source 1 individually to make it perform sampling.
- The analysis preprocessing system of the second embodiment of the present invention includes a data receiving means 3 and a data stream generating means 4 (see FIG. 1).
- It receives the data generated by the time-series data generation source 1 from the data transmission means 2, preprocesses the data, and sends it to the time-series data analysis means 5.
- The data stream generation unit 4 includes a stream data generation unit 401, a sampling unit 406, a transmission data buffer 402, an analysis window generation unit 403, and a stream data transmission unit 404 (see FIG. 2).
- the operation of the sampling means 406 is different from that of the first embodiment.
- In the first embodiment, the sampling unit 406 performs sampling at a sampling rate designated from the outside.
- In the second embodiment, the sampling unit 406 obtains the predicted value of the input data amount and the usage amount of the transmission data buffer 402, and dynamically determines the sampling rate.
- FIG. 12 is a block diagram showing a configuration example of the sampling means 406 in the second embodiment.
- the same elements as those in the first embodiment are denoted by the same reference numerals as those in FIG. 10, and detailed description thereof is omitted.
- the sampling unit 406 in the second embodiment includes a sample extraction unit 40601, a sampling rate storage unit 40603, a sampling rate calculation unit 40605, a flow rate monitoring unit 40606, and a transmission data buffer usage amount measurement unit 40607.
- the sampling rate storage means 40603 is a memory for storing the calculated sampling rate.
- The sample extraction unit 40601 refers to the sampling rate, samples the data input from the stream data generation unit 401, and stores the sampled data in the transmission data buffer 402. In this embodiment, the sample extraction unit 40601 additionally notifies the flow rate monitoring unit 40606 of the amount of data input from the stream data generation unit 401 within a fixed time interval.
- The flow rate monitoring unit 40606 predicts the amount of data (number of data items) that will be input from the stream data generation unit 401 in the future from the amount of data (number of data items) input from the stream data generation unit 401 per fixed time. “Future” is, for example, the point in time at which a predetermined time has elapsed since the calculation for predicting the data amount was performed. The value of the predetermined time may be determined in advance.
- The flow rate monitoring unit 40606 is notified of the data amount per fixed time by the sample extraction unit 40601. This means that a pair (t, y) is notified.
- The flow rate monitoring unit 40606 stores the prediction result of the data amount and provides it to the sampling rate calculation means 40605.
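- One possible way to predict the future data amount from the notified (t, y) pairs is a least-squares line extrapolated over the predetermined time. The prediction method is not given in this passage, so the linear fit below is purely an assumption, as are the function name `predict_amount` and the parameter `horizon`.

```python
def predict_amount(samples, horizon):
    """Extrapolate a least-squares line through (t, y) pairs to last t + horizon.

    samples: list of (t, y) pairs, where y is the data amount observed at time t.
    horizon: how far ahead of the latest sample to predict (assumed parameter).
    """
    n = len(samples)
    if n == 0:
        return 0.0
    if n == 1:
        return float(samples[0][1])
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return mean_y  # all samples share one timestamp: no trend can be estimated
    slope = sum((t - mean_t) * (y - mean_y) for t, y in samples) / var_t
    t_future = samples[-1][0] + horizon
    # A data amount cannot be negative, so clamp the extrapolation at zero.
    return max(0.0, mean_y + slope * (t_future - mean_t))
```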
- the transmission data buffer usage measurement means 40607 measures the amount of memory used in the transmission data buffer 402. For example, as illustrated in FIG. 9, it is assumed that the transmission data buffer 402 stores data in a list structure. In this case, the transmission data buffer usage measuring means 40607 counts the number of stored data by following the list. Then, the amount of memory used in the transmission data buffer 402 can be calculated by multiplying the number of data by the data size per data. However, this calculation is merely an example, and the transmission data buffer usage amount measuring unit 40607 may calculate the used memory amount by a calculation method according to the memory structure of the transmission data buffer 402.
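- The list-traversal measurement described above can be sketched as follows, assuming a singly linked list and a fixed size per data item. The names `Node` and `used_memory` are illustrative only.

```python
class Node:
    """One entry of a singly linked transmission data buffer (assumed layout)."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def used_memory(head, bytes_per_item):
    """Count the stored items by following the list, then multiply by the item size."""
    count = 0
    node = head
    while node is not None:
        count += 1
        node = node.next
    return count * bytes_per_item
```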
- the sampling rate calculation means 40605 calculates the sampling rate with reference to the future data amount predicted by the flow rate monitoring means 40606 and the used memory amount calculated by the transmission data buffer usage amount measurement means 40607.
- the sampling rate calculation means 40605 stores in advance the maximum amount of memory that can be used in the transmission data buffer 402.
- the sampling rate calculation unit 40605 reads the predicted data number from the flow rate monitoring unit 40606, reads the current memory usage from the transmission data buffer usage measurement unit 40607, and calculates the sampling rate using these values.
- The sampling rate may be calculated using the following equation (1).

$$R = 0.8 \times \frac{(M - N) / D}{F} \qquad (1)$$
- R is the sampling rate.
- M is the maximum amount of memory that can be used.
- N is the amount of used memory currently used.
- D is the size of one data item.
- F is the amount of data (number of data) sent in the future predicted by the flow rate monitoring means 40606.
- M − N represents the amount of free memory in the transmission data buffer 402, and dividing this by D gives the number of data items that can be stored in the free memory. Dividing this further by F gives the maximum sampling rate at which the transmission data buffer 402 does not overflow. Since the prediction of the flow rate monitoring means 40606 includes an error, Equation (1) multiplies ((M − N) / D) / F by a coefficient of 0.8 in order to prevent data overflow. The value of this coefficient is not limited to 0.8.
- In other words, Expression (1) calculates the free space from the usage amount of the transmission data buffer 402 and calculates the sampling rate from the relationship between the number of data items that can be stored in the free space and the predicted data amount.
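- Expression (1) translates directly into a small function. The clamping of the result to the range 0 to 1 and the handling of a non-positive prediction are assumptions added for the sketch; the parameter names are illustrative.

```python
def sampling_rate(max_memory, used_memory, item_size, predicted_items, margin=0.8):
    """Sampling rate R = margin * ((M - N) / D) / F, clamped to [0, 1].

    max_memory (M), used_memory (N) and item_size (D) share one unit (e.g. bytes);
    predicted_items (F) is the predicted number of incoming data items.
    """
    if predicted_items <= 0:
        return 1.0  # assumed: nothing is expected, so everything may be kept
    free_items = (max_memory - used_memory) / item_size  # items that still fit
    rate = margin * free_items / predicted_items
    return max(0.0, min(1.0, rate))
```

For example, with M = 1000, N = 200, D = 8 and F = 400, 100 more items fit in the buffer, so the rate is 0.8 × 100 / 400 = 0.2.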
- The sampling rate calculation means 40605 may determine the sampling rate by other methods. For example, the transmission data buffer usage measuring unit 40607 holds the usage amount of the transmission data buffer 402 for each fixed period as a history, and similarly, the flow rate monitoring unit 40606 predicts the future data amount for each fixed period and stores the prediction results as a history.
- The sampling rate calculation means 40605 refers to the history of the usage amount of the transmission data buffer 402 and the history of the predicted data amount. If the usage amount of the transmission data buffer 402 and the predicted data amount tend to increase, the sampling rate is reduced. In the opposite case, the sampling rate may be increased.
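- A history-based adjustment of this kind might look like the following. How a "tendency" is detected and how much the rate changes are not stated, so the first-versus-last comparison and the fixed `step` are assumptions.

```python
def adjust_rate(rate, usage_history, predicted_history, step=0.1):
    """Lower the rate when both histories rise, raise it when both fall."""

    def rising(history):
        return len(history) >= 2 and history[-1] > history[0]

    def falling(history):
        return len(history) >= 2 and history[-1] < history[0]

    if rising(usage_history) and rising(predicted_history):
        rate -= step   # buffer filling up and more data expected: sample less
    elif falling(usage_history) and falling(predicted_history):
        rate += step   # buffer draining and less data expected: sample more
    return min(1.0, max(0.0, rate))
```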
- the sample extraction means 40601, the sampling rate calculation means 40605, the flow rate monitoring means 40606, and the transmission data buffer usage amount measurement means 40607 are realized by a CPU of a computer that operates according to the analysis preprocessing program, for example.
- the CPU may operate as the sample extraction unit 40601, the sampling rate calculation unit 40605, the flow rate monitoring unit 40606, the transmission data buffer usage amount measurement unit 40607, and other units according to the analysis preprocessing program.
- the sample extraction unit 40601, the sampling rate calculation unit 40605, the flow rate monitoring unit 40606, and the transmission data buffer usage amount measurement unit 40607 may be realized by separate dedicated circuits.
- FIG. 13 is a flowchart illustrating an example of the processing progress of sampling rate calculation.
- the sample extraction unit 40601 notifies the flow rate monitoring unit 40606 of the amount of data sent from the stream data generation unit 401 at a certain time interval. Then, the flow rate monitoring unit 40606 predicts the data amount sent from the stream data generating unit 401 in the future from the data amount at fixed time intervals (step S601).
- the transmission data buffer usage measurement unit 40607 measures the current memory usage in the transmission data buffer 402 (step S602).
- the sampling rate calculation means 40605 calculates the sampling rate by performing the calculation of Expression (1) using the predicted data amount and the current memory usage amount (step S603).
- When the predicted data amount or the used memory amount changes, the sampling rate calculation means 40605 dynamically recalculates the sampling rate according to the change. For example, the flow rate monitoring unit 40606 may periodically obtain the predicted data amount, the transmission data buffer usage amount measuring unit 40607 may periodically measure the used memory amount, and the sampling rate may be recalculated when the predicted data amount or the used memory amount fluctuates.
- The time-series data generation and transmission step (step S1), the data stream generation step (step S2), and the time-series data reception analysis step (step S3) are the same as those in the first embodiment, and operations similar to those of the first embodiment may be performed.
- the sample extraction means 40601 uses the sampling rate calculated by the sampling rate calculation means 40605 when sampling data.
- The sampling rate is dynamically calculated from the predicted future data amount and the current memory usage amount, so that data overflow in the transmission data buffer 402 is prevented and wasted free memory in the transmission data buffer 402 can be reduced.
- The analysis preprocessing system according to the third embodiment of the present invention includes a data receiving means 3 and a data stream generating means 4 as in the first and second embodiments (see FIG. 1).
- It receives the data generated by the time-series data generation source 1 from the data transmission means 2, preprocesses the data, and sends it to the time-series data analysis means 5.
- FIG. 14 is an explanatory diagram illustrating a configuration example of the data stream generation unit 4 according to the third embodiment.
- the data stream generation unit 4 in this embodiment includes a filtering unit 407 in addition to the stream data generation unit 401, the sampling unit 406, the transmission data buffer 402, the analysis window generation unit 403, and the stream data transmission unit 404.
- the stream data generation unit 401, the transmission data buffer 402, the analysis window generation unit 403, and the stream data transmission unit 404 are the same as those in the first and second embodiments.
- the sampling unit 406 performs sampling on the data input from the filtering unit 407.
- The sampling means 406 may be the same as the sampling means in the first embodiment (see FIG. 10) or the sampling means in the second embodiment (see FIG. 12). That is, the sampling unit 406 may sample data at a sampling rate input from the outside, as in the first embodiment. Alternatively, as in the second embodiment, it may predict the amount of data sent in the future, measure the amount of memory used, calculate the sampling rate, and perform sampling at that rate. In this embodiment, however, when the flow rate monitoring unit 40606 of the sampling unit 406 predicts the amount of data to be sent in the future, it predicts the amount of data that will be input from the filtering unit 407.
- The filtering unit 407 performs a filtering process on each piece of data cut out by the stream data generating unit 401 from the data received by the data receiving unit 3. In other words, the filtering unit 407 determines, for each piece of data cut out by the stream data generation unit 401, whether the data satisfies a predetermined condition, inputs data satisfying the predetermined condition to the sampling unit 406, and discards data that does not satisfy the predetermined condition.
- the predetermined condition is a condition indicating that the data is useful for analysis.
- As the predetermined condition, for example, the condition that “the data content is different from any data already stored in the transmission data buffer 402” may be used. Suppose that data having the same contents as data already stored in the transmission data buffer 402 were stored in the transmission data buffer 402. In this case, the stream data transmission unit 404 would transmit a plurality of data having the same content to the time-series data analysis unit 5. However, the time-series data analysis means 5 may not require a plurality of data having the same contents when performing analysis.
- Assume that a sensor (time-series data generation source 1) provided in each probe car generates data (see FIG. 4) including the position, speed, and vehicle ID of the probe car at regular time intervals, and that the time-series data analysis means 5 performs analysis on the data.
- the stopped probe car repeatedly generates data having the same content in position, speed, and vehicle ID.
- In the analysis process of the time-series data analysis means 5, the changed contents are required when the situation (position or speed) of a certain probe car changes, and it is sometimes unnecessary to refer to data whose contents have not changed. In such a case, data with the same position, speed, and vehicle ID is redundant and is not used for analysis.
- The data of a stopped vehicle is not necessary for calculating the average speed, and there is no need to send a plurality of such data to the time-series data analysis means 5.
- The filtering unit 407 stores data satisfying the condition that “the content of the data is different from any data already stored in the transmission data buffer 402” in the transmission data buffer 402, and discards data that does not satisfy the condition (that is, data having the same contents as data already stored in the transmission data buffer 402). As a result, it is possible to prevent redundant data from being sent to the time-series data analysis means 5.
- the predetermined condition is referred to as a first condition.
- the first condition is an example of a predetermined condition indicating that the data is useful for analysis, and other conditions may be used as will be described later.
- FIG. 15 is a block diagram illustrating a configuration example of the filtering unit 407.
- the filtering unit 407 includes a data selection unit 40701 and an identity determination unit 40702.
- The identity determination unit 40702 determines whether or not the contents of the data are the same between the data input from the stream data generation unit 401 and the data already stored in the transmission data buffer 402.
- Each piece of data input from the stream data generation unit 401 is data to be subjected to filtering determination, and is hereinafter referred to as filtering determination target data.
- For the data contents to be the same, it is essential that the time-series data generation source 1 is the same. For example, in the case of data relating to the probe car illustrated in FIG. 4, it is essential that the vehicle IDs are the same. Data with different vehicle IDs are not data of the same content even if the latitude, longitude, and speed match.
- Even when the time-series data generation source 1 is the same, the date and time differ among data generated with the passage of time. Therefore, when determining whether or not the contents are the same, whether the dates and times are the same may be ignored.
- the items included in the data there may be items such as date and time that can be ignored whether or not they are the same.
- The identity determination unit 40702 may calculate the difference between a value included in the data stored in the transmission data buffer 402 and the corresponding value included in the filtering determination target data, and determine whether the difference is within a predetermined range. For example, regarding the speed, the difference between the speed in the data stored in the transmission data buffer 402 and the speed in the filtering determination target data is calculated, and if the difference is within the range of −5 to +5, the speeds are determined to be the same.
- The unit of −5 and +5 in this example is km/h.
- Similarly, for latitude and longitude, whether the difference in value between the data is within a predetermined range is determined, and if it is within the range, the contents may be determined to be the same.
- When the ID of the time-series data generation source 1 (for example, the vehicle ID) matches between the filtering determination target data and the data stored in the transmission data buffer 402, and the contents of each of the other items (for example, latitude, longitude, and speed) are determined to be the same, the identity determination unit 40702 may determine that the data have the same content. When the IDs of the time-series data generation source 1 do not match, or when any of the other items (for example, latitude, longitude, or speed) is determined not to have the same content, it may determine that the data do not have the same content.
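- The identity determination can be sketched as a comparison with per-item tolerances. Only the speed tolerance of 5 km/h comes from the text; the latitude and longitude tolerances, the dictionary record layout, and the field names are assumptions.

```python
# Allowed absolute difference per item. Speed (5 km/h) follows the example in the
# text; the latitude/longitude values are placeholders chosen for illustration.
TOLERANCE = {"latitude": 0.0001, "longitude": 0.0001, "speed": 5.0}


def same_content(a, b, tolerance=TOLERANCE):
    """Return True if two records count as having the same content.

    The source ID must match exactly; the date and time are deliberately ignored.
    """
    if a["vehicle_id"] != b["vehicle_id"]:
        return False
    return all(abs(a[item] - b[item]) <= tol for item, tol in tolerance.items())
```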
- the data selection means 40701 checks for each filtering determination target data whether or not the content of the filtering determination target data is determined not to be the same as any data in the transmission data buffer 402. Then, the data selection unit 40701 inputs the filtering determination target data to the sampling unit 406 or discards it according to the confirmation result.
- When it is determined that the content of the filtering determination target data is not the same as any data in the transmission data buffer 402, the filtering determination target data satisfies the first condition. In this case, the data selection unit 40701 inputs the filtering determination target data to the sampling unit 406.
- When it is determined that the content of the filtering determination target data is the same as some data in the transmission data buffer 402, the data selection unit 40701 discards the filtering determination target data.
- the filtering unit 407 (data selection unit 40701, identity determination unit 40702) is realized by a CPU of a computer that operates in accordance with a pre-analysis processing program, for example.
- the CPU may operate as filtering means 407 (data selection means 40701, identity determination means 40702) or other means according to the analysis preprocessing program.
- the data selection means 40701 and the identity determination means 40702 may be realized by separate dedicated circuits.
- FIG. 16 is an explanatory diagram illustrating an example of processing progress of the third embodiment.
- the same processes as those in the first embodiment are denoted by the same reference numerals as those in FIG.
- the time series data generation / transmission step (step S1) and the time series data reception analysis step (step S3) are the same as those in the first embodiment.
- In step S2, after the stream data generation unit 401 performs format conversion and cuts out each piece of data from the combined data (step S202), the filtering unit 407 performs the filtering process (step S208) on each piece of data, and the sampling unit 406 samples the result of the filtering process.
- the other points are the same as in the first embodiment.
- FIG. 17 is a flowchart showing an example of processing progress of the filtering process (step S208).
- After the stream data generation unit 401 cuts out each piece of data (see step S202, FIG. 16), it inputs the data to the filtering unit 407.
- Each piece of data is filtering determination target data.
- For each filtering determination target data, the identity determination unit 40702 determines whether or not its content is the same as that of each piece of data stored in the transmission data buffer 402 (step S701).
- the data selection unit 40701 inputs the filtering determination target data determined not to have the same content as any data in the transmission data buffer 402 to the sampling unit 406 (step S702). On the other hand, the filtering determination target data determined to have the same content as any data in the transmission data buffer 402 is discarded (step S702). By executing the process of step S702, data to be processed after the sampling process is selected.
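- Steps S701 and S702 amount to a membership test against the buffer followed by a pass-or-discard decision. In the sketch below, `is_same` stands for whatever identity determination is in use and is passed in as a parameter; the function name `filter_new` is hypothetical.

```python
def filter_new(records, buffer, is_same):
    """Return the records whose content differs from every record in the buffer.

    is_same(a, b) is an assumed predicate implementing the identity determination.
    """
    passed = []
    for record in records:
        # Step S701: compare against each piece of data already in the buffer.
        if any(is_same(record, stored) for stored in buffer):
            continue              # step S702: same content found, discard
        passed.append(record)     # step S702: forward to the sampling stage
    return passed
```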
- Sampling means 406 performs sampling processing (step S203) corresponding to the sampling rate on the data input from data selection means 40701.
- the sampling rate may be a value input from the outside as in the first embodiment, or may be a value calculated by the sampling means 406 as in the second embodiment.
- the filtering unit 407 discards redundant data that is not used in the analysis before the sampling process. Therefore, it is possible to prevent redundant data from being stored in the transmission data buffer 402. Then, the data discarded in the sampling process can be reduced by that amount, and the data can be stored in the transmission data buffer 402 as much as possible. That is, the transmission data buffer 402 can be used effectively.
- The above description covers the case where the condition (first condition) that “the data content is different from any data already stored in the transmission data buffer 402” is used as the predetermined condition in the filtering process.
- the operation of the filtering means 407 is different, but the other means are the same as those of the third embodiment.
- As the predetermined condition used in the filtering process, the condition that “the content of data satisfies a predetermined standard” is used.
- This condition is referred to as a second condition.
- an error may be included in the contents included in the data. Even if the data includes an error, it can be effectively used for analysis if the data satisfies the criteria.
- A criterion for discriminating valid data that can be used for analysis is determined in advance, and the filtering unit 407 determines whether or not the content of the filtering determination target data satisfies this criterion and discards data that does not satisfy it.
- the data often includes position, speed, direction, and the like.
- these values include errors.
- the position (for example, latitude and longitude)
- GPS (Global Positioning System)
- the filtering unit 407 eliminates it.
- FIG. 18 is a block diagram illustrating a configuration example of the filtering unit 407 in the present modification.
- the filtering unit 407 in this modification includes a valid data defining unit 40713, a validity determining unit 40712, and a data selecting unit 40711.
- the valid data definition unit 40713 is a storage device that stores a reference for data contents that can be used effectively.
- FIG. 19 is an explanatory diagram illustrating an example of the criteria stored in the valid data definition unit 40713.
- the standard illustrated in FIG. 19 corresponds to the data illustrated in FIG. 4 and indicates the standard that should be satisfied by the date, vehicle ID, latitude, longitude, and speed.
- “Minimum” and “maximum” shown in FIG. 19 define the range of values of these items. If the value of an item included in the data falls within the range from “minimum” to “maximum”, the value of the item is valid. For example, the illustrated criteria specify the following.
- the date and time are valid if they are included in the range from “one day before the current time” to “the current time”.
- the vehicle ID is valid if it is included in the range of “CID0001” to “CID9999”.
- the numerical value range may be defined.
- The latitude is valid if it falls within the range of 34.000 to 36.000.
- The longitude is valid if it falls within the range of 134.000 to 136.000.
- The speed is valid if it falls within the range of 0 to 120.
- “minimum” and “maximum” are defined, but only one of them may be defined.
- The “difference” shown in FIG. 19 is a standard that defines the relationship with the immediately preceding data (the immediately preceding data from the same time-series data generation source). For example, in the example shown in FIG. 19, the date and time are valid if the difference in date and time from the immediately preceding data with the same vehicle ID is within one hour. For the vehicle ID, “difference” is not defined. The latitude is valid if the difference in latitude from the immediately preceding data with the same vehicle ID is 0.01 or less. The longitude is valid if the difference in longitude from the immediately preceding data with the same vehicle ID is 0.01 or less. The speed is valid if the difference in speed from the immediately preceding data with the same vehicle ID is 120 or less.
- the standards defined by “Minimum” and “Maximum” are absolute standards that should be satisfied by the items included in the data. “Difference” is a relative standard that items included in data should satisfy in relation to other data. In the example shown in FIG. 19, an absolute reference (minimum, maximum) and a relative reference (difference) are set, but only one of them may be set.
- When the filtering determination target data is input from the stream data generation unit 401, the validity determination unit 40712 determines, for each item in the filtering determination target data, whether or not each criterion stored in the valid data definition unit 40713 is satisfied. For example, assume that the criteria illustrated in FIG. 19 are stored. The validity determination unit 40712 determines whether the date and time, vehicle ID, latitude, longitude, and speed in the filtering determination target data each fall within the range from the minimum value to the maximum value. Further, for each of the date and time, latitude, longitude, and speed, it calculates the difference from the value in the immediately preceding filtering determination target data and determines whether or not the result satisfies the standard defined as “difference”.
- After determining the validity of a piece of filtering determination target data, the validity determination unit 40712 holds that data until the next filtering determination target data generated by the same time-series data generation source is input.
- the relative reference may be determined with reference to the immediately preceding data stored in the transmission data buffer 402.
- the data selection unit 40711 confirms the determination result by the validity determination unit 40712 for each filtering determination target data. Then, the data selection unit 40711 inputs the filtering determination target data to the sampling unit 406 or discards it according to the confirmation result.
- When every item is determined to satisfy the criteria, the filtering determination target data satisfies the second condition.
- In this case, the data selection unit 40711 inputs the filtering determination target data to the sampling unit 406.
- Otherwise, the data selection unit 40711 discards the filtering determination target data. For example, if it is determined that any item does not satisfy the absolute criterion or the relative criterion, the data selection unit 40711 discards the filtering determination target data.
- the data selection unit 40711 and the validity determination unit 40712 of the filtering unit 407 of the present modification are realized by, for example, a CPU of a computer that operates according to a pre-analysis processing program.
- the CPU may operate as the data selection unit 40711, the validity determination unit 40712, and other units according to the analysis preprocessing program.
- Alternatively, the data selection unit 40711 and the validity determination unit 40712 may be realized by separate dedicated circuits.
- FIG. 20 is a flowchart illustrating an example of processing progress of the filtering process in the present modification.
- The validity determination unit 40712 determines whether each item in the filtering determination target data satisfies the absolute criterion (step S711). For example, when the criteria illustrated in FIG. 19 are defined, it determines whether the date and time, vehicle ID, latitude, longitude, and speed each fall within the range from the minimum value to the maximum value.
- The validity determination unit 40712 then determines whether each item in the filtering determination target data satisfies the relative criterion (step S713). For example, for the date and time, latitude, longitude, and speed, it calculates the difference from the previous filtering determination target data having the same vehicle ID and determines whether the difference satisfies the predetermined criterion (the “difference” illustrated in FIG. 19).
- The data selection unit 40711 confirms the determination results for the absolute criterion and the relative criterion. If any item is determined not to satisfy a criterion in either determination (No in step S712 or No in step S714), the data selection unit 40711 discards the filtering determination target data (step S716). If every item is determined to satisfy the criteria in both the determination regarding the absolute criterion (step S711) and the determination regarding the relative criterion (step S713) (Yes in step S714), the data selection unit 40711 inputs the filtering determination target data to the sampling unit 406 (step S715). As a result, the data to be processed in the sampling process (step S203, see FIG. 16) and thereafter is selected.
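- The absolute and relative checks of steps S711 to S716 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Criterion` and `ValidityFilter` names, the record layout, and every threshold value are assumptions (the values of FIG. 19 are not reproduced here), and the range check on the vehicle ID is omitted because the ID is used only as a lookup key.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Criterion:
    minimum: float
    maximum: float
    max_difference: Optional[float] = None  # None means no relative criterion


# Hypothetical thresholds for illustration; they are not the values of FIG. 19.
CRITERIA: Dict[str, Criterion] = {
    "timestamp": Criterion(0.0, 4_102_444_800.0, max_difference=60.0),
    "latitude": Criterion(-90.0, 90.0, max_difference=0.01),
    "longitude": Criterion(-180.0, 180.0, max_difference=0.01),
    "speed": Criterion(0.0, 200.0, max_difference=30.0),
}


class ValidityFilter:
    """Keeps a record only if every item passes the absolute and relative checks."""

    def __init__(self, criteria: Dict[str, Criterion]) -> None:
        self._criteria = criteria
        self._previous: Dict[str, dict] = {}  # vehicle ID -> last record seen

    def accept(self, record: dict) -> bool:
        vehicle_id = record["vehicle_id"]
        previous = self._previous.get(vehicle_id)
        # Retain this record for comparison with the next one from the same source.
        self._previous[vehicle_id] = record

        # Absolute criterion (corresponds to step S711): minimum <= value <= maximum.
        for item, criterion in self._criteria.items():
            if not criterion.minimum <= record[item] <= criterion.maximum:
                return False

        # Relative criterion (corresponds to step S713): bounded change since the previous record.
        if previous is not None:
            for item, criterion in self._criteria.items():
                limit = criterion.max_difference
                if limit is not None and abs(record[item] - previous[item]) > limit:
                    return False
        return True
```

- In this sketch the previous record is retained whether or not it was judged valid, which follows the description that the data is kept until the next data from the same generation source arrives.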
- In the course of the time-series data generation source 1 generating data and the data receiving means 3 receiving it, the data may be duplicated, so that the data receiving means 3 receives multiple copies of the same data.
- FIG. 21 is an explanatory diagram showing a specific example of this situation.
- Here, the time-series data generation source 1 is a sensor provided in a probe car, and the data transmission means 2a and 2b are base stations that relay data between the time-series data generation source 1 and the data receiving means 3.
- A base station is provided for each area, and the base stations are arranged so that their areas partially overlap.
- When the probe car is located where the areas overlap, the base stations 2a and 2b corresponding to those areas both receive the same data. Since both base stations 2a and 2b transmit the received data to the analysis preprocessing system, the data receiving means 3 receives multiple copies of the same data. Data duplicated in this way is unnecessary for the analysis by the time-series data analysis means 5 and is excluded by the filtering unit 407.
- FIG. 22 is a block diagram illustrating a configuration example of the filtering unit 407 when the third condition is used.
- The filtering unit 407 in this modification includes a processed data storage unit 40723, a validity determination unit 40722, and a data selection unit 40721.
- the processed data storage unit 40723 is a storage device that stores data identification information for identifying each data input from the stream data generation unit 401.
- FIG. 23 shows an example of data identification information stored in the processed data storage unit 40723.
- the combination of the date and time and the ID of the time series data generation source (for example, vehicle ID) may be used as the data identification information.
- the first record in FIG. 23 means that the data generated by the probe car “CID0001” on the date “2008/7/20 12:00:00” has already been received.
- The validity determination unit 40722 refers to the data identification information stored in the processed data storage unit 40723 and determines whether the filtering determination target data is data that has not yet been input. If it has not yet been input, the validity determination unit 40722 stores the data identification information of the filtering determination target data (for example, the combination of the date and vehicle ID) in the processed data storage unit 40723.
- the data selection unit 40721 confirms the determination result by the validity determination unit 40722 for each filtering determination target data. Then, the data selection unit 40721 inputs the filtering determination target data to the sampling unit 406 or discards it according to the confirmation result.
- If the filtering determination target data has not been input before, it is being input for the first time, and the third condition is satisfied.
- In this case, the data selection unit 40721 inputs the filtering determination target data to the sampling unit 406.
- Otherwise, the data selection unit 40721 discards the filtering determination target data.
- the data selection unit 40721 and the validity determination unit 40722 of the filtering unit 407 of the present modification are realized by a CPU of a computer that operates according to an analysis preprocessing program, for example.
- the CPU may operate as the data selection unit 40721, the validity determination unit 40722, and other units according to the analysis preprocessing program.
- the data selection means 40721 and the validity determination means 40722 may be realized by separate dedicated circuits.
- FIG. 24 is a flowchart illustrating an example of processing progress of the filtering process in the present modification.
- The validity determination unit 40722 determines whether the filtering determination target data is data that has not yet been input (step S721). Specifically, it determines whether the data identification information (for example, the combination of date and vehicle ID) of the input filtering determination target data is already stored in the processed data storage unit 40723. If the data identification information is not stored (No in step S722), the filtering determination target data has not been input before (it is being input for the first time). If the data identification information is stored (Yes in step S722), the filtering determination target data has already been input.
- If the filtering determination target data has not yet been input (No in step S722), the validity determination unit 40722 additionally stores its data identification information in the processed data storage unit 40723 (step S723).
- The data selection unit 40721 confirms the determination result of the validity determination unit 40722. If the input filtering determination target data has already been input (Yes in step S722), the data selection unit 40721 discards it (step S725). If it is being input for the first time (No in step S722), the data selection unit 40721 inputs it to the sampling unit 406 (step S724). As a result, the data to be processed in the sampling process (step S203, see FIG. 16) and thereafter is selected.
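- The duplicate check of steps S721 to S725 amounts to a membership test on the stored data identification information. The following is a minimal sketch under assumed names (`DuplicateFilter`, the `datetime` and `vehicle_id` keys); it uses the combination of date and vehicle ID as the identification information, as in the example of FIG. 23.

```python
from typing import Set, Tuple


class DuplicateFilter:
    """Accepts a record only the first time its identification information is seen."""

    def __init__(self) -> None:
        # Plays the role of the processed data storage unit 40723.
        self._seen: Set[Tuple[str, str]] = set()

    def accept(self, record: dict) -> bool:
        key = (record["datetime"], record["vehicle_id"])
        if key in self._seen:   # Yes in step S722: already input
            return False        # step S725: discard
        self._seen.add(key)     # step S723: remember the identification information
        return True             # step S724: pass on to sampling
```

- The description here does not say when stored identification information is removed, so this sketch keeps it indefinitely; a long-running process would presumably need some bound on the set.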
- The filtering unit 407 may also be configured to combine two or more of the first to third conditions described above, input only data that satisfies all of the combined conditions to the sampling unit 406, and discard other data. For example, only data that satisfies the first and second conditions may be input to the sampling unit 406 and other data may be discarded.
- the method of combining conditions is not particularly limited.
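- One straightforward way to combine conditions is to treat each condition as a predicate and require all of them to hold. The sketch below assumes that form; `filter_stream` and the `Predicate` alias are illustrative names, and other ways of combining conditions are equally possible.

```python
from typing import Callable, Iterable, Iterator

Predicate = Callable[[dict], bool]


def filter_stream(records: Iterable[dict], *conditions: Predicate) -> Iterator[dict]:
    """Yield only the records that satisfy every given condition."""
    for record in records:
        # all() stops at the first failing condition, so later conditions
        # are not evaluated for a record that has already been rejected.
        if all(condition(record) for condition in conditions):
            yield record
```

- Because evaluation stops at the first failure, a stateful condition placed later in the list does not see records rejected earlier; whether that is desirable depends on the conditions being combined.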
- Embodiment 4
- the analysis preprocessing system includes a data receiving means 3 and a data stream generating means 4 as in the first, second and third embodiments (see FIG. 1).
- When the data generated by the time-series data generation source 1 is received from the data transmission means 2, the system preprocesses the data and sends it to the time-series data analysis means 5.
- FIG. 25 is an explanatory diagram illustrating a configuration example of the data stream generation unit 4 according to the fourth embodiment.
- the data stream generation unit 4 in this embodiment includes a switching unit 409 in addition to the stream data generation unit 401, the sampling unit 406, the filtering unit 407, the transmission data buffer 402, the analysis window generation unit 403, and the stream data transmission unit 404.
- The analysis preprocessing system according to the fourth embodiment performs either the filtering process or the sampling process, as selected by the switching unit 409.
- the transmission data buffer 402, the analysis window generation unit 403, and the stream data transmission unit 404 are the same as those in the first to third embodiments.
- the switching unit 409 controls the stream data generation unit 401, the filtering unit 407, and the sampling unit 406 to operate so as to perform any one of the filtering process and the sampling process.
- When the sampling process is to be performed, the switching unit 409 causes the stream data generation unit 401 to input each cut-out piece of data to the sampling unit 406 and causes the sampling unit 406 to sample the data. At this time, the filtering unit 407 is not operated.
- When the filtering process is to be performed, the switching unit 409 causes the stream data generation unit 401 to input each cut-out piece of data to the filtering unit 407 and causes the filtering unit 407 to filter the data. At this time, the sampling unit 406 is not operated.
- the switching unit 409 switches, for example, whether to perform sampling processing or filtering processing according to a switching instruction input from the outside.
- the switching instruction may be input via an input device (not shown) such as a keyboard, for example. Alternatively, it may be input via a communication network.
- The stream data generation unit 401 converts the format of the data received by the data receiving means 3 and cuts out individual data (for example, see FIG. 8), as in the first embodiment. When the switching unit 409 instructs the sampling process, the stream data generation unit 401 inputs the data to the sampling unit 406, and when the switching unit 409 instructs the filtering process, it inputs the data to the filtering unit 407.
- the sampling unit 406 samples the data input from the stream data generation unit 401 when the switching unit 409 instructs the sampling process.
- the configuration of the sampling means 406 may be the same as that of the first embodiment (see FIG. 10) or the same as that of the second embodiment (see FIG. 12). That is, the sampling unit 406 may sample data at a sampling rate input from the outside, as in the first embodiment. Alternatively, as in the second embodiment, the sampling unit 406 itself may calculate the sampling rate and perform sampling.
- When the filtering process is instructed, the sampling unit 406 is controlled by the switching unit 409 so that it does not operate.
- the filtering unit 407 performs filtering on the data input from the stream data generation unit 401 when the switching unit 409 instructs the filtering process.
- The filtering unit 407 may have the same configuration as in the third embodiment or as in any of the modifications of the third embodiment. That is, the filtering unit 407 may have the same configuration as that shown in FIG. 15 and perform filtering using the condition that “the content of the data differs from any data already stored in the transmission data buffer 402”.
- Alternatively, the filtering unit 407 may have the same configuration as that in FIG. 18 and perform filtering using the condition that “the content of the data satisfies a predetermined criterion”.
- Alternatively, the filtering unit 407 may have the same configuration as that in FIG. 22 and perform filtering using the condition that “the data is not a copy of any data already input from the stream data generation unit 401”.
- the filtering unit 407 causes the transmission data buffer 402 to store data that satisfies the condition.
- the switching means 409 is realized by, for example, a CPU of a computer that operates according to the analysis preprocessing program.
- the CPU may operate as the switching unit 409 or other units according to the analysis preprocessing program.
- the switching unit 409 may be realized as a dedicated circuit.
- When the sampling process is selected, the analysis preprocessing system operates in the same manner as in the first or second embodiment (see FIG. 11).
- When the filtering process is selected, the filtering unit 407 performs filtering instead of the sampling process of step S203.
- In that case, the stream data generation unit 401 inputs each piece of data to the filtering unit 407, and the data selection unit 40701 (or the data selection unit 40711 or 40721) of the filtering unit 407 stores the data that satisfies the condition in the transmission data buffer 402 and discards the data that does not.
- According to this embodiment, since either the sampling process or the filtering process is performed on each piece of data cut out by the stream data generation unit 401, overflow of data in the transmission data buffer 402 can be prevented.
- In addition, the method of reducing the number of data items can be switched between sampling and filtering.
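- The switching behaviour of the fourth embodiment can be sketched as a single entry point that routes each record to exactly one of the two reduction methods. The class and member names below (`Mode`, `SwitchedReducer`, `feed`) are assumptions made for illustration, and random sampling at a fixed rate stands in for the sampling unit 406.

```python
import random
from enum import Enum
from typing import Callable, List, Optional


class Mode(Enum):
    SAMPLING = "sampling"
    FILTERING = "filtering"


class SwitchedReducer:
    """Routes each record to sampling or to filtering, never to both."""

    def __init__(self, sampling_rate: float, condition: Callable[[dict], bool],
                 rng: Optional[random.Random] = None) -> None:
        self.mode = Mode.SAMPLING
        self.buffer: List[dict] = []  # stands in for the transmission data buffer 402
        self._sampling_rate = sampling_rate
        self._condition = condition
        self._rng = rng or random.Random()

    def switch(self, mode: Mode) -> None:
        """Apply a switching instruction received from outside."""
        self.mode = mode

    def feed(self, record: dict) -> bool:
        if self.mode is Mode.SAMPLING:
            # random() returns a value in [0.0, 1.0), so a rate of 1.0 keeps everything.
            keep = self._rng.random() < self._sampling_rate
        else:
            keep = self._condition(record)
        if keep:
            self.buffer.append(record)
        return keep
```

- In both modes the kept record goes straight into the buffer, matching the point that in this embodiment the filtering unit stores accepted data in the transmission data buffer rather than handing it to the sampling unit.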
- The embodiments above describe, as an example, the case where the time-series data generation source 1 provided in a probe car generates data and preprocessing such as sampling is performed on that data to create an analysis window.
- Such an analysis window can be used for, for example, generating warning information using a near-miss map in addition to the generation of traffic jam information.
- the present invention can be used for an analysis in which a person possesses a sensor serving as the time-series data generation source 1 and warns the person using a near-miss map.
- the type of data is not limited to the data used for the analysis as described above, and the present invention can be applied to preprocessing for various data to be analyzed.
- The analysis preprocessing system of this embodiment includes the data receiving means 3 and the data stream generation means 4.
- FIG. 26 is a block diagram illustrating a configuration example of the data stream generation unit 4 in the embodiment in which sampling is not performed.
- the data stream generation unit 4 includes a stream data generation unit 401, a transmission data buffer 402, an analysis window generation unit 403, and a stream data transmission unit 404. Each of these means is the same as in the first embodiment.
- the sampling unit 406 is not provided, and the stream data generation unit 401 stores all the extracted data in the transmission data buffer 402.
- the stream data generation unit 401 notifies the analysis window generation unit 403 of, for example, a pointer to the stored memory area as a notification to that effect.
- step S203 (sampling processing) is not performed in the data stream generation step (see step S2, FIG. 11), but the other points are the same as those in the first embodiment.
- the data can be sent to the time-series data analysis means 5 more quickly than when the data is stored as a database or a file.
- FIG. 27 is an explanatory diagram showing the minimum configuration of the present invention.
- the analysis preprocessing system of the present invention includes data acquisition means 71, data cutout means 72, buffer 74, sampling means 73, analysis data determination means 75, and analysis data output means 76.
- Data acquisition means 71 (for example, data reception means 3) acquires a data group generated by a plurality of data generation sources.
- the data cutout unit 72 (for example, the stream data generation unit 401) cuts out individual data from the data group acquired by the data acquisition unit 71.
- the buffer 74 (for example, the transmission data buffer 402) stores data used for analysis.
- Sampling means 73 samples a part of the extracted data and stores it in the buffer 74.
- Analysis data determination means 75 determines an analysis data group (for example, analysis window), which is a set of data used for analysis, from the data stored in the buffer 74.
- the analysis data output means 76 (for example, the stream data transmission means 404) sends the analysis data group to the data analysis means for analyzing the data (for example, the time series data analysis means 5).
- With this configuration, data can be passed at high speed to the means that analyzes it while the buffer is prevented from overflowing.
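- The flow through the minimum configuration (acquisition, cutout, sampling, buffering, determination of the analysis data group, output) can be sketched as one loop. Everything concrete in this sketch is an assumption for illustration: the `cut_out` and `preprocess` names, newline-separated records, random sampling at a fixed rate, and count-based analysis windows.

```python
import random
from typing import Callable, Iterable, List, Optional


def cut_out(data_group: str) -> List[str]:
    """Data cutout: split a received group into individual records (assumed one per line)."""
    return [line for line in data_group.splitlines() if line.strip()]


def preprocess(data_groups: Iterable[str], sampling_rate: float, window_size: int,
               send: Callable[[List[str]], None],
               rng: Optional[random.Random] = None) -> List[str]:
    """Run the pipeline and return whatever is still buffered at the end."""
    rng = rng or random.Random()
    buffer: List[str] = []
    for group in data_groups:                    # data acquisition
        for record in cut_out(group):            # data cutout
            if rng.random() < sampling_rate:     # sampling
                buffer.append(record)            # store in the buffer
            if len(buffer) >= window_size:       # analysis data determination
                send(buffer[:window_size])       # analysis data output
                del buffer[:window_size]         # remove what has been sent
    return buffer
```

- For example, `preprocess(["a\nb\n", "c\nd\ne\n"], 1.0, 2, print)` would pass two windows of two records each to `print` and return the one record that has not yet filled a window.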
- The above embodiments also disclose a configuration in which the sampling means 73 has prediction means (for example, the flow rate monitoring means 40606) for predicting the data amount to be given in the future from the record of the data amount given in each fixed time period, buffer usage measuring means (for example, the transmission data buffer usage measuring means 40607) for measuring the usage amount of the buffer 74, sampling rate calculating means (for example, the sampling rate calculation means 40605) for calculating a sampling rate based on the predicted data amount and the buffer usage amount, and sample extraction means (for example, the sample extraction means 40601) for sampling data in accordance with the sampling rate.
- With this configuration, the sampling rate can be determined dynamically according to the usage amount of the buffer 74 and the predicted data amount.
- The above embodiments also disclose a configuration in which the sampling rate calculation means calculates the free capacity of the buffer 74 from the usage amount of the buffer 74 and calculates sampling data from the relationship between the number of data that can be stored in the free capacity and the predicted data amount.
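- One plausible reading of this relationship is a ratio: the number of records that still fit in the buffer divided by the number of records expected to arrive, capped at 1. The sketch below implements that reading only as an assumption; the exact formula, the fixed `record_size`, and the moving-average predictor in `predict_next` are not taken from this passage.

```python
from typing import Sequence


def predict_next(history: Sequence[float]) -> float:
    """Predict the next period's record count as the mean of past periods (assumed method)."""
    return sum(history) / len(history) if history else 0.0


def sampling_rate(buffer_capacity: int, buffer_used: int,
                  record_size: int, predicted_records: float) -> float:
    """Fraction of incoming records to keep so that they fit in the free space."""
    free = max(buffer_capacity - buffer_used, 0)   # free capacity of the buffer
    storable = free // record_size                 # records that still fit
    if predicted_records <= 0:
        return 1.0                                 # nothing expected: no need to thin out
    return min(1.0, storable / predicted_records)
```

- With a 1000-byte buffer of which 400 bytes are used, 10-byte records, and 120 records predicted, 60 records still fit, so this sketch would keep half of the incoming records.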
- The above embodiments also disclose a configuration in which the sampling means has sampling rate storage means (for example, the sampling rate storage means 40603) for storing a sampling rate input from the outside and sample extraction means (for example, the sample extraction means 40601) for sampling data in accordance with the sampling rate.
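- The above embodiments also disclose a configuration including filtering means (for example, the filtering means 407) that determines, for each piece of data cut out by the data cutout means 72, whether a predetermined condition is satisfied, inputs data that satisfies the predetermined condition to the sampling means 73, and discards data that does not satisfy the predetermined condition.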
- The above embodiments also disclose a configuration in which the filtering means includes reference storage means (for example, the valid data definition means 40713) for storing a criterion indicating that the content included in data is valid, reference determination means (for example, the validity determination means 40712) for determining, for each piece of data cut out by the data cutout means 72, whether the content of the data satisfies the criterion, and data selection means (for example, the data selection means 40711) for discarding data whose content does not satisfy the criterion and inputting data that satisfies the criterion to the sampling means 73.
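- The above embodiments also disclose a configuration in which the filtering means includes data identification information storage means (for example, the processed data storage means 40723) that stores the data identification information of each piece of data input from the data cutout means 72, duplication determination means that, when data is input from the data cutout means 72, determines whether the data identification information of that data is stored in the data identification information storage means and, if it is not stored, stores it there, and data selection means that discards data whose data identification information was determined to have been stored and inputs data whose data identification information was determined not to have been stored to the sampling means 73.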
- The above embodiments also disclose a configuration including filtering means (for example, the filtering means 407) that determines, for each piece of data cut out by the data cutout means 72, whether a predetermined condition is satisfied, stores data that satisfies the predetermined condition in the buffer 74, and discards data that does not satisfy the predetermined condition, and switching means (for example, the switching means 409) that controls whether the data cut out by the data cutout means 72 is input to the sampling means 73 or to the filtering means.
- the above embodiment discloses a configuration in which the analysis data determining means 75 determines a set of data stored in the buffer 74 within a certain period as an analysis data group every certain period.
- The above embodiment also discloses a configuration in which the analysis data determination means 75 determines a set of a predetermined number of data as an analysis data group every time the number of data stored in the buffer 74 reaches the predetermined number.
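- The two ways of determining an analysis data group, by fixed period and by fixed count, can be contrasted in a short sketch. The function names and the `(stored_at_seconds, record)` tuple layout are assumptions; entries are taken to be in the order in which they were stored, so their timestamps do not decrease.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def windows_by_period(entries: Sequence[Tuple[float, str]], period: float) -> List[List[str]]:
    """Group records by the fixed-length period in which each one was stored."""
    return [[record for _, record in group]
            for _, group in groupby(entries, key=lambda e: int(e[0] // period))]


def windows_by_count(records: Sequence[str], count: int) -> List[List[str]]:
    """Form a window each time `count` records have accumulated; leftovers stay buffered."""
    full = len(records) - len(records) % count
    return [list(records[i:i + count]) for i in range(0, full, count)]
```

- Period-based windows vary in size with the arrival rate and are emitted on a fixed schedule, while count-based windows have a fixed size and are emitted at a rate that depends on the traffic. In the period-based function the last group may belong to a period that is still open.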
- the above embodiment discloses a configuration in which the analysis data output means 76 deletes each data belonging to the analysis data group sent to the data analysis means from the buffer 74.
- The above embodiment also discloses a configuration that includes data analysis means for analyzing data, in which the data analysis means holds the analysis data group output by the analysis data output means 76 and deletes each analysis data group once it has been analyzed, thereby performing analysis asynchronously with the analysis data output means 76.
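- Asynchronous operation of this kind is commonly realized with a queue between producer and consumer. The sketch below uses Python's standard `queue` and `threading` modules under that assumption; the `AsyncAnalyzer` class and its methods are illustrative names, not elements of the disclosure.

```python
import queue
import threading
from typing import Callable, List, Sequence


class AsyncAnalyzer:
    """Holds submitted analysis data groups and analyzes them on a worker thread."""

    def __init__(self, analyze: Callable[[List[str]], object]) -> None:
        self._analyze = analyze
        self._pending: queue.Queue = queue.Queue()
        self.results: List[object] = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, window: Sequence[str]) -> None:
        """Called by the output side; returns without waiting for the analysis."""
        self._pending.put(list(window))

    def close(self) -> None:
        """Signal that no more windows will arrive and wait for the worker to finish."""
        self._pending.put(None)
        self._worker.join()

    def _run(self) -> None:
        while True:
            window = self._pending.get()   # the group is held here until analyzed
            if window is None:
                return
            self.results.append(self._analyze(window))
            # The analyzed group is no longer referenced after this point.
```

- Because `submit` only enqueues a copy, the output side can delete the sent data from its own buffer immediately, and the analysis side can fall behind temporarily without blocking it.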
- An analysis preprocessing system comprising: a data acquisition unit that acquires a data group generated by a plurality of data generation sources; a data extraction unit that extracts individual data from the data group acquired by the data acquisition unit; a buffer that stores data used for analysis; a sampling unit that samples a part of the extracted data and stores it in the buffer; an analysis data determination unit that determines, from the data stored in the buffer, an analysis data group that is a set of data used for analysis; and an analysis data output unit that sends the analysis data group to a data analysis unit that analyzes data.
- An analysis preprocessing system in which the sampling unit includes: a prediction unit that predicts the data amount to be given in the future based on a record of the data amount given in each predetermined time period; a buffer usage measurement unit that measures the buffer usage amount; a sampling rate calculation unit that calculates a sampling rate based on the predicted data amount and the buffer usage amount; and a sample extraction unit that samples data according to the sampling rate.
- An analysis preprocessing system in which the sampling rate calculation unit calculates the free space of the buffer from the usage amount of the buffer and calculates sampling data from the relationship between the number of data that can be stored in the free space and the predicted data amount.
- An analysis preprocessing system in which the sampling unit includes a sampling rate storage unit that stores a sampling rate input from the outside and a sample extraction unit that samples data according to the sampling rate.
- An analysis preprocessing system in which the filtering unit has: a content match/mismatch determination unit that determines, for each piece of data cut out by the data cutout unit, whether the condition that the data content differs from any data already stored in the buffer is satisfied; and a data selection unit that discards data that does not satisfy the condition and inputs data that satisfies the condition to the sampling unit.
- An analysis preprocessing system in which the filtering unit has: a reference storage unit that stores a criterion indicating that the content included in data is valid; a reference determination unit that determines, for each piece of data extracted by the data extraction unit, whether the data content satisfies the criterion; and a data selection unit that discards data whose content does not satisfy the criterion and inputs data that satisfies the criterion to the sampling unit.
- An analysis preprocessing system in which the filtering unit has: a data identification information storage unit that stores data identification information of each piece of data input from the data extraction unit; a duplication determination unit that, when data is input from the data extraction unit, determines whether the data identification information of the data is stored in the data identification information storage unit and, if it is not stored, stores the data identification information in the data identification information storage unit; and a data selection unit that discards data whose data identification information was determined to have been stored and inputs data whose data identification information was determined not to have been stored to the sampling unit.
- An analysis preprocessing system comprising: a filtering unit that determines, for each piece of data cut out by the data cutout unit, whether a predetermined condition is satisfied, stores data that satisfies the predetermined condition in the buffer, and discards data that does not; and a switching unit that controls whether the data cut out by the data cutout unit is input to the sampling unit or the filtering unit.
- An analysis preprocessing system comprising a data analysis unit that analyzes data, in which the data analysis unit holds the analysis data group output by the analysis data output unit and deletes each analysis data group after analyzing it, thereby performing analysis asynchronously with the analysis data output unit.
- An analysis preprocessing system comprising: data acquisition means for acquiring a data group generated by a plurality of data generation sources; data cutout means for cutting out individual data from the data group acquired by the data acquisition means; a buffer for storing data used for analysis; sampling means for sampling a part of the cut-out data and storing it in the buffer; analysis data determination means for determining, from the data stored in the buffer, an analysis data group that is a set of data used for analysis; and analysis data output means for sending the analysis data group to data analysis means for analyzing data.
- The present invention is suitably applied to an analysis preprocessing system that preprocesses data collected for analysis.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer And Data Communications (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Description
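Embodiment 1.
FIG. 1 is a block diagram showing an example of the analysis preprocessing system according to the first embodiment of the present invention. The analysis preprocessing system 7 of the present invention includes data receiving means 3, which receives the data generated by the time-series data generation source 1, and data stream generation means 4, which processes the received data and sends it to the time-series data analysis means 5.
FIG. 11 is a flowchart showing an example of the processing progress of the first embodiment of the present invention. It is assumed that a sampling rate has been input to the sampling rate setting means 40602 in advance and that the sampling rate setting means 40602 has stored that sampling rate in the sampling rate storage means 40603.
Like the first embodiment, the analysis preprocessing system according to the second embodiment of the present invention includes the data receiving means 3 and the data stream generation means 4 (see FIG. 1). When it receives the data generated by the time-series data generation source 1 from the data transmission means 2, it preprocesses the data and sends it to the time-series data analysis means 5.
Like the first and second embodiments, the analysis preprocessing system according to the third embodiment of the present invention includes the data receiving means 3 and the data stream generation means 4 (see FIG. 1). When it receives the data generated by the time-series data generation source 1 from the data transmission means 2, it preprocesses the data and sends it to the time-series data analysis means 5.
Like the first, second, and third embodiments, the analysis preprocessing system according to the fourth embodiment of the present invention includes the data receiving means 3 and the data stream generation means 4 (see FIG. 1). When it receives the data generated by the time-series data generation source 1 from the data transmission means 2, it preprocesses the data and sends it to the time-series data analysis means 5.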
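2 Data transmission means
3 Data receiving means
4 Data stream generation means
5 Time-series data analysis means
7 Analysis preprocessing system
401 Stream data generation means
402 Transmission data buffer
403 Analysis window generation means
404 Stream data transmission means
406 Sampling means
407 Filtering means
40601 Sample extraction means
40602 Sampling rate setting means
40603 Sampling rate storage means
40605 Sampling rate calculation means
40606 Flow rate monitoring means
40607 Transmission data buffer usage measuring means
40701 Data selection means
40702 Identity determination means
40711, 40721 Data selection means
40712, 40722 Validity determination means
40713 Valid data definition means
40723 Processed data storage means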
Claims (18)
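- An analysis preprocessing system comprising:
data acquisition means for acquiring a data group generated by a plurality of data generation sources;
data cutout means for cutting out individual data from the data group acquired by the data acquisition means;
a buffer for storing data used for analysis;
sampling means for sampling a part of the cut-out data and storing it in the buffer;
analysis data determination means for determining, from among the data stored in the buffer, an analysis data group that is a set of data used for analysis; and
analysis data output means for sending the analysis data group to data analysis means for analyzing data.
- The analysis preprocessing system according to claim 1, wherein the sampling means samples data at random.
- The analysis preprocessing system according to claim 1 or claim 2, wherein the sampling means has:
prediction means for predicting the data amount to be given in the future from the record of the data amount given in each fixed time period;
buffer usage measuring means for measuring the usage amount of the buffer;
sampling rate calculation means for calculating a sampling rate based on the predicted data amount and the usage amount of the buffer; and
sample extraction means for sampling data in accordance with the sampling rate.
- The analysis preprocessing system according to claim 3, wherein the sampling rate calculation means calculates the free capacity of the buffer from the usage amount of the buffer and calculates sampling data from the relationship between the number of data that can be stored in the free capacity and the predicted data amount.
- The analysis preprocessing system according to claim 1 or claim 2, wherein the sampling means has:
sampling rate storage means for storing a sampling rate input from the outside; and
sample extraction means for sampling data in accordance with the sampling rate.
- The analysis preprocessing system according to any one of claims 1 to 5, comprising filtering means that determines, for each piece of data cut out by the data cutout means, whether a predetermined condition is satisfied, inputs data that satisfies the predetermined condition to the sampling means, and discards data that does not satisfy the predetermined condition.
- The analysis preprocessing system according to claim 6, wherein the filtering means has:
content match/mismatch determination means for determining, for each piece of data cut out by the data cutout means, whether the condition that the content of the data differs from any data already stored in the buffer is satisfied; and
data selection means for discarding data that does not satisfy the condition and inputting data that satisfies the condition to the sampling means.
- The analysis preprocessing system according to claim 6 or claim 7, wherein the filtering means has:
reference storage means for storing a criterion indicating that the content included in data is valid;
reference determination means for determining, for each piece of data cut out by the data cutout means, whether the content of the data satisfies the criterion; and
data selection means for discarding data whose content does not satisfy the criterion and inputting data that satisfies the criterion to the sampling means.
- The analysis preprocessing system according to any one of claims 6 to 8, wherein the filtering means has:
data identification information storage means for storing data identification information of each piece of data input from the data cutout means;
duplication determination means for determining, when data is input from the data cutout means, whether the data identification information of the data is stored in the data identification information storage means and, when it is not stored, storing the data identification information of the data in the data identification information storage means; and
data selection means for discarding data whose data identification information was determined to have been stored in the data identification information storage means and inputting data whose data identification information was determined not to have been stored in the data identification information storage means to the sampling means.
- The analysis preprocessing system according to any one of claims 1 to 5, comprising:
filtering means that determines, for each piece of data cut out by the data cutout means, whether a predetermined condition is satisfied, stores data that satisfies the predetermined condition in the buffer, and discards data that does not satisfy the predetermined condition; and
switching means that controls whether the data cut out by the data cutout means is input to the sampling means or to the filtering means.
- The analysis preprocessing system according to any one of claims 1 to 10, wherein the analysis data determination means determines, every fixed period, the set of data stored in the buffer within that fixed period as an analysis data group.
- The analysis preprocessing system according to any one of claims 1 to 10, wherein the analysis data determination means determines, every time the number of data stored in the buffer reaches a predetermined number, the set of the predetermined number of data as an analysis data group.
- The analysis preprocessing system according to any one of claims 1 to 12, wherein the analysis data output means deletes from the buffer each piece of data belonging to the analysis data group sent to the data analysis means.
- The analysis preprocessing system according to any one of claims 1 to 13, comprising data analysis means for analyzing data,
wherein the data analysis means holds the analysis data group output by the analysis data output means and deletes each analysis data group that has been analyzed, thereby performing analysis asynchronously with the analysis data output means.
- An analysis preprocessing method comprising:
acquiring a data group generated by a plurality of data generation sources;
cutting out individual data from the acquired data group;
sampling a part of the cut-out data and storing it in a buffer;
determining, from among the data stored in the buffer, an analysis data group that is a set of data used for analysis; and
sending the analysis data group to data analysis means for analyzing data.
- The analysis preprocessing method according to claim 15, wherein, when data is sampled, the data is sampled at random.
- An analysis preprocessing program for causing a computer to execute:
a data acquisition process of acquiring a data group generated by a plurality of data generation sources;
a data cutout process of cutting out individual data from the data group acquired in the data acquisition process;
a sampling process of sampling a part of the cut-out data and storing it in a buffer;
an analysis data determination process of determining, from among the data stored in the buffer, an analysis data group that is a set of data used for analysis; and
an analysis data output process of sending the analysis data group to data analysis means for analyzing data.
- The analysis preprocessing program according to claim 17, causing the computer to sample data at random in the sampling process.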
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011500527A JPWO2010095457A1 (ja) | 2009-02-20 | 2010-02-19 | Analysis preprocessing system, analysis preprocessing method and analysis preprocessing program |
US13/148,835 US20110320650A1 (en) | 2009-02-20 | 2010-02-19 | Analysis preprocessing system, analysis preprocessing method and analysis preprocessing program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009038414 | 2009-02-20 | ||
JP2009-038414 | 2009-02-20 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2010095457A1 (ja) | 2010-08-26 |
Family
ID=42633744
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/001106 WO2010095457A1 (ja) | Analysis preprocessing system, analysis preprocessing method and analysis preprocessing program | 2009-02-20 | 2010-02-19 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20110320650A1 (ja) |
JP (1) | JPWO2010095457A1 (ja) |
WO (1) | WO2010095457A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013174940A (ja) * | 2012-02-23 | 2013-09-05 | Fujitsu Ltd | 結合装置、結合プログラムおよび結合方法 |
WO2017017748A1 (ja) * | 2015-07-27 | 2017-02-02 | 株式会社日立製作所 | 計算機システム及びサンプリング方法 |
JP2017161973A (ja) * | 2016-03-07 | 2017-09-14 | 三菱電機インフォメーションネットワーク株式会社 | データ格納装置及びデータ格納プログラム |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004058747A (ja) * | 2002-07-26 | 2004-02-26 | Toyoda Mach Works Ltd | 車両用操舵制御システム |
JP2005006081A (ja) * | 2003-06-12 | 2005-01-06 | Denso Corp | 画像サーバ、画像収集装置、および画像表示端末 |
JP2005149465A (ja) * | 2003-10-21 | 2005-06-09 | Matsushita Electric Ind Co Ltd | 交通情報の生成方法と装置 |
WO2005093688A1 (ja) * | 2004-03-25 | 2005-10-06 | Xanavi Informatics Corporation | ナビゲーション装置の交通情報収集システム |
JP2007241987A (ja) * | 2006-02-07 | 2007-09-20 | Matsushita Electric Ind Co Ltd | 交通情報生成方法及び交通情報生成装置 |
JP2008512662A (ja) * | 2004-09-10 | 2008-04-24 | コタレス・リミテッド | オブジェクトの将来の動きを予測するための装置および方法 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SE8303785L (sv) * | 1983-07-01 | 1985-01-02 | Jan Ludwik Liszka | System for monitoring the operation of a machine |
US7570305B2 (en) * | 2004-03-26 | 2009-08-04 | Euresys S.A. | Sampling of video data and analyses of the sampled data to determine video properties |
JP4949764B2 (ja) * | 2006-08-02 | 2012-06-13 | 株式会社日立ハイテクノロジーズ | Ordering method for automatic analyzer |
US20080103631A1 (en) * | 2006-11-01 | 2008-05-01 | General Electric Company | Method and system for collecting data from intelligent electronic devices in an electrical power substation |
US20100115157A1 (en) * | 2008-11-05 | 2010-05-06 | General Electric Company | Modular data collection module with standard communication interface |
2010
- 2010-02-19 WO PCT/JP2010/001106 patent/WO2010095457A1/ja active Application Filing
- 2010-02-19 JP JP2011500527A patent/JPWO2010095457A1/ja active Pending
- 2010-02-19 US US13/148,835 patent/US20110320650A1/en not_active Abandoned
Non-Patent Citations (2)
Title |
---|
KOJI KIDA ET AL.: "Data-stream Shori ni yoru Daikibo Probe Car System no Kaihatsu to Hyoka", IPSJ SIG NOTES, vol. 2008, no. 83, 3 September 2008 (2008-09-03), pages 1 - 8 * |
NOBUTATSU NAKAMURA ET AL.: "Data-stream Shori Kiban o Mochiita Kosoku Probe Joho Shushu, Bunseki", NEC TECHNICAL JOURNAL, vol. 61, no. 1, 25 January 2008 (2008-01-25), pages 40 - 43 * |
Also Published As
Publication number | Publication date |
---|---|
JPWO2010095457A1 (ja) | 2012-08-23 |
US20110320650A1 (en) | 2011-12-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2010095458A1 (ja) | Analysis preprocessing system, analysis preprocessing method and analysis preprocessing program | |
EP3506103B1 (en) | In-vehicle apparatus and log collection system | |
CN110868336A (zh) | Data management method, device and computer-readable storage medium | |
EP3660615B1 (en) | System and method for monitoring an on-board recording system | |
US20150106342A1 (en) | System and method of detecting cache inconsistencies | |
EP3282643A1 (en) | Method and apparatus of estimating conversation in a distributed netflow environment | |
Huang et al. | Air quality forecast monitoring and its impact on brain health based on big data and the internet of things | |
US11139999B2 (en) | Method and apparatus for processing signals from messages on at least two data buses, particularly CAN buses; preferably in a vehicle; and system | |
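WO2010095457A1 (ja) | Analysis preprocessing system, analysis preprocessing method and analysis preprocessing program | |
CN107357804A (zh) | System and method for analyzing massive logs in internet finance | |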
WO2019113677A1 (en) | Snapshots buffering service | |
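CN114780810B (zh) | Data processing method, device, storage medium and electronic equipment | |
JP6809011B2 (ja) | Device and system for remote monitoring of a control system | |
KR100901696B1 (ko) | Apparatus and method for sampling security events based on the contents of the security events | |
CN113938306B (zh) | Trusted authentication method and system based on data cleaning rules | |
CN113282920B (zh) | Log anomaly detection method, device, computer equipment and storage medium | |
CN108476151A (zh) | System and method for capturing and displaying packets and other messages in a local control network (LCN) | |
WO2010095459A1 (ja) | Analysis preprocessing system, analysis preprocessing method and analysis preprocessing program | |
JP4829194B2 (ja) | Network analysis system | |
CN111506672B (zh) | Method, apparatus, device and storage medium for real-time analysis of environmental protection monitoring data | |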
US20170098010A1 (en) | Data integration apparatus and data integration method | |
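CN117215258A (zh) | Flink-based real-time status monitoring system and method for CNC machine tools | |
CN116614418A (zh) | Server protection method based on a cloud computing platform | |
KR20100098241A (ko) | System and method for analyzing botnet behavior patterns | |
JP5439871B2 (ja) | Data compression method, device, and program |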
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10743586; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | WIPO information: entry into national phase | Ref document number: 13148835; Country of ref document: US |
| WWE | WIPO information: entry into national phase | Ref document number: 2011500527; Country of ref document: JP |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: PCT application non-entry in European phase | Ref document number: 10743586; Country of ref document: EP; Kind code of ref document: A1 |