WO2022180681A1 - Data generation system, data generation method, and data generation program - Google Patents
- Publication number
- WO2022180681A1 (PCT/JP2021/006863)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- market
- test
- complementary
- integrated
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07C—TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
- G07C5/00—Registering or indicating the working of vehicles
- G07C5/008—Registering or indicating the working of vehicles communicating information to a remotely located station
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
Definitions
- the present invention relates to a data generation system, a data generation method, and a data generation program that generate new data by linking a plurality of data.
- Patent Literature 1 describes a predictive diagnosis device that diagnoses signs of abnormalities in equipment.
- the device described in Patent Literature 1 obtains speed, external environment, acceleration, GPS data, and the like from sensors installed in the automobile. Then, the driving data immediately after the vehicle is shipped is used as so-called teacher data in the normal state of the vehicle.
- Non-Patent Document 1 describes a model-free analysis technique that can accurately determine the state of a system by comparing features mechanically extracted from time-series data.
- Patent Literature 2 describes invariant analysis that automatically extracts relationships between sensors based on time-series data from a plurality of sensors based on machine learning.
- the data obtained in the tests during development can be said to be highly accurate data due to the accuracy and variety of sensor values.
- because of development costs, there are limits to the test period and the test patterns that can be considered.
- a data generation system according to the present invention generates new data using market data collected from mass-produced vehicles and test data used for testing vehicles in the development stage. The system comprises: feature extraction means for extracting features of at least one of the market data and the test data; data selection means for selecting one or more pieces of the other data that include features corresponding to the features of the extracted one of the data; complementary data calculation means for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and integrated data generation means for generating integrated data in which the calculated complementary data is integrated into at least one or both of the market data and the test data.
- a data generation method according to the present invention generates new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage.
- in the method, a computer extracts features of at least one of the market data and the test data; the computer selects one or more pieces of the other data that include features corresponding to the features of the extracted one of the data; the computer calculates, from the one data and the selected other data, complementary data that complements the market data or the test data; and the computer generates integrated data by integrating the calculated complementary data into at least one or both of the market data and the test data.
- a data generation program according to the present invention is applied to a computer that generates new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage.
- the program causes the computer to execute: feature extraction processing for extracting features of at least one of the market data and the test data; data selection processing for selecting one or more pieces of the other data that include features corresponding to the features of the extracted one of the data; complementary data calculation processing for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and integrated data generation processing for generating integrated data in which the calculated complementary data is integrated into at least one or both of the market data and the test data.
- highly accurate test data can be created so as to increase the coverage of test patterns while suppressing development costs.
- FIG. 1 is a block diagram showing a configuration example of an embodiment of a data generation system according to the present invention.
- FIG. 2 is a flowchart showing an operation example of the data generation system.
- FIG. 3 is an explanatory diagram showing an example of learning data.
- FIG. 4 is an explanatory diagram showing an example of market data.
- A further figure is a block diagram showing an overview of a data generation system according to the present invention.
- A further figure is a schematic block diagram showing a configuration of a computer according to at least one embodiment.
- FIG. 1 is a block diagram showing a configuration example of one embodiment of a data generation system according to the present invention.
- the data generation system 100 of this embodiment includes a storage unit 10, a market data acquisition unit 20, a feature extraction unit 30, a data selection unit 40, a complementary data calculation unit 50, and an integrated data generation unit 60.
- a storage unit 10 for storing market data and test data
- a market data acquisition unit 20 for acquiring market data
- a feature extraction unit 30 for extracting features of the data
- a data selection unit 40 for selecting data with corresponding features
- a complementary data calculation unit 50 for calculating complementary data, and an integrated data generation unit 60 for generating integrated data.
- the storage unit 10 stores various types of information used for processing by the data generation system 100 of this embodiment. Specifically, the storage unit 10 stores data collected from mass-produced vehicles (hereinafter referred to as market data) and data used for vehicle testing in the development stage (hereinafter referred to as test data).
- a mass-produced vehicle is a vehicle that has completed the development stage and is mass-produced for sale to the market, and is a vehicle that is actually operated and driven by consumers.
- market data is data obtained from mass-produced vehicles, so it is possible to collect a large amount of normal data.
- the amount of defect data acquired from mass-produced vehicles is generally small.
- the amount of test data is smaller than that of market data from the viewpoint of development costs (for example, confirmed tests should not be performed multiple times).
- the accuracy of test data is generally high because many types of sensors are installed in the test vehicles and the data is collected in an environment where it can be gathered reliably.
- Test data can also be created for each test, such as unit test, integration test, and running test.
- the accuracy of market data is generally lower than that of test data because the types of sensors installed in mass-produced vehicles are fewer than at the time of development, and data may be missing depending on communication conditions.
- market data includes telematics data sent from connected cars as data that is constantly collected, and data that is collected at specific times and is extracted from the ECU (Engine Control Unit) in the event of a failure.
- Abbreviations: ECU (Engine Control Unit), DTC (Diagnostic Trouble Code).
- the driving data is time-series data of sensor values obtained from various vehicle parts, such as OBC (On-Board Charger) / CAN (Controller Area Network) bus data, GPS (Global Positioning System) data, and telematics data.
- the image data is an image captured by a drive recorder (for example, a forward image).
- failure data includes failure reports (failure parts, details, causes, countermeasures, etc.) when the vehicle is brought to a dealer.
- the acquired video and failure data may not always be linked, but there are cases where a link can be established by associating the video with driving data.
- As a method of associating the video with the travel data there is a method of tagging the video.
- one of the characteristics of test data is that the number and accuracy of the sensors used for testing are high, and that the driving conditions and other circumstances at the time of data acquisition are easy to record.
- on the other hand, because time and resources are limited, it is difficult to cover all conditions with test data.
- problems may occur in terms of comprehensiveness in the integration test and running test.
- test data includes the same data as the driving data, and includes more items with higher precision than the market data.
- as video data, it is possible to obtain not only the forward video captured by the drive recorder, but also video captured by a multidirectional camera and in-vehicle video.
- test data is often created based on test specifications that are complete from the perspective of test scenarios.
- Items included in the test specification include version (model number), individual number, inspection target (unit (part), combination (assembly), integration (vehicle)), inspection viewpoint (function / non-function), preconditions (conditions of other parts, driving environment, etc.), test procedures (control inputs, load inputs), expected results (normal/defective), judgment criteria (thresholds, etc.), judgment results (OK/NG), and other items (judgment reason, exception reason, etc.).
- the storage unit 10 stores market data acquired by the market data acquisition unit 20, which will be described later. Note that the storage unit 10 may store market data acquired and created by other methods. The storage unit 10 also stores test data created by a designer or the like.
- the market data acquisition unit 20 acquires market data collected from mass-produced vehicles and stores it in the storage unit 10.
- the market data acquisition unit 20 may acquire, for example, travel data and video data transmitted from a connected car having a communication function.
- the market data acquisition unit 20 may improve the quality of the acquired market data by performing data cleansing such as conversion into codes and deletion of outliers.
- the feature extraction unit 30 extracts features of at least one of market data and test data. That is, the feature extractor 30 may extract the features of market data or the features of test data.
- An object of the present invention is to make use of the respective advantages of market data and test data, and to supplement information lacking in one data with the other data, thereby creating more accurate data.
- the feature extraction unit 30 extracts the features of market data from the market data. It should be noted that the following processing can be similarly applied when extracting features from test data.
- the feature extractor 30 may, for example, extract the data item itself representing the feature of the data from the market data. Examples of such data items include vehicle individual numbers and vehicle types. Alternatively, the feature extraction unit 30 may calculate the correlation between data items indicating numerical data such as velocity and acceleration, and extract the correlation between the data items as a feature amount.
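The two kinds of features described above (data items used as-is, and a correlation between numeric items) can be sketched as follows. This is a minimal illustration only: the record layout, the field names, and the choice of the Pearson coefficient as the correlation measure are assumptions, not details given in the publication.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def extract_features(record):
    """Hypothetical feature set: identifying items plus one correlation value."""
    return {
        "vehicle_id": record["vehicle_id"],      # data item used directly as a feature
        "vehicle_type": record["vehicle_type"],  # data item used directly as a feature
        "speed_accel_corr": pearson(record["speed"], record["acceleration"]),
    }

# Illustrative values only.
record = {
    "vehicle_id": "V001",
    "vehicle_type": "sedan",
    "speed": [10.0, 12.0, 14.0, 16.0],
    "acceleration": [1.0, 2.0, 3.0, 4.0],
}
features = extract_features(record)
```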
- the feature extraction unit 30 may calculate a feature amount by synthesizing the relationship between the values of the sensors mounted on the vehicle and the relationship between the time-series changes in those values, and extract the feature amount as a feature of a plurality of data.
- the feature extraction unit 30 may calculate such a feature amount using the model-free analysis technique described in Non-Patent Document 1, for example.
- the method by which the feature extraction unit 30 extracts features from data is not limited to the above method.
- the feature extraction unit 30 may extract the log pattern itself as a feature.
- the feature extraction unit 30 may use an invariant analysis technique as described in Patent Literature 2 to extract relationships between past sensor data as features.
- the data selection unit 40 selects one or more pieces of data that contain features corresponding to the features of the extracted piece of data. For example, when features are extracted from market data, the data selector 40 selects one or more pieces of test data that match or are similar to the extracted features. On the other hand, when features are extracted from the test data, the data selector 40 selects one or more pieces of market data that match or are similar to the extracted features.
- the method by which the data selection unit 40 selects data is not particularly limited, and any method can be used as long as it enables selection of data with matching or similar characteristics.
- the data selection unit 40 may predetermine items to be compared between the market data and the test data, and select data whose contents match or are within a predetermined range. For example, when the individual number, the vehicle type, and the correlation value of numerical data as described above are defined as items to be compared, the data selection unit 40 may select data matching or similar to these items.
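A minimal sketch of selection by predetermined comparison items is shown below. It assumes an exact match on the vehicle type and a fixed tolerance on a correlation value; the field names and the tolerance of 0.1 are hypothetical and not taken from the publication.

```python
def is_similar(market_features, test_features, tolerance=0.1):
    """Exact match on the categorical item, tolerance on the numeric correlation."""
    if market_features["vehicle_type"] != test_features["vehicle_type"]:
        return False
    diff = abs(market_features["corr"] - test_features["corr"])
    return diff <= tolerance

def select_test_data(market_features, test_records, tolerance=0.1):
    """Return every test record whose compared items match or fall within range."""
    return [t for t in test_records if is_similar(market_features, t, tolerance)]

# Illustrative values only.
market = {"vehicle_type": "sedan", "corr": 0.82}
tests = [
    {"id": "T1", "vehicle_type": "sedan", "corr": 0.80},
    {"id": "T2", "vehicle_type": "sedan", "corr": 0.40},
    {"id": "T3", "vehicle_type": "truck", "corr": 0.82},
]
selected = select_test_data(market, tests)
```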
- the data selection unit 40 may perform predetermined weighting on the features of one data before comparing the features of the data.
- the data selection unit 40 calculates a weight value according to the degree of deterioration (for example, a weight value that greatly changes the extracted features as the distance traveled or the travel time increases). Then, the corresponding data may be selected by multiplying the features by the calculated weights and comparing them. Note that the method of calculating the weight value is arbitrary, and may be determined in advance according to the properties of the items.
- the data selection unit 40 may determine, for example, to set a weight of 0.8 for a certain feature of data acquired from a vehicle that has been running for 10 years.
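The weighting step can be illustrated with a small sketch. The publication only gives the example of a weight of 0.8 for a vehicle that has been running for 10 years; the linear rule used here to reach that value, and the nearest-value comparison, are assumptions made for illustration.

```python
def degradation_weight(years_in_service):
    """Hypothetical linear rule: 1.0 for a new vehicle, 0.8 after 10 years."""
    return max(0.0, 1.0 - 0.02 * years_in_service)

def closest_test_value(market_value, years_in_service, test_values):
    """Weight the market feature, then pick the test feature nearest to it."""
    weighted = market_value * degradation_weight(years_in_service)
    return min(test_values, key=lambda v: abs(v - weighted))

# Illustrative values only.
weight = degradation_weight(10)
best = closest_test_value(50.0, 10, [38.0, 41.0, 49.0])
```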
- when a log pattern is extracted as a feature, the data selection unit 40 may compare the pattern with the market data log or the test data log and select the corresponding data. Further, for example, when the relationship between past sensor data described above is extracted as a feature, the data selection unit 40 may compare the log of market data or the log of test data with the feature and select the corresponding data.
- a correspondence table for judging matching or similarity between items to be compared may be determined in advance, and the data selection unit 40 may select data whose content for the items to be compared is defined in the correspondence table. Further, when the relationship of chronological changes is extracted as a feature, the data selection unit 40 may select a plurality of data corresponding to the feature.
- the data selection unit 40 may narrow down and select test data of situations similar to the acquired market data from the selected test data.
- Test data of similar situations include, for example, test data with similar sensor values and test data with similar front shot images.
- by narrowing down in this way, for data that is not measured in the acquired market data but is included in the selected test data (for example, rear-shot video), test data that is closer to the market data can be used as complementary data.
- the complementary data calculation unit 50 calculates data (hereinafter referred to as complementary data) that complements market data or test data from the data whose features were extracted (one data) and the selected data (the other data).
- Complementary data here refers not only to data that fills in missing data in either or both of the market data and the test data, but also to refined versions of existing data and to new data generated to shorten the time interval of the data.
- the first mode is a mode of supplementing missing market data items with test data. A specific example will be described below.
- the data selection unit 40 selects test data similar to the characteristics indicating the driving scene of the market data.
- the complementary data calculation unit 50 may identify missing items in the market data, extract the items closest to the missing items from the items of the selected test data, and generate complementary data.
- the complementary data calculation unit 50 may calculate data to be complemented, for example, using values before and after test data collected in time series (for example, calculating an average value). In addition, the complementary data calculation unit 50 may calculate the data to be complemented by using a loss complementing method such as the multiple imputation method. Further, the complementary data calculation unit 50 may calculate data to be complemented under the same conditions, using data at similar points such as time, speed, data tendency, and the like. Further, as described above, the complementary data calculation unit 50 may generate the integrated data after correcting the data according to the degree of deterioration such as the travel distance. These methods can be similarly used in the methods exemplified below.
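The simplest of the methods listed above, averaging the values before and after a gap in a time series, can be sketched as follows. The series and the use of `None` as the missing-value marker are illustrative assumptions; multiple imputation and the other methods mentioned are not shown.

```python
def fill_missing_with_neighbors(values):
    """Replace None entries that have numeric neighbors on both sides with their average."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is None and 0 < i < len(values) - 1:
            before, after = values[i - 1], values[i + 1]
            if before is not None and after is not None:
                filled[i] = (before + after) / 2
    return filled

# Illustrative time series with one missing sample.
series = [10.0, 12.0, None, 16.0, 18.0]
completed = fill_missing_with_neighbors(series)
```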
- in this way, the complementary data calculation unit 50 may extract the data of items missing in one of the data (e.g., market data) from the selected other data (e.g., test data) and calculate complementary data that complements the missing items.
- a second specific example is a method of complementing missing items in market data based on other correlations (for example, correlations of other sensors).
- the data selector 40 selects test data similar to the features extracted using a technique such as the model-free analysis described above, for example.
- the complementary data calculation unit 50 may generate complementary data that complements missing items in the market data from the correlation of the selected test data.
- the data selection unit 40 selects data containing similar features from past test data. Then, the complementary data calculation unit 50 extracts data according to the section to be complemented, normalizes the data, biases the detected data, etc., and calculates complementary data using the processed data.
- the data selection unit 40 selects data that is relevant to the data to be complemented. Then, the complementary data calculation unit 50 calculates complementary data by predicting data to be complemented using the relationship from the selected data.
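Predicting a missing item from a relationship found in the selected data can be sketched with an ordinary least-squares line. The choice of a linear model, and the speed and engine-speed items used here, are assumptions for illustration; the publication does not specify the form of the relationship.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x

# Selected test data contains both items (illustrative values only).
test_speed = [0.0, 10.0, 20.0, 30.0]
test_rpm = [800.0, 1300.0, 1800.0, 2300.0]
slope, intercept = fit_line(test_speed, test_rpm)

# Market data contains speed only; the missing item is predicted from the relationship.
market_speed = [5.0, 15.0]
predicted_rpm = [slope * s + intercept for s in market_speed]
```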
- the complementary data calculation unit 50 may generate complementary data by combining knowledge from the knowledge base.
- the second aspect is an aspect that uses test data to improve the accuracy of market data. For example, there is a method of using test data to generate integrated data at 0.1-second intervals from market data collected at 1-second intervals. Also in this case, the data selection unit 40 selects, for example, test data similar to the features indicating the driving scene of the market data, and the complementary data calculation unit 50 may use the selected test data to generate new data that shortens the time interval of each data.
- in this way, the complementary data calculation unit 50 may calculate, from the selected other data (e.g., test data), complementary data at a time interval shorter than that at which the one data (e.g., market data) was collected.
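One possible way to realize the second aspect is sketched below: a fine-grained test segment covering the same interval is shifted so that its endpoints coincide with two consecutive market samples. This endpoint-matching rule is an assumption chosen for illustration; the publication does not state how the finer-interval values are derived.

```python
def densify_segment(market_start, market_end, test_segment):
    """Shift a fine-grained test segment so its endpoints match two market samples.

    test_segment holds the test values covering the same interval, endpoints included.
    """
    last = len(test_segment) - 1
    start_offset = market_start - test_segment[0]
    end_offset = market_end - test_segment[-1]
    result = []
    for k, value in enumerate(test_segment):
        ratio = k / last
        # Blend the two endpoint offsets linearly along the segment.
        result.append(value + start_offset * (1 - ratio) + end_offset * ratio)
    return result

# Market data: two samples 1 second apart. Test data: 11 samples at 0.1-second intervals.
test_segment = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5, 23.0, 23.5, 24.0, 24.5, 25.0]
dense = densify_segment(21.0, 26.0, test_segment)
```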
- the complementary data calculation unit 50 may change the method of calculating complementary data according to the characteristics of the data used. For example, assume that the data to be used is classified into normal data and abnormal data. Since abnormal data should arguably be emphasized more than normal data, the complementary data calculation unit 50 may calculate the complementary data for abnormal data in more detail than the complementary data for normal data.
- for example, the type of the data to be calculated may be chosen to represent more detailed information (for example, int type for normal data and double type for abnormal data), or the time interval of the created data may be shortened (for example, 1-second intervals for normal data and 0.1-second intervals for abnormal data).
- the integrated data generation unit 60 generates data (hereinafter referred to as integrated data) by integrating the calculated complementary data with at least one or both of the market data and the test data.
- the integrated data generation unit 60 may generate integrated data in which the missing part is filled by integrating the complementary data with the market data. Further, for example, when new data is generated as complementary data so as to shorten the time interval of each data, the integrated data generation unit 60 may insert the generated data into the existing market data to generate integrated data with short intervals.
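Both integration cases (filling a missing part and inserting newly generated samples) can be sketched with one merge keyed by timestamp. The row layout with a `t` key and the rule that existing market values are never overwritten are assumptions made for this illustration.

```python
def integrate(market_rows, complementary_rows):
    """Merge complementary values into market rows keyed by timestamp.

    Existing market values are kept; only missing (None or absent) items are filled.
    """
    by_time = {row["t"]: dict(row) for row in market_rows}
    for comp in complementary_rows:
        target = by_time.setdefault(comp["t"], {"t": comp["t"]})
        for key, value in comp.items():
            if target.get(key) is None:
                target[key] = value
    return [by_time[t] for t in sorted(by_time)]

# Illustrative values only: one missing speed and one new intermediate sample.
market_rows = [{"t": 0.0, "speed": 10.0}, {"t": 1.0, "speed": None}, {"t": 2.0, "speed": 14.0}]
complementary_rows = [{"t": 0.5, "speed": 11.0}, {"t": 1.0, "speed": 12.0}]
integrated = integrate(market_rows, complementary_rows)
```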
- the market data acquisition unit 20, the feature extraction unit 30, the data selection unit 40, the complementary data calculation unit 50, and the integrated data generation unit 60 are realized by a computer processor (e.g., a CPU (Central Processing Unit)) that operates according to a program.
- for example, the program may be stored in the storage unit 10, and the processor may read the program and, according to the program, operate as the market data acquisition unit 20, the feature extraction unit 30, the data selection unit 40, the complementary data calculation unit 50, and the integrated data generation unit 60. Also, the functions of the data generation system 100 may be provided in a SaaS (Software as a Service) format.
- the market data acquisition unit 20, the feature extraction unit 30, the data selection unit 40, the complementary data calculation unit 50, and the integrated data generation unit 60 may each be realized by dedicated hardware. Also, part or all of each component of each device may be implemented by general-purpose or dedicated circuitry, processors, etc., or combinations thereof. These may be composed of a single chip, or may be composed of multiple chips connected via a bus. A part or all of each component of each device may be implemented by a combination of the above-described circuits and the like and programs.
- when part or all of each component of the data generation system 100 is realized by a plurality of information processing devices, circuits, etc., the plurality of information processing devices, circuits, etc. may be centrally arranged or distributed.
- for example, the information processing devices, circuits, and the like may be implemented in a form in which each is connected via a communication network, such as a client-server system or a cloud computing system.
- FIG. 2 is a flowchart showing an operation example of the data generation system 100 of this embodiment.
- the market data acquired by the market data acquisition unit 20 and the test data created by the designer or the like are stored in the storage unit 10.
- the feature extraction unit 30 extracts features of at least one of market data and test data (step S11).
- the data selection unit 40 selects one or more pieces of data including features corresponding to the features of one piece of data (step S12).
- the complementary data calculation unit 50 calculates complementary data for complementing market data or test data from one data and the selected other data (step S13).
- the integrated data generation unit 60 then generates integrated data by integrating the calculated complementary data with at least one or both of the market data and the test data (step S14).
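Steps S11 to S14 can be strung together in a compact sketch. Every function body here is a deliberately simple stand-in (mean speed as the feature, nearest feature as the selection rule, missing keys as the complementary data); these choices and the record layout are assumptions, not the method of the publication.

```python
from statistics import mean

def extract_feature(record):
    """Step S11: a single hypothetical feature, the mean speed."""
    return mean(record["speed"])

def select_other(feature, candidates):
    """Step S12: choose the candidate whose feature is closest."""
    return min(candidates, key=lambda c: abs(extract_feature(c) - feature))

def calculate_complement(record, selected):
    """Step S13: items present in the selected data but absent from the record."""
    return {k: v for k, v in selected.items() if k not in record}

def generate_integrated(record, complement):
    """Step S14: merge the complementary items into the record."""
    return {**record, **complement}

# Illustrative values only.
market = {"speed": [10.0, 12.0, 14.0]}
tests = [
    {"speed": [30.0, 32.0, 34.0], "rear_camera": "clip_a"},
    {"speed": [11.0, 12.0, 13.0], "rear_camera": "clip_b"},
]
feature = extract_feature(market)
selected = select_other(feature, tests)
integrated = generate_integrated(market, calculate_complement(market, selected))
```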
- as described above, in this embodiment, the feature extraction unit 30 extracts the features of at least one of the market data and the test data, and the data selection unit 40 selects one or more of the other data that contain features corresponding to the features of the one data. Then, the complementary data calculation unit 50 calculates complementary data that complements the market data or the test data from the one data and the selected other data, and the integrated data generation unit 60 generates integrated data by integrating the calculated complementary data with at least one or both of the market data and the test data. Therefore, highly accurate test data can be created so as to increase the coverage of test patterns while suppressing development costs.
- an analysis device analyzes the characteristics of the market data.
- the analyzer analyzes, for example, the slope and average value of the data (for example, the average value of the slopes of the X and Y coordinates, etc.) and indexes these features.
- the analysis device uses accumulated learning data to train the feature extraction engine.
- the analysis device uses a trained feature extraction engine to generate binary feature data from the learning data.
- the generated feature data is stored in the storage unit 10.
- FIG. 3 is an explanatory diagram showing an example of learning data.
- Data d1 and data d2 illustrated in FIG. 3 are part of the test data collected in time series during the driving test. For example, when the model-free analysis technique is used, binary data [0100] is generated as feature data from data d1, binary data [1001] is generated as feature data from data d2, and both are stored. Note that this binary data is an example.
- FIG. 4 is an explanatory diagram showing an example of market data.
- the market data d3 exemplified in FIG. 4 has some data missing for some reason, and the data d32 is "None". Also, compared to the test data illustrated in FIG. 3, the market data illustrated in FIG. 4 does not include the X-axis velocity and the Y-axis velocity.
- the feature extraction unit 30 extracts features from market data.
- the feature extraction unit 30 may calculate the slope and average value of the data as described above from the market data and extract them as features. Further, the feature extraction unit 30 may extract binary format feature data from the market data illustrated in FIG. 4 using the feature extraction engine. For example, when the model-free analysis technique is used, the feature of the data d31 portion is converted to [0100], and the feature of the data d33 portion is converted to [1000].
- the data selection unit 40 selects test data to be used for calculating complementary data. Specifically, the data selection unit 40 matches the characteristics of the extracted market data with the characteristics of the test data, and selects test data with the highest degree of similarity. For example, when the average of the inclination of the X coordinate and the average of the inclination of the Y coordinate of each test data are calculated as a feature, the data selection unit 40 may select the test data having the closest inclination.
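Selection by closest inclination can be sketched as follows. The average of successive differences is used as the inclination, and squared Euclidean distance over the X and Y inclinations as the closeness measure; both choices, along with the record names, are assumptions for illustration.

```python
def mean_slope(values):
    """Average of successive differences (slope per sample step)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)

def slope_feature(record):
    """Feature pair: average slope of the X coordinate and of the Y coordinate."""
    return (mean_slope(record["x"]), mean_slope(record["y"]))

def closest_by_slope(market_record, test_records):
    """Return the test record whose slope feature is nearest to the market record's."""
    mx, my = slope_feature(market_record)

    def distance(test_record):
        tx, ty = slope_feature(test_record)
        return (tx - mx) ** 2 + (ty - my) ** 2

    return min(test_records, key=distance)

# Illustrative values only.
market = {"x": [0.0, 1.0, 2.0], "y": [0.0, 0.5, 1.0]}
tests = [
    {"name": "test_a", "x": [0.0, 1.1, 2.0], "y": [0.0, 0.4, 1.0]},
    {"name": "test_b", "x": [0.0, 3.0, 6.0], "y": [0.0, -1.0, -2.0]},
]
best = closest_by_slope(market, tests)
```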
- for example, when binary feature data is used, the data selection unit 40 may select the test data whose feature matches the binary data [0100] extracted from the data d31 as the test data to be used for calculating the complementary data.
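Matching on binary feature data can be sketched using the example bit patterns given in the text. The Hamming distance used to rank non-identical patterns is an assumption; the publication only states that matching or similar features are selected, so the pairing found for d33 reflects this assumed measure, not a result stated in the source.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def select(feature, candidates):
    """Name of the candidate whose bit pattern is closest to the given feature."""
    return min(candidates, key=lambda name: hamming(candidates[name], feature))

# Bit patterns quoted from the examples in the text.
test_features = {"d1": "0100", "d2": "1001"}
market_features = {"d31": "0100", "d33": "1000"}

matches = {name: select(bits, test_features) for name, bits in market_features.items()}
```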
- the complementary data calculation unit 50 calculates complementary data. For example, the complementary data calculation unit 50 may select two points from the data d31 and use the data of the selected two points as the complementary data, or may calculate the average of the two points and use it as the complementary data. Further, the complementary data calculation unit 50 may extract items (X-axis speed and Y-axis speed) that do not exist in the market data from the test data and use them as complementary data. In this way, the complementary data calculation unit 50 may complement the market data by using the data before and after the missing data of the market data and the data of similar parts of the test data.
- the integrated data generation unit 60 generates integrated data by integrating the calculated complementary data.
- as a first application example of the data generation system of this embodiment, there is an application in which a plurality of test data that match the features of the target market data are selected and the items to be complemented are calculated. Specifically, when the market data acquisition unit 20 acquires market data collected from mass-produced vehicles, the data selection unit 40 selects a plurality of test data that match the features extracted by the feature extraction unit 30.
- the complementary data calculation unit 50 extracts data corresponding to items to be complemented (for example, inclination, correlation, etc.) from the test data.
- Complementary data calculation unit 50 calculates a value to be supplemented (for example, average value, median value, mode value, etc.) from the extracted data.
- the integrated data generation unit 60 generates integrated data by integrating the calculated values with the market data.
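The first application example can be sketched as below: one item is extracted from each selected test record, reduced to a single value by mean, median, or mode, and merged into the market record. The dictionary layout, the function names, and the sample values are illustrative assumptions.

```python
from statistics import mean, median, mode

# The description lists average, median, and mode as candidate statistics.
REDUCERS = {"mean": mean, "median": median, "mode": mode}

def complement_item(selected_tests, item, how="median"):
    """Reduce one item across the selected test records to a single value."""
    values = [record[item] for record in selected_tests]
    return REDUCERS[how](values)

def integrate(market_record, complements):
    """Return a new record: market data plus the complemented items."""
    return {**market_record, **complements}

# Hypothetical sample data.
selected_tests = [{"inclination": 0.5}, {"inclination": 1.0}, {"inclination": 3.0}]
market_record = {"speed": 42.0}

value = complement_item(selected_tests, "inclination", how="median")
integrated = integrate(market_record, {"inclination": value})
```

The median is used as the default here only because it is less sensitive to a single outlying test run; the description does not prefer one statistic over another.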
- the feature extraction unit 30 extracts features used to identify the specified situation.
- the data selection unit 40 selects multiple pieces of market data that match the features extracted by the feature extraction unit 30 .
- the complementary data calculation unit 50 calculates representative data from the plurality of selected market data.
- Methods of calculating representative data include, for example, a method of using statistical data such as the median value, average value, and mode of each item, and a method of randomly specifying data.
- the complementary data calculation unit 50 calculates a value to be complemented, as in the first application example, and the integrated data generation unit 60 integrates the calculated value with the market data to generate integrated data.
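The two ways of obtaining representative data mentioned above (a per-item statistic or a random pick) can be sketched as follows. The record layout and the function name `representative` are assumptions; the median stands in for any of the statistics named in the description.

```python
import random
from statistics import median

def representative(records, method="median", rng=None):
    """Build one representative record from several market records.

    method="median": take the per-item median across all records.
    method="random": pick one of the records at random.
    """
    if method == "random":
        return dict((rng or random).choice(records))
    items = records[0].keys()
    return {key: median(record[key] for record in records) for key in items}

# Hypothetical market records sharing the same situation.
records = [
    {"speed": 40.0, "steer": 0.1},
    {"speed": 50.0, "steer": 0.3},
    {"speed": 90.0, "steer": 0.2},
]
rep = representative(records)
picked = representative(records, method="random", rng=random.Random(0))
```

Note that the per-item median may combine values from different records, so the result need not equal any single observed record, whereas the random method always returns an observed one.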
- a third application example is the application of making market data more detailed (rich).
- a sensor that was installed in the vehicle during testing may be removed from the production vehicle to reduce costs.
- a mass-produced vehicle is equipped with a sensor and a front camera for automatic driving.
- the vehicle during the test may be equipped with not only the sensor and the front camera, but also a rear camera for testing automatic driving.
- the integrated data generation unit 60 integrates, with the market data, data from test data of similar situations that is not included in the market data.
- the integrated data generation unit 60 integrates the video of the rear camera of the test data with the data of the market vehicle. This makes it easier to understand the driving conditions of mass-produced vehicles, making it possible to improve analysis accuracy. For example, since it is possible to create test data for simulation that shows a virtual surrounding situation, it is also possible to use this test data as a learning video for video analysis AI (Artificial Intelligence).
- a fourth application example is an application example that reinforces the test data scenario.
- a scenario that could not be executed with test data is simulated using market data. For example, it is possible to create a new test scenario by extracting market data related to unexecuted test scenarios and constructing test data.
- FIG. 5 is a block diagram showing an overview of the data generation system according to the invention.
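- the data generation system 80 (for example, the data generation system 100) generates new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage. It includes: feature extracting means 81 (for example, the feature extraction unit 30) for extracting data features from at least one of the market data and the test data; data selection means 82 (for example, the data selection unit 40) for selecting one or more of the other data containing features corresponding to the features of the extracted one data (for example, market data); complementary data calculation means 83 (for example, the complementary data calculation unit 50) for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and integrated data generating means 84 (for example, the integrated data generation unit 60) for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.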
- the feature extraction means 81 may extract the features of the market data from the market data, the data selection means 82 may select a plurality of test data corresponding to the features of the extracted market data, the complementary data calculation means 83 may calculate complementary data that complements the market data from the selected plurality of test data, and the integrated data generating means 84 may integrate the calculated complementary data with the market data to generate integrated data.
- the data selection means 82 may further select test data of situations similar to the market data from the selected test data.
- the feature extracting means 81 may calculate a feature amount by synthesizing the relationship between the values of the sensors mounted on the vehicle and the relationship with the time-series changes in the values of those sensors, and may extract the calculated feature amount as a feature of the plurality of data.
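The description does not say how the two relationships are synthesized. One possible reading, shown here purely as an assumption, pairs the correlation between two sensors' raw values with the correlation between their step-to-step changes. The function names and the sensor series are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def diffs(values):
    """Time-series change: difference between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]

def combined_feature(sensor_a, sensor_b):
    """Assumed feature: (correlation of values, correlation of their changes)."""
    return (
        pearson(sensor_a, sensor_b),
        pearson(diffs(sensor_a), diffs(sensor_b)),
    )

# Hypothetical sensor series: the second is exactly twice the first.
speed = [0.0, 1.0, 3.0, 6.0]
pedal = [0.0, 2.0, 6.0, 12.0]
feature = combined_feature(speed, pedal)
```

Because the second series is a scaled copy of the first, both components of the feature are close to 1; unrelated sensors would give values nearer to 0.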
- the data selection means 82 may also select corresponding data by calculating a weighting value according to the degree of deterioration, multiplying the feature by the calculated weighting, and comparing them.
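A minimal sketch of deterioration-weighted selection follows. The linear weight `1.0 - degree`, the choice to apply the weight to the test-side features, and all names and sample values are assumptions; the description only states that a weight is calculated from the degree of deterioration, multiplied into the feature, and used in the comparison.

```python
def deterioration_weight(degree):
    """Hypothetical linear weight: 1.0 for a new vehicle, smaller with more wear."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree must be within [0, 1]")
    return 1.0 - degree

def select_with_weight(market_feature, degree, test_features):
    """Pick the test record whose weighted feature is closest to the market one."""
    weight = deterioration_weight(degree)

    def distance(name):
        weighted = [weight * value for value in test_features[name]]
        return sum((a - b) ** 2 for a, b in zip(weighted, market_feature))

    return min(test_features, key=distance)

# Hypothetical data: the market vehicle shows 20% deterioration.
market_feature = [0.8, 1.6]
test_features = {"t1": [1.0, 2.0], "t2": [0.8, 1.6]}
chosen = select_with_weight(market_feature, 0.2, test_features)
```

Without the weight, `t2` would be the nearest record; with it, `t1` is chosen because its features, scaled down for wear, coincide with the market feature.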
- the complementary data calculation means 83 may extract data of items missing in one of the data from the other selected data to calculate complementary data that complements the market data or test data.
- the complementary data calculation means 83 may calculate, from the selected other data, complementary data at time intervals shorter than those at which the one data was collected.
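One way to obtain complementary data at a finer time interval is sketched below, under the assumption that the selected test data was sampled more densely than the market data and that every market timestamp also appears in the test data. Gaps are filled with the test curve shifted to pass through the preceding market sample; this shifting rule and the millisecond integer timestamps are illustrative choices, not taken from the description.

```python
def densify(market, test):
    """Return a fine-grained series built from coarse market data.

    market: {time_ms: value} sampled at coarse intervals.
    test:   {time_ms: value} sampled at fine intervals (superset of market times).
    Market samples are kept as they are; times between them take the test value
    shifted so that the curve passes through the preceding market sample.
    """
    result = {}
    anchor = None  # most recent timestamp that has a real market sample
    for t in sorted(test):
        if t in market:
            anchor = t
            result[t] = market[t]
        elif anchor is not None:
            result[t] = market[anchor] + (test[t] - test[anchor])
    return result

# Hypothetical data: market sampled every 1000 ms, test every 500 ms.
market = {0: 10.0, 1000: 14.0}
test = {0: 8.0, 500: 10.0, 1000: 12.0}
dense = densify(market, test)
```

In this example the value at 500 ms becomes 12.0: the market value at 0 ms (10.0) plus the change observed in the test data between 0 ms and 500 ms (2.0).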
- FIG. 6 is a schematic block diagram showing the configuration of a computer according to at least one embodiment.
- a computer 1000 comprises a processor 1001 , a main storage device 1002 , an auxiliary storage device 1003 and an interface 1004 .
- the data generation system 80 described above is implemented in the computer 1000 .
- the operation of each processing unit described above is stored in the auxiliary storage device 1003 in the form of a program (data generation program).
- the processor 1001 reads out the program from the auxiliary storage device 1003, develops it in the main storage device 1002, and executes the above processing according to the program.
- the secondary storage device 1003 is an example of a non-transitory tangible medium.
- Other examples of non-transitory tangible media include magnetic disks, magneto-optical disks, CD-ROMs (Compact Disc Read-Only Memory), DVD-ROMs (Digital Versatile Disc Read-Only Memory), and semiconductor memories connected via the interface 1004.
- the computer 1000 receiving the distribution may develop the program in the main storage device 1002 and execute the above process.
- the program may be for realizing part of the functions described above.
- the program may be a so-called difference file (difference program) that implements the above-described functions in combination with another program already stored in the auxiliary storage device 1003 .
- a data generation system that generates new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage, a feature extracting means for extracting features of at least one of the market data and the test data; data selection means for selecting one or more of the other data containing features corresponding to the features of the extracted one data; Complementary data calculation means for calculating complementary data that complements the market data or the test data from the one data and the selected other data; and integrated data generating means for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
- Appendix 2 The data generation system according to Appendix 1, wherein the feature extraction means extracts the features of the market data from the market data, the data selection means selects a plurality of test data corresponding to the features of the extracted market data, the complementary data calculation means calculates complementary data that complements the market data from the selected plurality of test data, and the integrated data generating means generates integrated data by integrating the calculated complementary data with the market data.
- Appendix 3 The data generation system according to Appendix 2, wherein the data selection means further selects test data of situations similar to the market data from the selected test data.
- Appendix 4 The data generation system according to any one of Appendices 1 to 3, wherein the feature extracting means calculates a feature amount by synthesizing the relationship between the values of the sensors mounted on the vehicle and the relationship with the time-series changes in the values of those sensors, and extracts the calculated feature amount as a feature of the plurality of data.
- Appendix 5 The data generation system according to any one of Appendices 1 to 4, wherein the data selection means calculates a weight value corresponding to the degree of deterioration, multiplies the features by the calculated weight, and compares them to select corresponding data.
- Appendix 6 The data generation system according to any one of Appendices 1 to 5, wherein the complementary data calculation means extracts data of items missing in one of the data from the other selected data, and calculates complementary data that complements the market data or the test data.
- Appendix 7 The data generation system according to any one of Appendices 1 to 6, wherein the complementary data calculation means calculates, from the selected other data, complementary data at a time interval shorter than that at which the one data was collected.
- Appendix 9 The data generation method according to Appendix 8, wherein the computer extracts the features of the market data from the market data, selects a plurality of test data corresponding to the features of the extracted market data, calculates complementary data that complements the market data from the selected plurality of test data, and generates integrated data by integrating the calculated complementary data with the market data.
- a program storage medium that stores a data generation program applied to a computer that generates new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage.
- a program storage medium for storing a data generation program for realizing integrated data generation processing for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
- the present invention is suitably applied to a data generation system that links multiple pieces of data to generate new data.
- the present invention can be applied to solutions using linked data.
- solutions using linked data include failure sign detection, failure cause identification, deterioration prediction, and defect prediction. It is also possible to contribute to the development of simulators by collecting data under various environments and generating simulator data. In addition, by generating data based on aging deterioration data and failure data of market vehicles, it is possible to feed back these data to development.
- storage unit; 20 market data acquisition unit; 30 feature extraction unit; 40 data selection unit; 50 complementary data calculation unit; 60 integrated data generation unit; 100 data generation system
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Data Mining & Analysis (AREA)
- General Business, Economics & Management (AREA)
- Economics (AREA)
- Game Theory and Decision Science (AREA)
- Marketing (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Medical Informatics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mathematical Physics (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Traffic Control Systems (AREA)
Description
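feature extraction means for extracting, from at least one of the market data and the test data, features of that data;
data selection means for selecting one or more pieces of the other data that contain features corresponding to the extracted features of the one data;
complementary data calculation means for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and
integrated data generation means for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
A data generation system characterized by comprising the above means.
The data selection means selects a plurality of test data corresponding to the extracted features of the market data,
the complementary data calculation means calculates, from the selected plurality of test data, complementary data that complements the market data, and
the integrated data generation means generates integrated data in which the calculated complementary data is integrated with the market data.
The data generation system according to Appendix 1.
The data generation system according to Appendix 2.
The data generation system according to any one of Appendices 1 to 3.
The data generation system according to any one of Appendices 1 to 4.
The data generation system according to any one of Appendices 1 to 5.
The data generation system according to any one of Appendices 1 to 6.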
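A computer extracts, from at least one of the market data and the test data, features of that data,
the computer selects one or more pieces of the other data that contain features corresponding to the extracted features of the one data,
the computer calculates, from the one data and the selected other data, complementary data that complements the market data or the test data, and
the computer generates integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
A data generation method characterized by the above.
The computer selects a plurality of test data corresponding to the extracted features of the market data,
the computer calculates, from the selected plurality of test data, complementary data that complements the market data, and
the computer generates integrated data in which the calculated complementary data is integrated with the market data.
The data generation method according to Appendix 8.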
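causing the computer to realize: feature extraction processing for extracting, from at least one of the market data and the test data, features of that data;
data selection processing for selecting one or more pieces of the other data that contain features corresponding to the extracted features of the one data;
complementary data calculation processing for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and
integrated data generation processing for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
A program storage medium that stores a data generation program for realizing the above processing.
In the feature extraction processing, the features of the market data are extracted from the market data,
in the data selection processing, a plurality of test data corresponding to the extracted features of the market data are selected,
in the complementary data calculation processing, complementary data that complements the market data is calculated from the selected plurality of test data, and
in the integrated data generation processing, integrated data in which the calculated complementary data is integrated with the market data is generated.
The program storage medium according to Appendix 10, which stores a data generation program for the above.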
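causing the computer to realize: feature extraction processing for extracting, from at least one of the market data and the test data, features of that data;
data selection processing for selecting one or more pieces of the other data that contain features corresponding to the extracted features of the one data;
complementary data calculation processing for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and
integrated data generation processing for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
A data generation program for realizing the above processing.
In the feature extraction processing, the features of the market data are extracted from the market data,
in the data selection processing, a plurality of test data corresponding to the extracted features of the market data are selected,
in the complementary data calculation processing, complementary data that complements the market data is calculated from the selected plurality of test data, and
in the integrated data generation processing, integrated data in which the calculated complementary data is integrated with the market data is generated.
The data generation program according to Appendix 12.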
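20 market data acquisition unit
30 feature extraction unit
40 data selection unit
50 complementary data calculation unit
60 integrated data generation unit
100 data generation system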
Claims (11)
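- A data generation system that generates new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage, the data generation system comprising:
feature extraction means for extracting, from at least one of the market data and the test data, features of that data;
data selection means for selecting one or more pieces of the other data that contain features corresponding to the extracted features of the one data;
complementary data calculation means for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and
integrated data generation means for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
- The data generation system according to claim 1, wherein
the feature extraction means extracts the features of the market data from the market data,
the data selection means selects a plurality of test data corresponding to the extracted features of the market data,
the complementary data calculation means calculates, from the selected plurality of test data, complementary data that complements the market data, and
the integrated data generation means generates integrated data in which the calculated complementary data is integrated with the market data.
- The data generation system according to claim 2, wherein the data selection means further selects, from among the selected test data, test data of a situation similar to the market data.
- The data generation system according to any one of claims 1 to 3, wherein the feature extraction means calculates a feature amount that synthesizes the relationship among the values of the sensors mounted on the vehicle and the relationship with the time-series changes in the values of those sensors, and extracts the calculated feature amount as a feature of a plurality of data.
- The data generation system according to any one of claims 1 to 4, wherein the data selection means calculates a weight value according to a degree of deterioration, and selects corresponding data by multiplying the features by the calculated weight and comparing them.
- The data generation system according to any one of claims 1 to 5, wherein the complementary data calculation means extracts, from the selected other data, data of items missing from the one data, and calculates complementary data that complements the market data or the test data.
- The data generation system according to any one of claims 1 to 6, wherein the complementary data calculation means calculates, from the selected other data, complementary data at a time interval shorter than that at which the one data was collected.
- A data generation method for generating new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage, wherein
a computer extracts, from at least one of the market data and the test data, features of that data,
the computer selects one or more pieces of the other data that contain features corresponding to the extracted features of the one data,
the computer calculates, from the one data and the selected other data, complementary data that complements the market data or the test data, and
the computer generates integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
- The data generation method according to claim 8, wherein
the computer extracts the features of the market data from the market data,
the computer selects a plurality of test data corresponding to the extracted features of the market data,
the computer calculates, from the selected plurality of test data, complementary data that complements the market data, and
the computer generates integrated data in which the calculated complementary data is integrated with the market data.
- A program storage medium that stores a data generation program applied to a computer that generates new data using market data collected from mass-produced vehicles and test data used for vehicle testing in the development stage, the data generation program causing the computer to realize:
feature extraction processing for extracting, from at least one of the market data and the test data, features of that data;
data selection processing for selecting one or more pieces of the other data that contain features corresponding to the extracted features of the one data;
complementary data calculation processing for calculating, from the one data and the selected other data, complementary data that complements the market data or the test data; and
integrated data generation processing for generating integrated data in which the calculated complementary data is integrated with at least one or both of the market data and the test data.
- The program storage medium according to claim 10, which stores a data generation program for causing the computer to:
extract, in the feature extraction processing, the features of the market data from the market data,
select, in the data selection processing, a plurality of test data corresponding to the extracted features of the market data,
calculate, in the complementary data calculation processing, complementary data that complements the market data from the selected plurality of test data, and
generate, in the integrated data generation processing, integrated data in which the calculated complementary data is integrated with the market data.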
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/275,949 US20240119469A1 (en) | 2021-02-24 | 2021-02-24 | Data generation system, data generation method, and data generation program |
PCT/JP2021/006863 WO2022180681A1 (ja) | 2021-02-24 | 2021-02-24 | データ生成システム、データ生成方法およびデータ生成プログラム |
JP2023501710A JPWO2022180681A1 (ja) | 2021-02-24 | 2021-02-24 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/006863 WO2022180681A1 (ja) | 2021-02-24 | 2021-02-24 | データ生成システム、データ生成方法およびデータ生成プログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022180681A1 true WO2022180681A1 (ja) | 2022-09-01 |
Family
ID=83047843
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/006863 WO2022180681A1 (ja) | 2021-02-24 | 2021-02-24 | データ生成システム、データ生成方法およびデータ生成プログラム |
Country Status (3)
Country | Link |
---|---|
US (1) | US20240119469A1 (ja) |
JP (1) | JPWO2022180681A1 (ja) |
WO (1) | WO2022180681A1 (ja) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002215646A (ja) * | 2001-01-22 | 2002-08-02 | Nec Corp | 欠損データ補完方法及び欠損データ補完システム |
US20200193324A1 (en) * | 2018-12-14 | 2020-06-18 | The Regents Of The University Of Michigan | System and method for unifying heterogenous datasets using primitives |
-
2021
- 2021-02-24 WO PCT/JP2021/006863 patent/WO2022180681A1/ja active Application Filing
- 2021-02-24 JP JP2023501710A patent/JPWO2022180681A1/ja active Pending
- 2021-02-24 US US18/275,949 patent/US20240119469A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
JPWO2022180681A1 (ja) | 2022-09-01 |
US20240119469A1 (en) | 2024-04-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102375452B (zh) | 改善故障代码设定和隔离故障的事件驱动的数据挖掘方法 | |
US10713866B2 (en) | Vehicle operation data collection apparatus, vehicle operation data collection system, and vehicle operation data collection method | |
JPH10510385A (ja) | ソフトウエア品質のアーキテクチャに基づく分析のための方法およびシステム | |
CN113010389A (zh) | 一种训练方法、故障预测方法、相关装置及设备 | |
CN113343461A (zh) | 自动驾驶车辆的仿真方法、装置、电子设备及存储介质 | |
CN110633905A (zh) | 智能车云平台可靠性计算方法 | |
Groh et al. | Towards a scenario-based assessment method for highly automated driving functions | |
CN117057142B (zh) | 一种基于数字孪生的车辆测试数据处理方法及系统 | |
WO2022180681A1 (ja) | データ生成システム、データ生成方法およびデータ生成プログラム | |
US20220327042A1 (en) | Method for testing a product | |
CN115480944A (zh) | 车载娱乐终端的黑屏故障分析方法、装置、车辆及介质 | |
US11262738B2 (en) | Device and method for measuring, simulating, labeling and evaluating components and systems of vehicles | |
CN115248993A (zh) | 一种仿真场景模型真实性检测方法、装置及存储介质 | |
CN113987751A (zh) | 一种方案筛选方法、装置、电子设备及存储介质 | |
CN113704085A (zh) | 用于检查技术系统的方法和设备 | |
Cao et al. | Application oriented testcase generation for validation of environment perception sensor in automated driving systems | |
Kalkar et al. | Machine Learning Based Instrument Cluster Inspection Using Camera | |
CN110553849A (zh) | 一种行车状况评估系统以及评估方法 | |
US10157166B2 (en) | Method and system for measuring the performance of a diagnoser | |
US20230005308A1 (en) | Fault sign detection device, fault sign detection system, fault sign method, and fault sign detection program | |
WO2021111727A1 (ja) | 故障診断装置、故障診断システム、故障診断方法および故障診断プログラム | |
US20230234603A1 (en) | Information processing system, information processing method, and program | |
CN118013402A (zh) | 模型训练方法、异常数据识别方法、装置、设备和介质 | |
US20240037022A1 (en) | Method and system for the analysis of test procedures | |
US20220147875A1 (en) | Removing less informative samples in sequential data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21927784; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2023501710; Country of ref document: JP; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 18275949; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21927784; Country of ref document: EP; Kind code of ref document: A1 |