CN113704085A - Method and device for checking a technical system
- Publication number
- CN113704085A (application CN202110539523.3A)
- Authority
- CN
- China
- Prior art keywords
- classifier
- simulation
- test
- classifiers
- test case
- Prior art date
- Legal status (assumed, not a legal conclusion)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3608—Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01M—TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
- G01M17/00—Testing of vehicles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3684—Test management for test design, e.g. generating new test cases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/10—Geometric CAD
- G06F30/15—Vehicle, aircraft or watercraft design
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- Debugging And Monitoring (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Method and device for checking technical systems, characterized by the following features: obtaining simulation results in a simulation of the system; detecting, in validation attempts on the system, measurement data corresponding to the simulation results; forming classifiers, on the basis of the simulation results and the measurement data, for each of a plurality of catalogs of test cases of the system; classifying further test cases and potentially considered validation measurements by the classifiers according to whether the simulation in the respective further test is reliable, unreliable or not determinable; detecting further test cases and potentially considered validation measurements in which the simulation is unreliable or not determinable; and prioritizing candidates for possible further validation attempts on the basis of the improvement potential of the classifiers associated with the detected further test cases.
Description
Technical Field
The invention relates to a method for checking a technical system. Furthermore, the invention relates to a corresponding device, a corresponding computer program and a corresponding storage medium.
Background
In software engineering, the use of models to automate test activities and to generate test artifacts during testing is summarized under the umbrella term "model-based testing" (MBT). It is well known, for example, to generate test cases from a model that describes the target behavior of the system under test.
Embedded systems in particular rely on deterministic input signals from sensors and in turn stimulate their environment through signals output to a wide variety of actuators. During verification and in early development phases of such a system, a model of the system (model in the loop, MiL), its software (software in the loop, SiL), its processor (processor in the loop, PiL) or the entire hardware (hardware in the loop, HiL) is therefore simulated in a control loop together with a model of the environment. In vehicle technology, simulators based on this principle for checking electronic control units are sometimes referred to, depending on the test phase and the test object, as component test benches, module test benches or integration test benches.
DE10303489A1 discloses a method of this type for testing software of a control unit of a vehicle, a power tool or a robot system, in which a control section controllable by the control unit is at least partially simulated by a test system: output signals are generated by the control unit and transmitted via a first connection to a first hardware module, while signals from a second hardware module are transmitted to the control unit as input signals via a second connection. In addition, the output signals are provided as first control values in the software and are transmitted to the test system in real time via a communication interface separate from the control section.
Such simulations are widespread in various technical fields and are used, for example, to check embedded systems in power tools, motor control units for drive, steering and braking systems, camera systems, systems with artificial-intelligence and machine-learning components, robotic systems or autonomous vehicles for suitability in their early development stages. Nevertheless, owing to a lack of confidence in the reliability of simulation model results, such results have so far been incorporated into release decisions only to a limited extent according to the prior art.
Disclosure of Invention
The invention provides a method for checking a technical system, a corresponding device, a corresponding computer program and a corresponding storage medium according to the independent claims.
The solution according to the invention is based on the recognition that the quality of a simulation model is decisive for the correct predictability of the test results that can be achieved with it. In the MBT domain, the sub-discipline of validation (Validierung) has the task of comparing real measurements with simulation results. Various metrics, measures and other comparators are used for this purpose; they relate the signals to one another and are collectively referred to below as signal metrics (SM). Examples of such signal metrics are metrics that compare magnitude, phase shift or correlation. Some signal metrics are defined in relevant standards, for example ISO 18571.
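Purely by way of illustration (not part of the original disclosure), such signal metrics can be sketched as follows in Python; the sampled signals are assumed to be NumPy arrays, and the function names and example values are chosen freely and do not correspond to ISO 18571:

```python
import numpy as np

def magnitude_error(sim: np.ndarray, meas: np.ndarray) -> float:
    """Relative amplitude deviation between simulated and measured signal."""
    return float(np.linalg.norm(sim - meas) / (np.linalg.norm(meas) + 1e-12))

def correlation(sim: np.ndarray, meas: np.ndarray) -> float:
    """Pearson correlation as a simple shape/phase comparator."""
    return float(np.corrcoef(sim, meas)[0, 1])

# Example: one validation attempt sampled at 500 points
t = np.linspace(0.0, 1.0, 500)
measured = np.sin(2 * np.pi * 5 * t)
simulated = 0.95 * np.sin(2 * np.pi * 5 * t + 0.05)

print(magnitude_error(simulated, measured), correlation(simulated, measured))
```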
Generally, uncertainty quantification techniques support the estimation of simulation and model quality. The result of evaluating the model quality for a specific input X (which may be a parameter or a scenario) using a signal metric or, more generally, an uncertainty quantification method is referred to below as the simulation model error metric (error metric for short), SMerrorX. To generalize (interpolate and extrapolate) SMerrorX to previously unexplored inputs, parameters or scenarios X, a machine learning model, for example based on a Gaussian process, may be used.
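A minimal sketch of such a generalization, assuming the error metric has already been evaluated for a few inputs and using a Gaussian-process regressor from scikit-learn (the one-dimensional parameterization and the numerical values are purely illustrative), could look as follows:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Inputs X (here a single scenario parameter) with the observed error metric SMerrorX
X_train = np.array([[0.1], [0.4], [0.6], [0.9]])
y_train = np.array([0.02, 0.05, 0.20, 0.35])

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_train, y_train)

# Interpolate/extrapolate SMerrorX for previously unexplored inputs; the predictive
# standard deviation is reused below for classification and prioritization.
X_new = np.array([[0.25], [0.75], [1.1]])
err_mean, err_std = gp.predict(X_new, return_std=True)
```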
During verification (Verifizierung), the test object (system under test, SUT) is typically checked against requirements, specifications or performance indicators. It should be noted that Boolean requirements or specifications can generally be converted into quantitative measures by using formalisms such as Signal Temporal Logic (STL). Such a formalism can serve as the basis for quantitative semantics, which generalize verification in that positive values indicate that a requirement is met and negative values indicate that it is violated. In the following, such requirements, specifications or performance metrics are collectively referred to as "quantitative requirements" (QSpec).
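For illustration, the quantitative semantics of a simple requirement can be sketched as follows; the requirement ("the speed never exceeds a limit") and all numerical values are assumed examples, and the robustness of such an "always" requirement reduces to the minimum margin over the trace:

```python
import numpy as np

def robustness_speed_limit(speed: np.ndarray, limit: float) -> float:
    """Quantitative semantics of 'globally, speed <= limit': positive values
    mean the requirement is met, negative values mean it is violated, and the
    magnitude is the margin (robustness)."""
    return float(np.min(limit - speed))

speed_trace = np.array([10.0, 12.5, 13.8, 11.2])        # simulated output
print(robustness_speed_limit(speed_trace, limit=14.0))  # ~0.2 -> satisfied
```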
Such quantitative requirements can be checked on the basis of the real SUT or of a model of the real SUT (i.e., a "virtual SUT"). To perform the verification, catalogs of test cases that the SUT must satisfy are compiled in order to determine whether the SUT has the desired performance and safety characteristics. Such test cases may be parameterized so that they encompass any number of individual tests.
In this context, the proposed solution takes into account the need for reliable (belastbar) test results in order to guarantee the performance and safety characteristics of the SUT. Precisely when tests are performed on the basis of a simulation of the system or of its sub-components, rather than on the real system, it must be ensured that the simulation results are trustworthy.
A significant challenge of model validation is to determine the real measurements, and the stimuli for the corresponding simulations, that provide the greatest amount of information about the accuracy of the model. Furthermore, simulation-based testing also requires a statement of the extent to which the simulated system meets the stated requirements.
Based on the error metrics outlined above, a classifier can be formed which, for a parameterized set of test cases (hereinafter referred to as a "test catalog"), determines the reliability of the simulation for each test case considered and classifies it accordingly. Such a procedure is based on the idea of detecting the test cases that are classified as unreliable and examining them in reality. After these tests have been performed on the system, the knowledge obtained can be used in various ways, for example to update the classifier, to improve the parameterization or to optimize the model. The limitation of this approach arises from the fact that each set of test cases must be classified separately, and all cases that are not reliable are candidates for experimental measurements unless they are used to improve the model, the parameterization or the meta-model.
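A conceivable minimal form of such a classifier, assuming that the Gaussian-process meta-model sketched above supplies a predicted error and an uncertainty for each test case (the tolerance and the confidence factor are freely chosen illustrative values), is:

```python
def classify_test_case(err_mean: float, err_std: float,
                       tol: float = 0.1, conf: float = 2.0) -> str:
    """Classify one test case by whether the predicted simulation error stays
    below a tolerance with sufficient confidence."""
    if err_mean + conf * err_std < tol:
        return "reliable"          # simulation can be trusted for this test case
    if err_mean - conf * err_std > tol:
        return "unreliable"        # a real measurement is required
    return "not determinable"      # candidate for a further validation attempt
```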
Against this background, the solution according to the invention has the advantage of exploiting the knowledge gained from several sets of parameterized test cases. By combining the information provided by multiple classifiers, the number of test cases recommended for real testing is reduced. Furthermore, the proposed solution primarily considers test cases that are classified neither as reliable nor as unreliable. In this way, the number of real tests can be reduced without affecting the model, the parameterization of the model or, where applicable, the parameterization of the validation model or meta-model.
Such checks can be applied in very different fields. One example is the functional safety of automation systems, for instance for automated driving functions (autonomous driving).
Advantageous developments and improvements of the basic idea specified in the independent claims are possible by means of the measures listed in the dependent claims. An automated, computer-implemented test environment can thus be set up to automatically improve the quality of the hardware or software product under test to a great extent.
Drawings
Embodiments of the invention are illustrated in the drawings and are explained in more detail in the following description.
Fig. 1 shows a database for simulation and experimental measurements.
Fig. 2 shows a diagram of the proposed algorithm.
Fig. 3 shows the calculation of the expected improvement in detail.
Fig. 4 shows a variant based on high-accuracy classification.
Fig. 5 shows a variant for calculating the expected improvement.
Fig. 6 schematically shows a workstation.
Detailed Description
Fig. 1 shows a database 27 with a plurality of catalogs 14, 15, i.e. parameterized sets of test cases of the system to be checked. The test cases contained in each catalog 14, 15 are classified according to whether the simulation in the respective test case is reliable 20, unreliable 21 or (in an alternative embodiment) not determinable 22.
It should be noted that this approach is independent of the classifier used for the classification, which can be formed in a variety of ways; different implementation possibilities are referred to in the application examples.
The simulation results 11 obtained in the simulation of the system and the measurement data 13 detected in the corresponding validation attempts 12 on the system are stored in a common database 27. In addition, the database 27 contains candidates 24 both for possible validation experiments 26 and for further tests. All simulated and real tests relate to the same product model. The handling and version control of variants of the tests and simulation results 11 are not the subject of this description.
As shown in fig. 2, classifiers 16, 17 are initially formed 19 for each catalog 14, 15 on the basis of these simulation results 11 and measurement data 13. Further test cases are classified 18 by these classifiers 16, 17 according to whether the simulation in the respective test case is reliable 20, unreliable 21 or not determinable 22. Test cases in which the simulation is unreliable 21 or not determinable 22 are detected in the common database 27.
By comparing 28 the classifiers 16, 17 pairwise on the respective test case examined, the improvement potential 23 of the classifiers 16, 17 associated with that test case can now be calculated. Based on this improvement potential 23, the candidates for possible further validation attempts 12 are prioritized 25. Finally, corresponding new measurements are performed on the system and the classifiers 16, 17 are updated 24 at least partly on the basis of the new measurement data 13. These steps can be repeated as often as desired.
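Purely as an illustrative sketch (not the original implementation), this loop can be summarized as follows; the callables classify and improvement_potential stand for the per-catalog classification and the improvement-potential computation described here, and the budget limits the number of real measurements:

```python
def prioritize_candidates(catalogs, classify, improvement_potential, budget=5):
    """Collect unreliable / not determinable test cases from all catalogs and
    rank them by the improvement potential of the associated classifiers."""
    candidates = []
    for catalog_id, test_cases in catalogs.items():
        for x in test_cases:
            label = classify(catalog_id, x)
            if label in ("unreliable", "not determinable"):
                candidates.append((improvement_potential(catalog_id, x), catalog_id, x))
    # Highest improvement potential first; only the top candidates are measured
    # on the real system and then used to update the classifiers.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:budget]
```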
The prioritization 25 can be carried out in different ways, depending on the characteristics of the classifiers 16, 17 and the resulting uncertainty of the classification 18. According to fig. 3, one of the test cases classified by the first classifier 16 is selected 29 and simulated for this purpose. Based on the output 30 of the simulation, the requirements QSpec placed on the system for this test case are evaluated 31 and the second classifier 17 is updated 32.
The uncertainties 37, 38 in the classification of a second test case corresponding to this test case are then determined both for the original classifier 17 and for the classifier 33 updated in this way. On this basis, the improvement potential 23 of the second classifier 17 can finally be calculated by comparing 28 the uncertainty 37 of the original classifier 17 with the uncertainty 38 of the updated classifier 33. Within this procedure, the potential validation measurement can also serve as the comparison reference, without recourse to the original classifier 17.
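Assuming, as in the sketches above, that the decision feature of a classifier is a Gaussian-process meta-model of the error metric, the comparison of the uncertainties 37, 38 could be sketched as follows; the hypothetical update with the simulated outcome is an illustrative simplification, not the procedure in every detail:

```python
import numpy as np
from sklearn.base import clone

def improvement_potential(gp, X_train, y_train, x_cand, y_sim, X_probe):
    """Hypothetically add the simulated outcome y_sim at candidate x_cand,
    refit a copy of the meta-model and measure how much the predictive
    uncertainty at the probe points would shrink."""
    _, std_before = gp.predict(X_probe, return_std=True)

    gp_updated = clone(gp)
    gp_updated.fit(np.vstack([X_train, x_cand]), np.append(y_train, y_sim))
    _, std_after = gp_updated.predict(X_probe, return_std=True)

    return float(np.mean(std_before - std_after))  # > 0: the update is worthwhile
```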
In the variant shown in fig. 4, instead of investigating the individual test cases, the test cases are divided into groups 34, which are characterized by their information content, in order to calculate the improvement potential 23 by pairwise comparison 28 of the classifiers 16, 17 for these groups 34.
As an alternative to the pairwise comparison 28 of individual test cases or groups 34, the procedure shown in fig. 5 is also conceivable. Here the second classifier 17 is updated 32 by means of the simulated output 30, which is based only on the candidate 24 under consideration, in order to derive the improvement potential 23 of the second classifier 17 from the increase 23 in knowledge acquisition 36 achieved by the update 32 relative to the knowledge acquisition 35 achieved by the original second classifier 17. Only if such an increase 23 is actually to be expected is the test or validation measurement under consideration actually performed on the system.
As already mentioned, the above proposed scheme does not depend on the specific implementation of the classifiers 16, 17. The following gives some examples of how possible improvements in selecting new test points can be calculated.
For example, if the parameters of the classifiers 16, 17 are not known with sufficient accuracy, for instance owing to a limited amount of training data, there may be a range of test cases that can be classified 18 neither as reliable 20 nor as unreliable 21. In these cases, the reliability of the simulation cannot be clearly determined 22 (figs. 3 and 5). Additional data are therefore needed to avoid missed detections and to reduce the uncertainties 37, 38 (fig. 3) of the classifiers 16, 17. To this end, new real measurement data 13 can be collected in order to further train (weiterbilden) 32 and thereby improve the classifiers 16, 17.
In this embodiment, the information content of a new measurement is calculated on the basis of common information metrics. Test cases and validation experiments are prioritized 25 by computing their information content pairwise.
Another variant is that at least one decision feature of the classifiers 16, 17 (for example the error metric SMerrorX) is based on a meta-model determined by means of the simulation results 11. The new real measurement data 13 are used to re-evaluate the classifiers 16, 17, which are initially trained only once. Here it is evaluated at which points in the parameter space or state space additional information can reduce the uncertainty of the meta-model. This is done for all catalogs 14, 15.
The test cases are prioritized 25 on the basis of their respective information content, and the candidates 24 with the highest rating are actually tested. If some of the test catalogs 14, 15 are classified on the basis of consistent features (for example a uniform error metric of the same signal), the classifiers 16, 17 can also be evaluated on the basis of a common meta-model. In this case, if the meta-model can be improved, the evaluation is carried out across all test cases involved.
As shown in the schematic diagram in fig. 6, the method 10 may be implemented in a workstation 40, for example in software or hardware or in a mixed form of software and hardware.
Claims (11)
1. Method (10) for checking a technical system, in particular an at least partially autonomous robot or vehicle,
it is characterized by the following features:
obtaining a simulation result (11) in a simulation of the system,
in a validation attempt (12) on the system, measurement data (13) corresponding to the simulation result (11) are detected,
forming (19) classifiers (16, 17) for a plurality of catalogs (14, 15) of test cases of the system, respectively, on the basis of the simulation results (11) and the measurement data (13),
classifying (18) the further test cases and the potentially considered validation measurements by the classifiers (16, 17) according to whether the simulation in the respective further test is reliable (20), unreliable (21) or not determinable (22),
detecting further test cases and potentially considered validation measurements in which the simulation is unreliable (21) or not determinable (22), and
prioritizing (25) candidates (24) for possible further validation attempts (26) based on an improvement potential (23) of the classifiers (16, 17) associated with the detected further test cases (21, 22).
2. The method (10) of claim 1,
it is characterized by the following features:
the simulation results (11) and the measurement data (13) are stored in a common database (27),
the candidates (24) and the stimuli and parameters (28) of the corresponding simulations are added to the database (27).
3. The method (10) according to claim 1 or 2,
it is characterized by the following features:
the improvement potential (23) is calculated by comparing (28) the classifiers (16, 17) in pairs in each detected test case.
4. The method (10) according to claim 1 or 2,
it is characterized by the following features:
selecting (29) a first test case among the test cases classified by a first classifier (16) of said classifiers (16, 17),
generating an output (30) by simulation of the first test case,
evaluating (31) requirements made to the system for the first test case based on the output (30) and updating (32) a second classifier (17) of the classifiers (16, 17),
for the original second classifier (17) and the updated second classifier (33), respectively, an uncertainty in the classification (18) of a second test case corresponding to the first test case is determined, and
calculating an improvement potential (23) of the second classifier (17) by comparing the uncertainty (37) of the original second classifier (17) and the uncertainty (38) of the updated second classifier (33).
5. The method (10) according to claim 1 or 2,
it is characterized by the following features:
dividing the other test cases (21, 22) detected into groups (34), and
calculating the improvement potential (23) by comparing (28) the classifiers (16, 17) pairwise for the group (34).
6. The method (10) according to claim 1 or 2,
it is characterized by the following features:
selecting (29) a test case among the test cases classified by a first classifier (16) of the classifiers (16, 17),
generating an output (30) by simulation of the test case,
evaluating (31) requirements made to the system for the test case based on the output (30) and updating (32) a second classifier (17) of the classifiers (16, 17),
for the original second classifier (17) and the updated second classifier (33), knowledge acquisitions (35, 36) are determined, respectively, and
deriving an improvement potential (23) of the second classifier (17) from an increase of knowledge acquisition (36) achieved by the updated second classifier (33) relative to knowledge acquisition (35) achieved by the original second classifier (17).
7. The method (10) according to claim 1 or 2,
it is characterized by the following features:
possible measurements are selected from among the potentially considered validation measurements that have not yet been performed,
an output (30) is generated by simulation of the associated stimulus,
evaluating (31) requirements placed on the system for the test case based on the output (30) and updating (32) the associated classifier (16, 17),
for the original classifier (17) and the updated classifier (33), knowledge acquisitions (35, 36) are determined, respectively, and
deriving an improvement potential (23) of the classifier (17) from an increase of knowledge acquisition (36) achieved by the updated classifier (33) relative to knowledge acquisition (35) achieved by the original classifier (17).
8. The method (10) according to any one of claims 1 to 7, characterized in that errors of the system identified by the check are automatically corrected.
9. A computer program arranged to perform the method (10) according to any one of claims 1 to 8.
10. A machine readable storage medium having stored thereon a computer program according to claim 9.
11. An apparatus (60) arranged to perform the method (10) according to any one of claims 1 to 8.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102020206327.3A DE102020206327A1 (en) | 2020-05-20 | 2020-05-20 | Method and device for testing a technical system |
DE102020206327.3 | 2020-05-20 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113704085A | 2021-11-26 |
Family
ID=78408518
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110539523.3A Pending CN113704085A (en) | 2020-05-20 | 2021-05-18 | Method and device for checking a technical system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113704085A (en) |
DE (1) | DE102020206327A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102021208759A1 (en) | 2021-08-11 | 2023-02-16 | Robert Bosch Gesellschaft mit beschränkter Haftung | Method of training a machine learning algorithm to generate test specifications for testing a software-based system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE10303489A1 (en) | 2003-01-30 | 2004-08-12 | Robert Bosch Gmbh | Motor vehicle control unit software testing, whereby the software is simulated using a test system that at least partially simulates the control path of a control unit |
- 2020
- 2020-05-20: DE application DE102020206327.3A filed; published as DE102020206327A1 (status: pending)
- 2021
- 2021-05-18: CN application CN202110539523.3A filed; published as CN113704085A (status: pending)
Also Published As
Publication number | Publication date |
---|---|
DE102020206327A1 (en) | 2021-11-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113590456A (en) | Method and device for checking a technical system | |
US10997711B2 (en) | Appearance inspection device | |
US11397660B2 (en) | Method and apparatus for testing a system, for selecting real tests, and for testing systems with machine learning components | |
US20200409823A1 (en) | Method and apparatus for optimal distribution of test cases among different testing platforms | |
CN110069414A (en) | Regression testing method and system | |
CN113553689A (en) | Method and device for simulating a technical system | |
CN113704085A (en) | Method and device for checking a technical system | |
CN113590458A (en) | Method and device for checking a technical system | |
CN115270902A (en) | Method for testing a product | |
CN113722207A (en) | Method and device for checking technical systems | |
US20210026999A1 (en) | Method and device for validating a simulation of a technical system | |
US11416371B2 (en) | Method and apparatus for evaluating and selecting signal comparison metrics | |
US11262738B2 (en) | Device and method for measuring, simulating, labeling and evaluating components and systems of vehicles | |
CN113590459A (en) | Method and device for checking a technical system | |
CN114840397A (en) | Method and device for testing a technical system | |
Shaout et al. | Automotive embedded systems-model based approach review. | |
CN113704086A (en) | Method and device for checking a technical system | |
EP4401016A1 (en) | Method for generating and training a system model, selecting a controller, system, computer-system | |
CN113704084A (en) | Method and device for checking a technical system | |
US20220245308A1 (en) | Method and system for carrying out a simulation | |
EP4400975A1 (en) | Method for training a system model, selecting a controller, system, computer-system | |
Shaout et al. | Model based Approach for Automotive Embedded Systems | |
CN117370168B (en) | Method for setting simulation restoration point of logic system design and related equipment | |
US12130698B2 (en) | Information processing system for abnormal event determination from time-series data, and methods and programs for operating the same | |
EP4191462A1 (en) | Method and system for quantifying vehicle requirement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |