CN118037121B - Importance scoring method, device, equipment and medium for situation awareness data - Google Patents

Info

Publication number
CN118037121B
Authority
CN
China
Prior art keywords
data
situation awareness
awareness data
target
importance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410219180.6A
Other languages
Chinese (zh)
Other versions
CN118037121A (en
Inventor
黎广宇
李林
王太伟
王俊文
刘力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Waterborne Transport Research Institute
Original Assignee
China Waterborne Transport Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The application provides an importance scoring method, device, equipment and medium for situation awareness data, which can be used in the technical field of situation awareness. In the method, after the situation awareness data to be scored, target situation awareness data samples and the importance reference scores of the target situation awareness data samples are obtained, text conversion is performed on the data values in the situation awareness data to be scored and in the target situation awareness data samples to obtain updated situation awareness data to be scored and updated target situation awareness data; the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference scores are filled into a large language model prompt word template to obtain a large language model prompt word; the large language model prompt word is then input into a preset large language model to obtain the importance scores of the situation awareness data to be scored in the scoring dimension. By determining the importance reference scores of the target situation awareness data samples and combining them with a large language model to determine the importance scores of the situation awareness data to be scored, the method and the device effectively improve scoring efficiency.

Description

Importance scoring method, device, equipment and medium for situation awareness data
Technical Field
The application relates to the technical field of situation awareness, in particular to a method, a device, equipment and a medium for scoring importance of situation awareness data.
Background
With the development of science and technology, situation awareness systems are applied to harbor scenes, power monitoring scenes, network defense scenes and other scenes, situation awareness data can be processed, so that alarming of abnormal conditions is achieved, and safety is improved. When an alarm occurs, a user needs to check situation awareness data, however, the quantity of the situation awareness data is huge, the user cannot check all the situation awareness data, and only the situation awareness data with larger importance can be checked, so that importance scoring is needed for the situation awareness data.
In the prior art, the importance of situation awareness data is scored either by turning the situation awareness data into a questionnaire and scoring the questionnaire, or by having a plurality of experts score it through interview-style discussion.
In summary, the existing situation awareness data importance scoring method adopts a manual mode to score, so that the scoring efficiency is low.
Disclosure of Invention
The embodiment of the application provides an importance scoring method, device, equipment and medium for situation awareness data, which are used for solving the problem that the existing importance scoring method for situation awareness data adopts a manual mode to score, so that the scoring efficiency is low.
In a first aspect, an embodiment of the present application provides a method for scoring importance of situational awareness data, including:
obtaining scoring dimension selected by a user and a plurality of situation awareness data to be scored;
Acquiring a plurality of target situation awareness data samples corresponding to the scoring dimension from a database, and importance reference scores of each target situation awareness data sample in the scoring dimension;
Performing text conversion on the data values of the data items in the multiple to-be-scored situation awareness data and each target situation awareness data sample to obtain updated to-be-scored situation awareness data and updated target situation awareness data;
Filling the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference score into a preset large language model prompt word template to obtain a large language model prompt word;
Inputting the large language model prompt words into a preset large language model to obtain importance scores of each situation awareness data to be scored in the scoring dimension.
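The last two steps above (filling the prompt word template and querying the model) can be sketched as follows. This is a minimal illustration only: the template wording, the function names `format_record`, `build_prompt` and `score_records`, and the `llm` callable are all hypothetical, since the source does not disclose the actual template or model interface.

```python
from typing import Callable

# Hypothetical template wording; the source does not disclose the actual prompt.
PROMPT_TEMPLATE = (
    "Scoring dimension: {dimension}\n"
    "Reference samples and their importance reference scores:\n"
    "{examples}\n"
    "Give an importance score for each record below:\n"
    "{records}\n"
)


def format_record(record: dict[str, str]) -> str:
    """Render one record as 'item: value; item: value'."""
    return "; ".join(f"{item}: {value}" for item, value in record.items())


def build_prompt(
    dimension: str,
    samples: list[dict[str, str]],
    reference_scores: list[float],
    records: list[dict[str, str]],
) -> str:
    """Fill the template with text-converted samples, their scores and the records."""
    examples = "\n".join(
        f"- {format_record(sample)} => {score}"
        for sample, score in zip(samples, reference_scores)
    )
    listed = "\n".join(
        f"{index}. {format_record(record)}"
        for index, record in enumerate(records, start=1)
    )
    return PROMPT_TEMPLATE.format(dimension=dimension, examples=examples, records=listed)


def score_records(
    dimension: str,
    samples: list[dict[str, str]],
    reference_scores: list[float],
    records: list[dict[str, str]],
    llm: Callable[[str], list[float]],
) -> list[float]:
    """Build the prompt and pass it to the model; `llm` is a stand-in callable."""
    return llm(build_prompt(dimension, samples, reference_scores, records))
```

Passing the model as a callable keeps the sketch independent of any particular large language model service; parsing the model's raw reply into numeric scores is assumed to happen inside that callable.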
In a specific embodiment, the text conversion is performed on the data values of the data items in the multiple to-be-scored situational awareness data and each target situational awareness data sample to obtain updated to-be-scored situational awareness data and target situational awareness data, including:
For each data item in the situation awareness data to be scored, if the data value of the data item is numeric, determining the text corresponding to the data value of the data item according to the correspondence between the preset classification value ranges corresponding to the data item and text;
replacing the data value of the data item with the text to obtain updated situation awareness data to be scored;
For each data item in each target situation awareness data sample, if the data value of the data item is numeric, determining the text corresponding to the data value of the data item according to the correspondence between the preset data values corresponding to the data item and text;
And replacing the data value of the data item with the text to obtain an updated target situation awareness data sample.
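The text conversion described above amounts to a range lookup per data item. The sketch below assumes half-open ranges `[low, high)` and invents the range boundaries and labels for illustration; the names `MAPPING`, `to_text` and `convert_record` are hypothetical and the actual preset correspondence is not given here.

```python
# Each entry is (low, high, text) and covers the half-open range [low, high).
# All boundaries and labels below are assumptions made for illustration.
MAPPING = {
    "vehicle model": [(0, 1, "car"), (1, 2, "loader"), (2, 3, "truck")],
    "vehicle speed": [
        (0, 10, "very slow"),
        (10, 20, "slow"),
        (20, 40, "fast"),
        (40, float("inf"), "very fast"),
    ],
    "peripheral sources of risk": [
        (0, 1, "none"),
        (1, 3, "few"),
        (3, 8, "many"),
        (8, float("inf"), "very many"),
    ],
}


def to_text(value, ranges):
    """Return the text of the first range containing value, else the raw value as text."""
    for low, high, text in ranges:
        if low <= value < high:
            return text
    return str(value)


def convert_record(record, mapping):
    """Replace numeric data values with text; leave non-numeric values unchanged."""
    converted = {}
    for item, value in record.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and item in mapping:
            converted[item] = to_text(value, mapping[item])
        else:
            converted[item] = value
    return converted
```

For example, `convert_record({"vehicle model": 0, "vehicle speed": 25, "peripheral sources of risk": 3}, MAPPING)` yields a record whose values are "car", "fast" and "many" under these assumed ranges.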
In a specific embodiment, before the step of obtaining the scoring dimension selected by the user and the plurality of situation awareness data to be scored, the method further includes:
For each scoring dimension, acquiring a situation awareness model corresponding to the scoring dimension from a situation awareness system, wherein the situation awareness model is a model for determining whether an alarm is given according to situation awareness data;
Carrying out feature importance analysis on the situation awareness model to obtain importance scores of each data item in the situation awareness data;
Screening the data items in the situation awareness data according to the importance scores of the data items to obtain target data items;
Constructing a plurality of situation awareness data samples corresponding to the scoring dimension according to the target data item, wherein the number of the situation awareness data samples is larger than that of the samples in the preset large language model prompt word template;
acquiring a plurality of importance scores of each situation awareness data sample in the scoring dimension;
and screening the situation awareness data samples according to importance scores of the situation awareness data samples in the scoring dimension to obtain target situation awareness data samples corresponding to the scoring dimension, and importance reference scores of each target situation awareness data sample in the scoring dimension.
In a specific embodiment, the screening the data items in the situation awareness data according to the importance scores of the data items to obtain target data items includes:
In descending order of importance score, judging, for each data item in the situation awareness data, whether the sum of the number of characters of the data item and the number of characters of the determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is less than a preset threshold, wherein the preset threshold is the difference between the context length of the preset large language model and the number of characters of the situation awareness data to be scored;
if the sum of the number of characters of the data item and the number of characters of the determined target data item is multiplied by the number of samples in the preset large language model prompt word template, and is larger than or equal to the preset threshold value, discarding the data item;
If the sum of the number of characters of the data item and the number of characters of the determined target data item is multiplied by the number of samples in the preset large language model prompt word template and is smaller than the preset threshold, judging whether the number of the determined target data items is smaller than a preset number threshold;
if the number of the determined target data items is smaller than the preset number threshold, determining the data items as target data items;
And discarding the data items if the number of the determined target data items is equal to the preset number threshold.
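The screening steps above form a greedy selection under a character budget. The following sketch follows the stated comparisons literally; the function name `select_target_items`, the tuple layout, and the idea of passing in a precomputed character count per data item are assumptions.

```python
def select_target_items(items, n_samples, context_length, to_score_chars, max_items):
    """Greedily pick data items in descending order of importance score.

    items: list of (name, importance_score, char_count) tuples.
    An item is kept only if (its characters + characters already selected)
    multiplied by n_samples stays below the threshold, and the number of
    selected items is still below max_items.
    """
    threshold = context_length - to_score_chars
    selected = []
    used_chars = 0
    for name, _score, chars in sorted(items, key=lambda item: item[1], reverse=True):
        if (chars + used_chars) * n_samples >= threshold:
            continue  # would exceed the context budget: discard
        if len(selected) < max_items:
            selected.append(name)
            used_chars += chars
        # otherwise the preset number threshold is reached: discard
    return selected
```

Note that an item that is too long is skipped rather than ending the loop, so a shorter, less important item can still be selected later, which matches the per-item discard described above.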
In a specific embodiment, the filtering the multiple situation awareness data samples according to the importance scores of the situation awareness data samples in the scoring dimension to obtain target situation awareness data samples corresponding to the scoring dimension, and the importance reference scores of each target situation awareness data sample in the scoring dimension include:
for each situation awareness data sample, calculating a scoring standard deviation of the situation awareness data sample according to a plurality of importance scores of the situation awareness data sample in the scoring dimension;
taking a situation awareness data sample with the scoring standard deviation smaller than a preset standard deviation threshold as a first situation awareness data sample;
Judging whether the number of the first situation awareness data samples is equal to the number of the samples in the preset large language model prompt word template or not;
If the number of the first situation awareness data samples is equal to the number of the samples in the preset large language model prompt word template, taking each first situation awareness data sample as a target situation awareness data sample corresponding to the scoring dimension;
if the number of the first situation awareness data samples is smaller than the number of the samples in the preset large language model prompt word template, selecting a first number of situation awareness data samples from the plurality of situation awareness data samples other than the first situation awareness data samples, wherein the first number is the difference between the number of the samples in the preset large language model prompt word template and the number of the first situation awareness data samples;
Taking each first situation awareness data sample and the first number of situation awareness data samples selected as target situation awareness data samples corresponding to the scoring dimension;
If the number of the first situation awareness data samples is larger than the number of the samples in the preset large language model prompt word template, removing a second number of samples from the first situation awareness data samples to obtain second situation awareness data samples, wherein the second number is the difference between the number of the first situation awareness data samples and the number of the samples in the preset large language model prompt word template;
taking each second situation awareness data sample as a target situation awareness data sample corresponding to the scoring dimension;
And for each target situation awareness data sample, taking the average value of a plurality of importance scores of the target situation awareness data sample in the scoring dimension as an importance reference score of the target situation awareness data sample in the scoring dimension.
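The sample screening above can be sketched as below. The source does not say which samples are added when there are too few or removed when there are too many, nor which standard deviation is meant; this sketch assumes the population standard deviation and fills or trims by preferring the lowest standard deviation. The name `select_targets` is hypothetical.

```python
from statistics import mean, pstdev


def select_targets(scores_by_sample, n_samples, std_threshold):
    """Return {sample_id: importance reference score} for the chosen target samples.

    scores_by_sample maps a sample id to its list of importance scores.
    Assumption: ties and shortfalls are resolved by lowest standard deviation.
    """
    ranked = sorted(scores_by_sample, key=lambda s: pstdev(scores_by_sample[s]))
    first = [s for s in ranked if pstdev(scores_by_sample[s]) < std_threshold]
    if len(first) >= n_samples:
        # Equal: keep all. More: drop the surplus with the highest deviation.
        targets = first[:n_samples]
    else:
        # Fewer: top up from the remaining samples.
        rest = [s for s in ranked if s not in first]
        targets = first + rest[: n_samples - len(first)]
    # The reference score is the mean of the sample's scores.
    return {s: mean(scores_by_sample[s]) for s in targets}
```

A low standard deviation means the scorers agreed, so samples that pass the threshold give the model more reliable reference scores.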
In a specific embodiment, before the step of obtaining the scoring dimension selected by the user and the plurality of situation awareness data to be scored, the method further includes:
Acquiring a plurality of reference situation awareness data;
for each data item with a numeric data value in the reference situation awareness data, acquiring a plurality of data values corresponding to the data item from the plurality of reference situation awareness data;
Clustering the plurality of data values by adopting a clustering algorithm to obtain a plurality of data classes corresponding to the data item and the number of the data classes, wherein the number of the data classes is determined by the clustering algorithm using the elbow method;
For each data class, determining a classification value range corresponding to the data class according to the size of the data values in the data class;
acquiring the text corresponding to each classification value range according to the number of the data classes;
and taking each classification value range and the corresponding text as the corresponding relation between the preset classification value range and the text corresponding to the data item.
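One way to derive the classification value ranges is sketched below using a small one-dimensional k-means. The source names neither the clustering algorithm nor the exact elbow criterion, so both are assumptions: the elbow is taken as the smallest number of classes after which adding one more class reduces the within-class squared error by less than `min_gain` of the single-class error. All function names are hypothetical.

```python
def kmeans_1d(values, k, iters=50):
    """Cluster numbers into at most k groups with deterministic initial centres."""
    data = sorted(values)
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        new_centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return [c for c in clusters if c]


def sse(clusters):
    """Sum of squared distances of every value to its cluster mean."""
    return sum((v - sum(c) / len(c)) ** 2 for c in clusters for v in c)


def elbow_k(values, max_k, min_gain=0.05):
    """Smallest k after which one more class gains less than min_gain (assumed rule)."""
    errors = [sse(kmeans_1d(values, k)) for k in range(1, max_k + 1)]
    if errors[0] == 0:
        return 1
    for k in range(1, max_k):
        if (errors[k - 1] - errors[k]) / errors[0] < min_gain:
            return k
    return max_k


def value_ranges(values, max_k=5):
    """Return one (min, max) range per data class, ordered by value."""
    clusters = sorted(kmeans_1d(values, elbow_k(values, max_k)), key=min)
    return [(min(c), max(c)) for c in clusters]
```

With the ranges in hand, the text for each range would be chosen from a label set that matches the number of classes, for instance three labels when three classes are found; those label sets are not specified in the source.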
In one embodiment, the method further comprises:
Sequencing situation awareness data to be scored according to the order of importance scores from big to small to obtain a situation awareness data sequence;
And displaying the situation awareness data to be scored in the preset quantity in the situation awareness data sequence.
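The sorting and display step can be sketched in a few lines; `top_records` and its parameters are hypothetical names, and `limit` stands for the preset quantity.

```python
def top_records(records, scores, limit):
    """Pair each record with its importance score and keep the highest `limit`."""
    ranked = sorted(zip(records, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]
```

Because Python's sort is stable, records with equal scores keep their original relative order.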
In a second aspect, an embodiment of the present application provides an importance scoring apparatus for situational awareness data, including:
An acquisition module for:
obtaining scoring dimension selected by a user and a plurality of situation awareness data to be scored;
Acquiring a plurality of target situation awareness data samples corresponding to the scoring dimension from a database, and importance reference scores of each target situation awareness data sample in the scoring dimension;
A processing module for:
Performing text conversion on the data values of the data items in the multiple to-be-scored situation awareness data and each target situation awareness data sample to obtain updated to-be-scored situation awareness data and updated target situation awareness data;
Filling the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference score into a preset large language model prompt word template to obtain a large language model prompt word;
the scoring module is used for inputting the large language model prompt words into a preset large language model to obtain importance scores of each situation awareness data to be scored in the scoring dimension.
In a third aspect, an embodiment of the present application provides an electronic device, including:
A processor, a memory, a communication interface;
the memory is used for storing executable instructions of the processor;
Wherein the processor is configured to perform the method of importance scoring of situational awareness data of any of the first aspects via execution of the executable instructions.
In a fourth aspect, an embodiment of the present application provides a readable storage medium having stored thereon a computer program, which when executed by a processor implements the method for scoring importance of situational awareness data according to any of the first aspects.
According to the importance scoring method, device, equipment and medium for situation awareness data, after the scoring dimension selected by a user and the situation awareness data to be scored are obtained, a plurality of target situation awareness data samples corresponding to the scoring dimension, and the importance reference score of each target situation awareness data sample in the scoring dimension, are obtained according to the scoring dimension; further, text conversion is performed on the data values of the data items in the situation awareness data to be scored and in the target situation awareness data samples to obtain updated situation awareness data to be scored and updated target situation awareness data; the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference scores are filled into a preset large language model prompt word template to obtain a large language model prompt word; finally, the large language model prompt word is input into a preset large language model to obtain the importance scores of the situation awareness data to be scored in the scoring dimension. By determining the importance reference scores of the target situation awareness data samples corresponding to the scoring dimension and combining them with the large language model to determine the importance scores of the situation awareness data to be scored in the scoring dimension, the method and the device effectively improve scoring efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions of the prior art, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it will be obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort to a person skilled in the art.
Fig. 1 is a schematic flow chart of a first embodiment of a situation awareness data importance scoring method provided by the present application;
Fig. 2 is a schematic flow chart of a second embodiment of a situation awareness data importance scoring method provided by the present application;
Fig. 3 is a schematic flow chart of a third embodiment of an importance scoring method for situation awareness data provided by the present application;
Fig. 4 is a flow chart of a fourth embodiment of a situation awareness data importance scoring method provided by the present application;
Fig. 5 is a schematic flow chart of a fifth embodiment of an importance scoring method for situation awareness data provided by the present application;
Fig. 6a is a flowchart of a sixth embodiment of a situation awareness data importance scoring method provided by the present application;
Fig. 6b is a schematic flow chart showing situation awareness data to be scored according to the present application;
Fig. 7 is a schematic structural diagram of an embodiment of an importance scoring device for situation awareness data provided by the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which are made by a person skilled in the art based on the embodiments of the application in light of the present disclosure, are intended to be within the scope of the application.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
With the development of science and technology, situation awareness systems are applied in various scenes, such as a port scene, a power monitoring scene, a network defense scene and the like. The situation awareness system can process situation awareness data to realize alarming of abnormal conditions and improve safety. When an alarm occurs, a user needs to check situation awareness data, however, the quantity of the situation awareness data is huge, the user cannot check all the situation awareness data, and only the situation awareness data with larger importance can be checked, so that importance scoring is needed for the situation awareness data.
In the prior art, the importance of situation awareness data is scored either by turning the situation awareness data into a questionnaire and scoring the questionnaire, or by having a plurality of experts score it through interview-style discussion. Both approaches score manually, so scoring efficiency is low when there is a large amount of situation awareness data.
Aiming at the problems in the prior art, the inventor finds that the attention points of users with different roles to situation awareness data are different in the process of researching the importance scoring method of the situation awareness data, and the importance of the same piece of situation awareness data under different attention points is different, so that the scoring can be performed on different attention points, namely the scoring is performed under different scoring dimensions.
Furthermore, in order to improve scoring efficiency, a large language model (Large Language Model, abbreviated as LLM) can be used for scoring. Target situation awareness data samples corresponding to the scoring dimension selected by the user are selected, together with the importance reference score of each target situation awareness data sample in the scoring dimension. Because large language models are insensitive to numbers, the data values in the target situation awareness data samples and in the situation awareness data to be scored are converted into text, and the converted data and the importance reference scores are then filled into a preset large language model prompt word template to obtain the large language model prompt word. The large language model prompt word is input into a preset large language model to obtain the importance scores of the situation awareness data to be scored in the scoring dimension. The importance scoring scheme for situation awareness data in the present application is designed on the basis of this inventive concept.
The execution subject of the situation awareness data importance scoring method in the present application may be a computer, or may be a server, a terminal device, or other devices, which is not limited by the present application, and a computer is used as an example for the following description.
The application scenario of the importance scoring method for situation awareness data provided by the application is illustrated below.
In the application scenario, the situation awareness system is applied to a port scenario, and situation awareness data are continuously acquired by the situation awareness system, wherein the situation awareness data comprise data items such as vehicle speed, vehicle type and hazard sources around the vehicle. The situation awareness system processes the situation awareness data, alarms are conducted when dangerous situations appear, and a subsequent user needs to check the situation awareness data.
The user selects a grading dimension on the computer, the user pays attention to the safety of the vehicle, the selected grading dimension is the safety dimension of the vehicle, and then the computer acquires a plurality of pieces of situation awareness data to be graded from the situation awareness system.
And the computer acquires a plurality of target situation awareness data samples corresponding to the vehicle safety dimension from the database, and the importance reference score of each target situation awareness data sample in the vehicle safety dimension. And further, carrying out text conversion on the data values of the data items in the to-be-scored situation awareness data and the target situation awareness data samples to obtain updated to-be-scored situation awareness data and updated target situation awareness data.
The updated situation awareness data to be scored, the updated target situation awareness data and the importance reference scores are filled into a preset large language model prompt word template to obtain a large language model prompt word. The large language model prompt word is then input into a preset large language model to obtain the importance scores of the situation awareness data to be scored in the vehicle safety dimension.
The subsequent computer can sort the situation awareness data to be scored according to the order of the importance scores from large to small, then display the situation awareness data to be scored, and a user can directly see the situation awareness data with larger importance.
It should be noted that the above scenario is only an example of an application scenario provided by the embodiment of the present application, and the embodiment of the present application does not limit the actual forms of various devices included in the scenario, and may be set according to actual requirements in a specific application of the scheme.
The technical scheme of the application is described in detail through specific embodiments. It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 1 is a schematic flow chart of a first embodiment of the importance scoring method for situation awareness data provided by the present application. This embodiment describes the case in which a computer, in combination with a large language model, obtains the importance scores of the situation awareness data to be scored in a scoring dimension according to the importance reference scores of the target situation awareness data samples in that scoring dimension. The method in this embodiment may be implemented by software, hardware, or a combination of software and hardware. As shown in fig. 1, the importance scoring method of situation awareness data specifically includes the following steps:
S101: Acquire the scoring dimension selected by the user and a plurality of pieces of situation awareness data to be scored.
In this step, when importance scoring is performed on situation awareness data, a user is required to select a scoring dimension on a computer so that the obtained importance score meets the requirements of the user. The computer can acquire scoring dimension selected by the user and a plurality of situation awareness data to be scored.
It should be noted that, the scoring dimension may be vehicle safety, vehicle scheduling, pedestrian safety, etc., and the embodiment of the present application does not limit the scoring dimension, and may be determined according to actual situations.
S102: Acquire, from the database, a plurality of target situation awareness data samples corresponding to the scoring dimension and the importance reference score of each target situation awareness data sample in the scoring dimension.
In this step, after the computer obtains the scoring dimension, since the database stores a plurality of target situational awareness data samples corresponding to each scoring dimension and the importance reference score of the target situational awareness data samples in the scoring dimension, the computer can obtain a plurality of target situational awareness data samples corresponding to the scoring dimension selected by the user and the importance reference score of each target situational awareness data sample in the scoring dimension. The target situational awareness data samples corresponding to each scoring dimension are obtained by screening a plurality of situational awareness data samples constructed in advance.
Illustratively, the scoring dimension is the vehicle safety dimension, and there are 4 corresponding target situation awareness data samples. The first target situation awareness data sample is: vehicle model: 1; vehicle speed: 25; peripheral sources of risk: 4. The second target situation awareness data sample is: vehicle model: 0; vehicle speed: 55; peripheral sources of risk: 0. The third target situation awareness data sample is: vehicle model: 2; vehicle speed: 15; peripheral sources of risk: 10. The fourth target situation awareness data sample is: vehicle model: 0; vehicle speed: 3; peripheral sources of risk: 0. In the vehicle safety dimension, the first target situation awareness data sample has an importance reference score of 86, the second has an importance reference score of 45, the third has an importance reference score of 79, and the fourth has an importance reference score of 15. The embodiment of the application does not limit the target situation awareness data samples or their importance reference scores, which can be determined according to actual conditions.
S103: Perform text conversion on the data values of the data items in the plurality of pieces of situation awareness data to be scored and in each target situation awareness data sample to obtain updated situation awareness data to be scored and updated target situation awareness data.
In this step, after the computer obtains the target situation awareness data sample, in order to improve the accuracy of scoring by the preset large language model, text conversion needs to be performed on the multiple to-be-scored situation awareness data and the data value of the data item in each target situation awareness data sample, so as to obtain updated to-be-scored situation awareness data and updated target situation awareness data. This is because the preset large language model has poor understanding ability of the numerical data.
Specifically, for each data item in each piece of situation awareness data to be scored, if the data value of the data item is numeric, the text corresponding to the data value is determined according to the preset correspondence between classification value ranges and text for that data item;
the data value of the data item is then replaced with the text to obtain the updated situation awareness data to be scored.
Illustratively, the situation awareness data to be scored is: vehicle model: 0; vehicle speed: 25; peripheral sources of risk: 3. The vehicle model, the vehicle speed and the peripheral sources of risk are data items.
For a vehicle model, table 1 is a corresponding relation table of a preset classification value range and characters corresponding to the vehicle model.
TABLE 1
According to Table 1, 0 is replaced with "car".
For the vehicle speed, table 2 is a corresponding relation table of a preset classification value range and characters corresponding to the vehicle speed.
TABLE 2
According to Table 2, 25 is replaced with "fast".
For the peripheral dangerous sources, table 3 is a corresponding relation table of the preset classification value range and the characters corresponding to the peripheral dangerous sources.
TABLE 3
According to Table 3, 3 is replaced with "many". Therefore, the updated situation awareness data to be scored is: vehicle model: car; vehicle speed: fast; peripheral sources of risk: many.
It should be noted that, the embodiment of the present application does not limit the data item, the preset classification value range corresponding to the data item, and the corresponding relationship between the text, and can be determined according to the actual situation.
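The text conversion described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the contents of Tables 1 to 3 are not reproduced in this text, so the value ranges below are assumptions chosen only to agree with the worked examples, and the names `range_to_text` and `convert_record` are hypothetical.

```python
import math

# Assumed lookup tables. The patent's Tables 1-3 are not available here, so
# these values are hypothetical and only chosen to match the worked examples.
VEHICLE_MODEL = {0: "car", 1: "loader", 2: "truck"}
SPEED_RANGES = [(0, 10, "very slow"), (10, 20, "slow"),
                (20, 40, "fast"), (40, math.inf, "very fast")]
RISK_RANGES = [(0, 1, "none"), (1, 3, "few"),
               (3, 6, "many"), (6, math.inf, "very many")]


def range_to_text(value, ranges):
    """Return the text whose half-open range [low, high) contains value."""
    for low, high, text in ranges:
        if low <= value < high:
            return text
    raise ValueError(f"value {value} is outside every configured range")


def convert_record(record):
    """Replace numeric data values with text; keep non-numeric values as-is."""
    converters = {
        "vehicle model": lambda v: VEHICLE_MODEL[v],
        "vehicle speed": lambda v: range_to_text(v, SPEED_RANGES),
        "peripheral sources of risk": lambda v: range_to_text(v, RISK_RANGES),
    }
    updated = {}
    for item, value in record.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and item in converters:
            updated[item] = converters[item](value)
        else:
            updated[item] = value
    return updated
```

Under these assumed ranges, `convert_record({"vehicle model": 0, "vehicle speed": 25, "peripheral sources of risk": 3})` yields car, fast and many, matching the example in the text.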
For each data item in each target situation awareness data sample, if the data value of the data item is numeric, the text corresponding to the data value is determined according to the preset correspondence between data values and text for that data item;
the data value of the data item is then replaced with the text to obtain an updated target situation awareness data sample.
Illustratively, based on the above example, the updated first target situation awareness data sample is: vehicle model: loader; vehicle speed: fast; peripheral sources of risk: many. The updated second target situation awareness data sample is: vehicle model: car; vehicle speed: very fast; peripheral sources of risk: none. The updated third target situation awareness data sample is: vehicle model: truck; vehicle speed: slow; peripheral sources of risk: very many. The updated fourth target situation awareness data sample is: vehicle model: truck; vehicle speed: very slow; peripheral sources of risk: none.
It should be noted that, if the data value of the data item is not a number, the data value does not need to be processed.
It should be noted that the data items in the target situation awareness data samples are the data items that have a large influence on the scoring dimension, and the data items in the situation awareness data to be scored include them. Therefore, the data items in the situation awareness data to be scored that do not appear in the target situation awareness data samples can be removed first, retaining only the data items identical to those in the target situation awareness data samples, before text conversion is performed.
S104: and filling the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference score into a preset large language model prompt word template to obtain a large language model prompt word.
In this step, after the computer obtains the updated situation awareness data to be scored and the updated target situation awareness data, so that the preset large language model can perform scoring, the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference scores are filled into the preset large language model prompt word template to obtain the large language model prompt word.
Illustratively, the preset large language model prompt word template is:
To-be-scored situational awareness data 1: {newtext1}
To-be-scored situational awareness data 2: {newtext2}
To-be-scored situational awareness data 3: {newtext3}
……
The situational awareness related scoring examples are:
Sample 1: {sample1}
Sample 1 importance reference score: {grade1}
Sample 2: {sample2}
Sample 2 importance reference score: {grade2}
Sample 3: {sample3}
Sample 3 importance reference score: {grade3}
Sample 4: {sample4}
Sample 4 importance reference score: {grade4}
……
Please judge the importance score of each piece of newly-appearing situation awareness data to be scored according to the scoring samples, and output the importance score.
Requirement: output the data according to the template "to-be-scored situational awareness data n importance score: x", without outputting other content.
On the basis of the above embodiment, there are three pieces of updated situation awareness data to be scored. The updated first situation awareness data to be scored is: vehicle model: loader; vehicle speed: very slow; peripheral sources of risk: many. The updated second situation awareness data to be scored is: vehicle model: car; vehicle speed: very fast; peripheral sources of risk: none. The updated third situation awareness data to be scored is: vehicle model: truck; vehicle speed: very slow; peripheral sources of risk: many.
Thus, the large language model prompt word is:
To-be-scored situational awareness data 1: vehicle model: loader; vehicle speed: very slow; peripheral sources of risk: many
To-be-scored situational awareness data 2: vehicle model: car; vehicle speed: very fast; peripheral sources of risk: none
To-be-scored situational awareness data 3: vehicle model: truck; vehicle speed: very slow; peripheral sources of risk: many
The situational awareness related scoring examples are:
Sample 1: vehicle model: loader; vehicle speed: fast; peripheral sources of risk: many
Sample 1 importance reference score: 86
Sample 2: vehicle model: car; vehicle speed: very fast; peripheral sources of risk: none
Sample 2 importance reference score: 45
Sample 3: vehicle model: truck; vehicle speed: slow; peripheral sources of risk: very many
Sample 3 importance reference score: 79
Sample 4: vehicle model: truck; vehicle speed: very slow; peripheral sources of risk: none
Sample 4 importance reference score: 15
Please judge the importance score of each piece of newly-appearing situation awareness data to be scored according to the scoring samples, and output the importance score.
Requirement: output the data according to the template "to-be-scored situational awareness data n importance score: x", without outputting other content.
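Filling the prompt word template in step S104 can be sketched as plain string assembly. The function names `format_record` and `build_prompt` are hypothetical, and the wording of the final requirement line is a cleaned-up assumption about the template's intent, not a quotation of the original.

```python
def format_record(record):
    """Render a record as 'item: value; item: value'."""
    return "; ".join(f"{item}: {value}" for item, value in record.items())


def build_prompt(to_score, samples, reference_scores):
    """Assemble the prompt from text-converted records and reference scores."""
    if len(samples) != len(reference_scores):
        raise ValueError("each sample needs exactly one reference score")
    lines = []
    for n, record in enumerate(to_score, start=1):
        lines.append(
            f"To-be-scored situational awareness data {n}: {format_record(record)}"
        )
    lines.append("The situational awareness related scoring examples are:")
    for n, (sample, score) in enumerate(zip(samples, reference_scores), start=1):
        lines.append(f"Sample {n}: {format_record(sample)}")
        lines.append(f"Sample {n} importance reference score: {score}")
    lines.append(
        "Please judge the importance score of each piece of newly-appearing "
        "situation awareness data to be scored according to the scoring "
        "samples, and output the importance score."
    )
    # Assumed wording of the output-format requirement.
    lines.append(
        'Requirement: output the data according to the template "to-be-scored '
        'situational awareness data n importance score: x", without '
        "outputting other content."
    )
    return "\n".join(lines)
```

Building the prompt from lists rather than fixed placeholders lets the same routine serve templates with 4, 7 or 50 samples.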
S105: inputting the large language model prompt words into a preset large language model to obtain importance scores of each situation awareness data to be scored in the scoring dimension.
In the step, after the computer obtains the large language model prompt word, the large language model prompt word is input into a preset large language model, and the importance score of each situation awareness data to be scored in the scoring dimension is obtained.
Illustratively, on the basis of the above example, the output of the preset large language model is: to-be-scored situational awareness data 1 importance score: 80; to-be-scored situational awareness data 2 importance score: 35; to-be-scored situational awareness data 3 importance score: 76. That is, the importance score of the first situation awareness data to be scored is 80, the importance score of the second situation awareness data to be scored is 35, and the importance score of the third situation awareness data to be scored is 76.
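Because the model is asked to answer in a fixed "data n importance score: x" pattern, the scores can be recovered with a regular expression. The following is a minimal sketch under that assumption; `parse_scores` is a hypothetical helper, and a real model may deviate from the requested format, which the sketch only detects when no score at all is found.

```python
import re

# Matches "... data <n> importance score: <x>" and captures n and x.
SCORE_PATTERN = re.compile(
    r"data\s+(\d+)\s+importance score:\s*(\d+(?:\.\d+)?)", re.IGNORECASE
)


def parse_scores(model_output):
    """Map each data index n to its importance score x."""
    scores = {int(n): float(x) for n, x in SCORE_PATTERN.findall(model_output)}
    if not scores:
        raise ValueError("no importance scores found in the model output")
    return scores
```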
According to the importance scoring method for situation awareness data provided by this embodiment, after the scoring dimension selected by the user and the situation awareness data to be scored are obtained, a plurality of target situation awareness data samples corresponding to the scoring dimension, and the importance reference score of each target situation awareness data sample in the scoring dimension, are obtained according to the scoring dimension. Text conversion is then performed on the data values of the data items in the situation awareness data to be scored and in the target situation awareness data samples to obtain updated situation awareness data to be scored and updated target situation awareness data. The updated situation awareness data to be scored, the updated target situation awareness data and the importance reference scores are filled into a preset large language model prompt word template to obtain a large language model prompt word. Finally, the large language model prompt word is input into a preset large language model to obtain the importance score of the situation awareness data to be scored in the scoring dimension. By determining the importance reference scores of the target situation awareness data samples corresponding to the scoring dimension and combining them with a large language model to determine the importance score of the situation awareness data to be scored in the scoring dimension, the scoring efficiency is effectively improved. In addition, scoring through a large language model can improve scoring accuracy, reduce scoring cost, and support scoring for various scoring dimensions.
Fig. 2 is a schematic flow chart of a second embodiment of an importance scoring method for situation awareness data, which is provided by the present application, and on the basis of the foregoing embodiment, the present application describes a case of determining a target situation awareness data sample, and determining an importance reference score of the target situation awareness data sample in a scoring dimension. As shown in fig. 2, the importance scoring method of situation awareness data specifically includes the following steps:
S201: and for each scoring dimension, acquiring a situation awareness model corresponding to the scoring dimension from a situation awareness system.
In this step, so that the importance scoring of situation awareness data can proceed smoothly, the target situation awareness data samples and the importance reference score of each target situation awareness data sample in the scoring dimension need to be determined.
In order to improve the scoring efficiency, the data items can be screened, and the data items with larger influence on the scoring dimension are determined, so that a situation awareness model corresponding to each scoring dimension is required to be obtained from a situation awareness system. The situation awareness model is a model for processing situation awareness data aiming at the grading dimension and determining whether an alarm is given. The situation awareness model can also be a model for early warning, scheduling and prediction according to situation awareness data.
S202: and carrying out feature importance analysis on the situation awareness model to obtain importance scores of each data item in the situation awareness data.
In this step, after the computer obtains the situation awareness model corresponding to the scoring dimension, since the situation awareness model corresponding to the scoring dimension processes situation awareness data for the scoring dimension and determines whether to alarm, feature importance analysis can be performed on the situation awareness model, and each data item in the situation awareness data is a feature, so that importance scores of each data item in the situation awareness data can be obtained.
It should be noted that the larger the importance score of a data item, the greater the impact of the data item on the scoring dimension.
The feature importance analysis method for the situation awareness model may be: when the situation awareness model is a regression model or a random forest model, the situation awareness model can directly output the feature importance scores to obtain the importance score of each data item. It is also possible that: and calculating the correlation between each data item and the target variable in the situation awareness model by a correlation analysis method, and further determining the corresponding importance score. It is also possible that: and sequentially deleting the data items by using a recursive feature elimination method so as to determine the influence degree of the data items on the model performance and further determine the corresponding importance scores. The embodiment of the application does not limit the mode of analyzing the feature importance of the situation awareness model, and can be determined according to actual conditions.
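Of the options listed, the correlation analysis variant is the simplest to sketch without any modelling library. The example below is an assumption-laden illustration: it uses the absolute Pearson correlation between each data item and the target variable as the importance score, which is one possible reading of "correlation analysis", and the names `pearson` and `correlation_importance` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_importance(rows, target):
    """Score each data item by |correlation| with the target variable.

    rows: list of dicts mapping data item name -> numeric value
    target: list of numeric target values, one per row
    """
    return {
        item: abs(pearson([row[item] for row in rows], target))
        for item in rows[0]
    }
```

A data item whose value never changes gets a score of 0.0 here, reflecting that it cannot explain any variation in the target.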
It should be noted that, for obtaining the importance score of each data item in the situational awareness data, the importance score may also be determined manually, and then the determined importance score is input to the computer.
S203: and screening the data items in the situation awareness data according to the importance scores of the data items to obtain target data items.
In this step, after the importance score of each data item in the situation awareness data is obtained by the computer, in order to improve the efficiency of subsequent scoring, the data items in the situation awareness data need to be screened according to the importance score of the data item, so as to obtain a target data item.
The target data item is a data item with a larger importance score, and the target data item meets the requirement of a preset large language model.
S204: and constructing a plurality of situation awareness data samples corresponding to the grading dimension according to the target data item.
In this step, after the computer obtains the target data item, a plurality of situation awareness data samples corresponding to the scoring dimension can be constructed according to the target data item.
Specifically, each situation awareness data sample comprises all target data items, and for determining the data value of the target data item in one situation awareness data sample, one data value is randomly selected as the data value of the target data item according to the preset value range corresponding to the target data item.
It should be noted that, the number of situation awareness data samples is greater than the number of samples in the preset large language model prompt word template. This is because there may be non-representative examples among the situational awareness data examples constructed at this time, which may affect the accuracy of the scoring and require subsequent screening.
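Sample construction in step S204 can be sketched as drawing one random value per target data item. The sketch assumes each preset value range is an inclusive pair of integer bounds, which the text does not specify; `build_samples` is a hypothetical name.

```python
import random


def build_samples(value_ranges, count, seed=None):
    """Build `count` samples, each containing every target data item.

    value_ranges: dict mapping target data item -> (low, high), assumed to be
    inclusive integer bounds of the preset value range.
    """
    rng = random.Random(seed)
    return [
        {item: rng.randint(low, high) for item, (low, high) in value_ranges.items()}
        for _ in range(count)
    ]
```

In line with the note above, `count` would be chosen larger than the number of samples in the prompt word template so that later screening has candidates to discard.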
S205: a plurality of importance scores for each situational awareness data sample in the scoring dimension are obtained.
In the step, after a plurality of situation awareness data samples are constructed by a computer, each situation awareness data sample is transmitted to a plurality of scoring specialists for scoring, each scoring specialist scores the situation awareness data sample through a plurality of angles in the scoring dimension, and the sum of the scores of the angles is used as the importance score of the scoring specialist for the situation awareness data sample in the scoring dimension. Thus, the computer may obtain multiple importance scores for each situational awareness data sample in that scoring dimension.
It should be noted that, when one scoring expert scores all situation awareness data samples, the order of all situation awareness data samples is randomly arranged so as to ensure the objectivity of scoring. The plurality of angles may be urgency, severity, risk, relevance, and the like.
It should be noted that, the greater the importance score, the greater the importance of the situational awareness data sample in the dimension of the score.
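The expert scoring arithmetic in step S205 is small enough to show directly. In this sketch the per-angle scores are invented for illustration, and `expert_score` and `shuffled_order` are hypothetical helper names.

```python
import random


def expert_score(angle_scores):
    """Sum one expert's per-angle scores into a single importance score."""
    return sum(angle_scores.values())


def shuffled_order(sample_ids, seed=None):
    """Return the sample ids in a random order for presentation to one expert."""
    order = list(sample_ids)
    random.Random(seed).shuffle(order)
    return order
```

For example, hypothetical angle scores of 20 for urgency, 25 for severity, 22 for risk and 19 for relevance sum to an importance score of 86.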
S206: screening the multiple situation awareness data samples according to importance scores of the situation awareness data samples in the scoring dimension to obtain target situation awareness data samples corresponding to the scoring dimension, and importance reference scores of each target situation awareness data sample in the scoring dimension.
In this step, after the computer obtains the importance scores of each situation awareness data sample in the score dimension, as there may be non-representative samples in the situation awareness data samples and the number of situation awareness data samples is large and does not meet the requirement of the preset large language model prompt word template, the computer may screen the situation awareness data samples according to the importance scores of the situation awareness data samples in the score dimension to obtain the target situation awareness data samples corresponding to the score dimension and the importance reference scores of each target situation awareness data sample in the score dimension.
The number of target situation awareness data samples is the same as the number of samples in the preset large language model prompt word template, and the standard deviation of the multiple importance scores of each target situation awareness data sample is small. The average of the multiple importance scores of a target situation awareness data sample is taken as the importance reference score of that sample in the scoring dimension.
It should be noted that, the computer stores the target situation awareness data samples corresponding to the scoring dimension and the importance reference score of each target situation awareness data sample in the scoring dimension into the database for use in the subsequent scoring.
According to the importance scoring method for situation awareness data, the target data item is determined according to the situation awareness model corresponding to the scoring dimension, so that the influence of the target situation awareness data sample on the scoring dimension is large, and the accuracy of subsequent scoring can be improved. By screening the situation awareness data samples, the target situation awareness data samples are more representative, and meet the requirements of a preset large language model prompt word template, and the accuracy of subsequent scoring can be improved.
Fig. 3 is a schematic flow chart of a third embodiment of the importance scoring method for situation awareness data, which is provided by the present application, and on the basis of the above embodiment, the present application describes a case where a computer screens data items in the situation awareness data to obtain target data items. As shown in fig. 3, the importance scoring method of situation awareness data specifically includes the following steps:
S301: in descending order of importance score, for each data item in the situation awareness data, judging whether the sum of the number of characters of the data item and the number of characters of the determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is smaller than a preset threshold; if the product is greater than or equal to the preset threshold, executing step S304; if the product is smaller than the preset threshold, executing step S302.
In this step, after obtaining the importance score of each data item in the situation awareness data, in order to determine the data item that has a larger influence on the score dimension and meets the requirement of the preset large language model, the computer needs to determine, according to the order of the importance score from large to small, whether the sum of the number of characters of the data item and the number of characters of the determined target data item, multiplied by the number of samples in the preset large language model prompt word template, is smaller than a preset threshold value for each data item in the situation awareness data in turn. The preset threshold value is the difference between the context length of the preset large language model and the preset situation awareness data character number to be scored.
And processing each data item in the situation awareness data in sequence according to the order of the importance scores from large to small to determine whether the data item is a target data item, so that the importance scores of the target data items are large, and the influence on the grading dimension is large.
The target situation awareness data samples containing the target data items are filled into the preset large language model prompt word template to form the large language model prompt word, which is then input into the preset large language model for processing. Because the preset large language model limits the context length, it is necessary to judge whether the sum of the number of characters of the data item and the number of characters of the determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is smaller than the preset threshold, where the preset threshold is the difference between the context length of the preset large language model and the preset number of characters of the situation awareness data to be scored. The number of samples may be 4, 7, 50, or the like. The context length may be 500, 2000, 4000, etc. The preset number of characters of the situation awareness data to be scored may be 300, 1000, 2000, etc. The embodiment of the application does not limit the number of samples, the context length, or the preset number of characters of the situation awareness data to be scored, which can be determined according to actual situations.
S302: judging whether the number of the determined target data items is smaller than a preset number threshold value or not; if the number of the determined target data items is smaller than the preset number threshold, step S303 is executed; if the number of the determined target data items is equal to the preset number threshold, step S304 is performed.
In this step, if the computer determines that the sum of the number of characters of the data item and the number of characters of the determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is smaller than the preset threshold, then taking the data item as a target data item would still meet the requirement of the preset large language model, and the computer further needs to judge whether the number of determined target data items is smaller than the preset number threshold.
In order to improve the efficiency of the subsequent scoring, the number of target data items cannot be too large, and therefore it is necessary to determine whether the determined number of target data items is smaller than a preset number threshold.
It should be noted that, the preset number threshold may be 5, 8, 50, etc., and the embodiment of the present application does not limit the preset number threshold, and may be set according to actual situations.
S303: the data item is determined to be the target data item.
In this step, if the number of the determined target data items is smaller than the preset number threshold, which means that the number of the currently determined target data items is smaller, the data item may be determined as the target data item.
S304: the data item is discarded.
In this step, if the computer determines that the number of the determined target data items is equal to the preset number threshold, it indicates that the number of the currently determined target data items can meet the scoring requirement, and continuing to increase the target data items affects the scoring efficiency, so that the data items are discarded.
In addition, if the computer determines that the sum of the number of characters of the data item and the number of characters of the determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is greater than or equal to the preset threshold, then taking the data item as a target data item would not meet the requirement of the preset large language model, so the data item is also discarded.
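Steps S301 to S304 amount to a greedy selection under a character budget and a count limit. The sketch below follows that reading; how the "number of characters of a data item" is measured is not stated, so it is passed in as a precomputed count, and `select_target_items` and its parameter names are hypothetical.

```python
def select_target_items(importance, char_counts, sample_count,
                        context_length, to_score_chars, max_items):
    """Greedily pick target data items in descending order of importance.

    importance: data item -> importance score
    char_counts: data item -> number of characters (measurement is assumed)
    sample_count: number of samples in the prompt word template
    context_length: context length of the preset large language model
    to_score_chars: preset number of characters of the data to be scored
    max_items: preset number threshold for target data items
    """
    threshold = context_length - to_score_chars  # preset threshold (S301)
    selected = []
    used_chars = 0
    for item in sorted(importance, key=importance.get, reverse=True):
        fits = (char_counts[item] + used_chars) * sample_count < threshold
        if fits and len(selected) < max_items:  # S302 -> S303
            selected.append(item)
            used_chars += char_counts[item]
        # Otherwise the data item is discarded (S304).
    return selected
```

With a context length of 500, 300 characters reserved for the data to be scored, and 4 samples, the budget is 200, so two 20-character items fit (160 < 200) while a third does not (240 >= 200).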
According to the importance scoring method for situation awareness data, the data items in the situation awareness data are screened to obtain the target data items, so that the obtained target data items meet the requirements of a preset large language model, and scoring efficiency can be improved.
Fig. 4 is a flow chart of a fourth embodiment of the importance scoring method for situation awareness data according to the present application. On the basis of the foregoing embodiments, this embodiment describes the case where the computer screens the situation awareness data samples to obtain the target situation awareness data samples and their importance reference scores. As shown in fig. 4, the importance scoring method of situation awareness data specifically includes the following steps:
S401: and for each situation awareness data sample, calculating the scoring standard deviation of the situation awareness data sample according to a plurality of importance scores of the situation awareness data sample in the scoring dimension.
In this step, after the computer obtains a plurality of importance scores of the situation awareness data samples in the score dimension, the situation awareness data samples need to be screened because of the large number of situation awareness data samples. Firstly, for each situation awareness data sample, calculating the scoring standard deviation of the situation awareness data sample according to a plurality of importance scores of the situation awareness data sample in the scoring dimension.
The standard deviation of the importance scores of the situation awareness data sample in the scoring dimension is the scoring standard deviation of the situation awareness data sample.
S402: and taking the situation awareness data sample with the scoring standard deviation smaller than the preset standard deviation threshold as a first situation awareness data sample.
In this step, after the computer obtains the scoring standard deviation of each situation awareness data sample, the situation awareness data sample with a larger scoring standard deviation indicates that the scoring expert is inconsistent with the scoring of the situation awareness data sample, and the situation awareness data sample is not representative, so that the situation awareness data sample with a smaller scoring standard deviation is selected. And taking the situation awareness data sample with the scoring standard deviation smaller than the preset standard deviation threshold as a first situation awareness data sample.
It should be noted that the preset standard deviation threshold may be 2, 5, 10, etc.; the embodiment of the present application does not limit the preset standard deviation threshold, which may be set according to actual situations.
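Steps S401 and S402, together with the averaging rule for the importance reference score, can be sketched with the standard library. The text does not say whether the population or the sample standard deviation is meant; this sketch assumes the population form (`statistics.pstdev`), and the function names are hypothetical.

```python
from statistics import mean, pstdev


def first_samples(scores_by_sample, std_threshold):
    """Keep samples whose scoring standard deviation is below the threshold.

    Returns a dict mapping sample id -> scoring standard deviation.
    Population standard deviation is an assumption of this sketch.
    """
    kept = {}
    for sample_id, scores in scores_by_sample.items():
        deviation = pstdev(scores)
        if deviation < std_threshold:
            kept[sample_id] = deviation
    return kept


def reference_score(scores):
    """Importance reference score: the average of the experts' scores."""
    return mean(scores)
```

For example, expert scores of 80, 82 and 84 have a standard deviation of about 1.63 and pass a threshold of 5, while scores of 20, 60 and 100 do not.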
S403: judging whether the number of the first situation awareness data samples is equal to the number of samples in a preset large language model prompt word template or not; if the number of the first situation awareness data samples is equal to the number of samples in the preset large language model prompt word template, executing step S404; if the number of the first situation awareness data samples is not equal to the number of samples in the preset large language model prompt word template, step S405 is executed.
In this step, after determining the first situation awareness data samples, the computer needs to determine whether the number of the first situation awareness data samples is equal to the number of samples in the preset large language model prompt word template, so as to determine whether the number of the first situation awareness data samples meets the requirement of the preset large language model prompt word template.
It should be noted that, the number of samples in the preset large language model prompt word template may be 4, 7, 50, etc. The embodiment of the application does not limit the number of the samples in the preset large language model prompt word template, and can be determined according to actual conditions.
S404: and taking each first situation awareness data sample as a target situation awareness data sample corresponding to the grading dimension.
In this step, if the number of the first situation awareness data samples is equal to the number of samples in the preset large language model prompt word template, the number of the first situation awareness data samples is described to meet the requirement of the preset large language model prompt word template, and then each first situation awareness data sample is used as a target situation awareness data sample corresponding to the grading dimension.
S405: judging whether the number of the first situation awareness data samples is smaller than the number of samples in a preset large language model prompt word template or not; if the number of the first situation awareness data samples is smaller than the number of samples in the preset large language model prompt word template, executing step S406; if the number of the first situation awareness data samples is greater than the number of samples in the preset large language model prompt word template, step S408 is executed.
In this step, if the number of the first situation awareness data samples is not equal to the number of samples in the preset large language model prompt word template, it is explained that the number of the first situation awareness data samples does not meet the requirement of the preset large language model prompt word template, and then it is continuously judged whether the number of the first situation awareness data samples is smaller than the number of samples in the preset large language model prompt word template.
S406: selecting a first number of situation awareness data samples from the situation awareness data samples other than the first situation awareness data samples.
S407: and taking each first situation awareness data sample and the selected first number of situation awareness data samples as target situation awareness data samples corresponding to the scoring dimension.
In the above steps, if the computer determines that the number of first situation awareness data samples is smaller than the number of samples in the preset large language model prompt word template, some samples need to be supplemented as target situation awareness data samples. The first number of situation awareness data samples is therefore selected from the situation awareness data samples other than the first situation awareness data samples, where the first number is the difference between the number of samples in the preset large language model prompt word template and the number of first situation awareness data samples.
And then taking each first situation awareness data sample and the selected first number of situation awareness data samples as target situation awareness data samples corresponding to the grading dimension.
It should be noted that, from a plurality of situation awareness data samples except the first situation awareness data, the first number of situation awareness data samples may be selected according to the order of the scoring standard deviation from small to large.
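As a non-limiting illustration, the supplementing of steps S406 and S407 can be sketched as follows. The function name supplement_samples, the dictionary layout and the use of the population standard deviation are assumptions made for this sketch only; the embodiment does not prescribe them.

```python
from statistics import pstdev


def supplement_samples(first_ids, all_scores, template_size):
    """Top up the first samples with the most consistently scored remaining samples.

    first_ids: ids of the first situation awareness data samples (assumed layout).
    all_scores: dict mapping every sample id to its list of importance scores.
    template_size: number of samples required by the prompt word template.
    """
    # The "first number" is the shortfall against the template.
    first_number = template_size - len(first_ids)
    # Candidates are all samples other than the first samples.
    candidates = [sid for sid in all_scores if sid not in first_ids]
    # Ascending scoring standard deviation (population form assumed here).
    candidates.sort(key=lambda sid: pstdev(all_scores[sid]))
    return list(first_ids) + candidates[:first_number]


scores = {
    "a": [8, 8, 8],
    "b": [5, 6, 7],
    "c": [1, 9, 5],
    "d": [4, 4, 6],
}
targets = supplement_samples(["a"], scores, 3)
```

Here sample "a" alone does not fill a template of three samples, so the two remaining samples with the smallest scoring standard deviation are added.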
S408: Removing a second number of samples from the first situation awareness data samples to obtain second situation awareness data samples.
S409: Taking each second situation awareness data sample as a target situation awareness data sample corresponding to the scoring dimension.
In the above steps, if the computer determines that the number of first situation awareness data samples is greater than the number of samples in the preset large language model prompt word template, there are too many first situation awareness data samples and some of them need to be rejected. The computer removes a second number of samples from the first situation awareness data samples to obtain the second situation awareness data samples, where the second number is the difference between the number of first situation awareness data samples and the number of samples in the preset large language model prompt word template.
Each second situation awareness data sample is then taken as a target situation awareness data sample corresponding to the scoring dimension.
It should be noted that the second number of samples may be rejected as follows. For each first situation awareness data sample, the average of its plurality of importance scores is calculated. The first situation awareness data samples are clustered according to the average to obtain a plurality of groups. The groups are processed in descending order of the number of samples they contain: the first preset proportion of the samples in the group is rejected, and it is judged whether the number of rejected samples is smaller than the second number. If it is smaller, the next group is processed in the same way. If the number of rejected samples is still smaller than the second number after every group has been processed, further samples are rejected at random until the number of rejected samples equals the second number. The preset proportion may be 1/10, 1/5, 1/3, etc.; the embodiment of the present application does not limit the preset proportion, which may be determined according to the actual situation.
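As a non-limiting illustration, the rejection procedure described above can be sketched as follows. Several details are assumptions of this sketch and are not specified by the embodiment: the clustering is replaced by a simple stand-in that groups samples by their rounded average score, the per-group quota is rounded up, the quota is capped so that the total never exceeds the second number, and the name reject_samples is hypothetical.

```python
import math
import random
from collections import defaultdict
from statistics import mean


def reject_samples(scores, second_number, proportion=0.5, rng=None):
    """Return the ids kept after rejecting `second_number` samples."""
    rng = rng or random.Random(0)
    # Stand-in for clustering by average score: bucket by rounded mean (assumption).
    groups = defaultdict(list)
    for sid, values in scores.items():
        groups[round(mean(values))].append(sid)

    rejected = []
    # Process groups from the largest to the smallest.
    for members in sorted(groups.values(), key=len, reverse=True):
        if len(rejected) >= second_number:
            break
        quota = math.ceil(len(members) * proportion)
        # Cap the quota so the total does not exceed the second number (assumption).
        quota = min(quota, second_number - len(rejected))
        rejected.extend(members[:quota])

    # If every group was processed and the count is still short, reject at random.
    remaining = [sid for sid in scores if sid not in rejected]
    while len(rejected) < second_number:
        rejected.append(remaining.pop(rng.randrange(len(remaining))))

    return [sid for sid in scores if sid not in rejected]


scores = {
    "s1": [9, 9], "s2": [9, 9, 8], "s3": [9, 8, 9],
    "s4": [5, 5], "s5": [5, 4, 5],
    "s6": [1, 1],
}
kept = reject_samples(scores, second_number=3)
```

With these inputs the largest group gives up two samples and the next group one, so no random rejection is needed.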
S410: For each target situation awareness data sample, taking the average value of a plurality of importance scores of the target situation awareness data sample in the scoring dimension as the importance reference score of the target situation awareness data sample in the scoring dimension.
In the step, after determining the target situation awareness data samples, the computer takes the average value of a plurality of importance scores of the target situation awareness data samples in the scoring dimension as the importance reference score of the target situation awareness data samples in the scoring dimension for each target situation awareness data sample.
According to the importance scoring method for situation awareness data, the situation awareness data samples are screened according to the scoring standard deviation and the number of the samples in the preset large language model prompting word template, so that the obtained target situation awareness data samples are more representative, and the requirement of the preset large language model prompting word template is met. The average value of the importance scores is used as an importance reference score, so that the accuracy of the importance reference score is improved.
Fig. 5 is a flowchart of a fifth embodiment of the situation awareness data importance scoring method provided by the present application, and on the basis of the foregoing embodiment, the embodiment of the present application describes a case where a computer determines a correspondence between a preset classification value range corresponding to a data item and a text. As shown in fig. 5, the importance scoring method of situation awareness data specifically includes the following steps:
S501: Acquiring a plurality of reference situation awareness data.
In this step, so that numeric values can be converted to text when importance scoring is performed later, the correspondence between the preset classification value ranges of each data item and text needs to be determined. To this end, the computer first acquires a plurality of reference situation awareness data.
S502: For each data item whose data value is numeric in the reference situation awareness data, acquiring a plurality of data values corresponding to the data item from the plurality of reference situation awareness data.
In this step, after the computer acquires the plurality of reference situation awareness data, for each data item whose data value is numeric, it acquires a plurality of data values corresponding to the data item from the plurality of reference situation awareness data.
Data items whose data values are not numeric require no text conversion, so these data items are not processed.
S503: Clustering the plurality of data values with a clustering algorithm to obtain a plurality of data classes corresponding to the data item and the number of the data classes.
In this step, after the computer obtains the plurality of data values corresponding to the data item, it clusters the data values with a clustering algorithm to obtain a plurality of data classes corresponding to the data item and the number of the data classes.
The number of data classes is determined by applying the elbow method to the clustering algorithm.
S504: for each data class, determining a classification value range corresponding to the data class according to the size of the data value in the data class.
In this step, after the computer obtains a plurality of data classes, for each data class, according to the size of the data value in the data class, the classification value range corresponding to the data class is determined. That is, a section formed by the minimum value and the maximum value in the data class is used as the classification value range corresponding to the data class.
It should be noted that the classification value ranges corresponding to the data classes may be sorted in ascending order of data value. For two adjacent classification value ranges, if there is a gap between them, at least one of the two ranges is enlarged so that no gap remains. For example, if two adjacent classification value ranges are [2, 5] and [8, 9], they become [2, 6] and (6, 9] after expansion.
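As a non-limiting illustration, deriving the classification value ranges from the data classes and closing the gaps between adjacent ranges can be sketched as follows. The function name close_gaps is hypothetical, and placing the shared boundary at the midpoint of the gap is an assumption of this sketch; the embodiment only requires that at least one of the two ranges be enlarged, and its own example uses 6 as the boundary.

```python
def close_gaps(data_classes):
    """Turn clustered data values into contiguous classification value ranges.

    data_classes: list of lists of numeric data values, one list per data class.
    Returns (low, high) pairs sorted in ascending order; after the first range,
    each low bound is shared with the previous high bound.
    """
    # The range of a class is spanned by its minimum and maximum value.
    ranges = sorted([min(c), max(c)] for c in data_classes)
    for left, right in zip(ranges, ranges[1:]):
        if left[1] < right[0]:
            # Midpoint of the gap is an assumed choice; any value in the gap works.
            boundary = (left[1] + right[0]) / 2
            left[1] = boundary
            right[0] = boundary
    return [tuple(r) for r in ranges]


result = close_gaps([[5, 2, 3], [9, 8]])
```

For the classes above, the ranges [2, 5] and [8, 9] are joined at 6.5, so the result covers 2 through 9 without a gap.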
S505: Acquiring the text corresponding to each classification value range according to the number of data classes.
In this step, after determining the classification value ranges corresponding to the data classes, the computer may acquire, according to the number of data classes, the text corresponding to each classification value range.
A data item corresponds to a plurality of groups of texts. Each group of texts corresponds to a particular number of data classes, the texts within a group are ordered, and the number of texts in a group equals the number of data classes to which the group corresponds. The computer determines the groups of texts corresponding to the data item, selects the group that matches the number of data classes, and then assigns the texts in order to the classification value ranges sorted in ascending order.
For example, the data item is the speed of a vehicle, and the classification value ranges in ascending order are [0, 5), [5, 20), [20, 30) and [30, 50). The texts, in order, are: very slow, slow, fast, very fast. Thus [0, 5) corresponds to very slow, [5, 20) to slow, [20, 30) to fast, and [30, 50) to very fast.
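As a non-limiting illustration, looking up the text for a numeric data value from ordered classification value ranges, and replacing numeric values in a record, can be sketched as follows. The range bounds follow the vehicle speed example; the four text labels, the record layout and the names to_text and convert_record are assumptions of this sketch.

```python
from bisect import bisect_right

# Bounds of the half-open ranges [0, 5), [5, 20), [20, 30), [30, 50).
SPEED_BOUNDS = [0, 5, 20, 30, 50]
# One ordered text per range; these labels are illustrative assumptions.
SPEED_TEXTS = ["very slow", "slow", "fast", "very fast"]


def to_text(value, bounds, texts):
    """Return the text of the half-open range that contains `value`."""
    if not bounds[0] <= value < bounds[-1]:
        raise ValueError(f"{value} is outside the classification value ranges")
    return texts[bisect_right(bounds, value) - 1]


def convert_record(record, mappings):
    """Replace numeric data values by text; other data values are left as-is."""
    converted = {}
    for item, value in record.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and item in mappings:
            bounds, texts = mappings[item]
            converted[item] = to_text(value, bounds, texts)
        else:
            converted[item] = value
    return converted


mappings = {"speed": (SPEED_BOUNDS, SPEED_TEXTS)}
updated = convert_record({"speed": 12, "vehicle type": "truck"}, mappings)
```

A binary search over the sorted bounds keeps the lookup short and makes the half-open convention explicit: a value equal to a bound falls into the range that starts at that bound.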
S506: Taking each classification value range and its corresponding text as the correspondence between the preset classification value ranges and text for the data item.
In this step, after determining each classification value range and the corresponding text, the computer uses each classification value range and the corresponding text as the corresponding relation between the preset classification value range and the text corresponding to the data item.
According to the importance scoring method for situation awareness data, the corresponding relation between the preset classification value range corresponding to the data item and the characters is determined in a clustering mode, so that the determined corresponding relation between the preset classification value range and the characters is more accurate.
Fig. 6a is a flowchart of a sixth embodiment of the situation awareness data importance scoring method provided by the present application, and on the basis of the foregoing embodiment, the present application describes a case of displaying situation awareness data to be scored. As shown in fig. 6a, the importance scoring method of situation awareness data specifically includes the following steps:
S601: Sorting the situation awareness data to be scored in descending order of importance score to obtain a situation awareness data sequence.
When the user looks up the situation awareness data to be scored, the situation awareness data to be scored with larger importance scores can be preferentially looked up, so that the situation awareness data to be scored with larger importance scores is preferentially displayed when displayed.
In the step, the computer sorts the situation awareness data to be scored according to the order of importance scores from large to small to obtain a situation awareness data sequence.
S602: Displaying the first preset number of situation awareness data to be scored in the situation awareness data sequence.
In this step, because the display device is limited in size, not all data can be displayed when the situation awareness data sequence contains many items, so only the first preset number of situation awareness data to be scored in the sequence are displayed. The display order is the same as the order of the situation awareness data to be scored in the situation awareness data sequence.
It should be noted that, the preset number may be 1, 4, 25, etc., and the embodiment of the present application does not limit the preset number, and may be determined according to actual situations.
It should be noted that, when the number of situation awareness data to be scored in the situation awareness data sequence is smaller than or equal to the preset number, displaying all the situation awareness data to be scored in the situation awareness data sequence.
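As a non-limiting illustration, steps S601 and S602 can be sketched as follows. The record layout with the keys "id" and "importance" and the name select_for_display are assumptions of this sketch.

```python
def select_for_display(scored_data, preset_number):
    """Sort by importance score (descending) and keep the first preset number."""
    sequence = sorted(scored_data, key=lambda d: d["importance"], reverse=True)
    # Slicing returns the whole sequence when it is shorter than preset_number.
    return sequence[:preset_number]


data = [
    {"id": "a", "importance": 3},
    {"id": "b", "importance": 9},
    {"id": "c", "importance": 6},
]
shown = select_for_display(data, 2)
```

Because slicing never raises when the sequence is short, the case in which the number of items is smaller than or equal to the preset number needs no separate branch.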
It should be noted that, the computer may further display the situation awareness data to be scored according to whether the situation awareness data to be scored with the same importance score exists in the display situation awareness data sequence.
Fig. 6b is a schematic flow chart of displaying situation awareness data to be scored, as shown in fig. 6b, where a computer first judges whether situation awareness data to be scored with the same importance score exists in a display situation awareness data sequence, and if no situation awareness data to be scored with the same importance score exists in the display situation awareness data sequence, the preset number of situation awareness data to be scored is displayed according to the order of the situation awareness data to be scored in the display situation awareness data sequence.
If situation awareness data to be scored with the same importance score exists in the display situation awareness data sequence, determining at least one group of data to be ranked from the display situation awareness data sequence, wherein the importance scores of the situation awareness data to be scored in the same group of data to be ranked are the same; the importance scores of the situation awareness data to be scored in the different groups of data to be ranked are different; the importance scores of the situation awareness data to be scored in each group of the data to be ranked are different from the importance scores of the situation awareness data to be scored except all groups of the data to be ranked in the display situation awareness data sequence; the number of situation awareness data to be scored in each group of data to be ranked is multiple.
The computer then acquires the importance score of each data item and, for each situation awareness data to be scored in each group of data to be ranked, determines the ranking value of that situation awareness data according to the importance scores and the data values of its data items.
Specifically, the importance score of each data item in the situation awareness data to be scored is multiplied by the data value of that data item, and the products are summed to obtain the ranking value of the situation awareness data to be scored. The larger the ranking value, the greater the importance of the corresponding situation awareness data to be scored.
For each group of data to be ranked, the situation awareness data to be scored in the group is sorted in descending order of ranking value to obtain the updated data to be ranked corresponding to the group.
It should be noted that situation awareness data to be scored with equal ranking values may be ordered randomly or according to their generation time.
For each group of data to be ranked, the group in the display situation awareness data sequence is replaced with the updated data to be ranked corresponding to the group, so as to obtain an updated display situation awareness data sequence.
The first preset number of situation awareness data to be scored is then displayed according to the order of the situation awareness data to be scored in the updated display situation awareness data sequence.
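As a non-limiting illustration, the handling of equal importance scores can be sketched as follows. Instead of extracting and replacing groups explicitly, this sketch uses a single composite sort key, which yields the same order. The record layout, the names ranking_value and display_order, and the choice to place earlier generation times first are assumptions of this sketch.

```python
def ranking_value(record, item_importance):
    """Sum of (data item importance score x data value) over the data items."""
    return sum(item_importance[name] * value
               for name, value in record["values"].items())


def display_order(records, item_importance, preset_number):
    """Order by importance score, then ranking value, then generation time."""
    ordered = sorted(
        records,
        key=lambda r: (
            -r["score"],                          # importance score, descending
            -ranking_value(r, item_importance),   # tie-break, descending
            r["created"],                         # assumed: earlier first
        ),
    )
    return ordered[:preset_number]


item_importance = {"speed": 0.7, "distance": 0.3}
records = [
    {"id": "A", "score": 90, "created": 1, "values": {"speed": 10, "distance": 5}},
    {"id": "B", "score": 90, "created": 2, "values": {"speed": 20, "distance": 1}},
    {"id": "C", "score": 95, "created": 3, "values": {"speed": 1, "distance": 1}},
    {"id": "D", "score": 80, "created": 4, "values": {"speed": 9, "distance": 9}},
]
shown = display_order(records, item_importance, 3)
```

Records A and B share the importance score 90; B has the larger ranking value and is therefore placed ahead of A.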
According to the importance scoring method for situation awareness data, the situation awareness data to be scored with larger importance is highlighted by displaying the front preset number of the situation awareness data to be scored in the situation awareness data sequence, so that a user can quickly view the situation awareness data to be scored with larger importance.
The following are examples of the apparatus of the present application that may be used to perform the method embodiments of the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the method of the present application.
Fig. 7 is a schematic structural diagram of an embodiment of a situation awareness data importance scoring device according to the present application. As shown in fig. 7, the importance scoring apparatus 70 of the situational awareness data includes:
An acquisition module 71 for:
obtaining scoring dimension selected by a user and a plurality of situation awareness data to be scored;
Acquiring a plurality of target situation awareness data samples corresponding to the scoring dimension from a database, and importance reference scores of each target situation awareness data sample in the scoring dimension;
a processing module 72 for:
Performing text conversion on the data values of the data items in the multiple to-be-scored situation awareness data and each target situation awareness data sample to obtain updated to-be-scored situation awareness data and updated target situation awareness data;
Filling the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference score into a preset large language model prompt word template to obtain a large language model prompt word;
and the scoring module 73 is used for inputting the large language model prompt words into a preset large language model to obtain importance scores of each situation awareness data to be scored in the scoring dimension.
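As a non-limiting illustration, filling the prompt word template with the updated data and the importance reference scores can be sketched as follows. The template wording, the field names and the function fill_prompt are hypothetical; the actual preset template and the call to the large language model are not shown in this sketch.

```python
# Hypothetical template; the preset template of the embodiment is not disclosed here.
PROMPT_TEMPLATE = (
    "Scoring dimension: {dimension}\n"
    "Reference samples with importance scores:\n{examples}\n"
    "Score each of the following items on the same scale:\n{to_score}\n"
)


def fill_prompt(dimension, samples, reference_scores, to_score):
    """Fill the template with text-converted samples, their scores and new data."""
    examples = "\n".join(
        f"- {sample} -> {score}"
        for sample, score in zip(samples, reference_scores)
    )
    targets = "\n".join(f"- {item}" for item in to_score)
    return PROMPT_TEMPLATE.format(
        dimension=dimension, examples=examples, to_score=targets
    )


prompt = fill_prompt(
    "collision risk",
    ["speed: fast; distance: near"],
    [8.5],
    ["speed: slow; distance: far"],
)
```

The resulting string is what would be passed to the preset large language model by the scoring module.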
Further, the processing module 72 is specifically configured to:
For each data item in the situation awareness data to be scored, if the data value of the data item is digital, determining the corresponding text of the data value of the data item according to the corresponding relation between the preset classification value range corresponding to the data item and the text;
replacing the data value of the data item with the text to obtain updated situation awareness data to be scored;
For each data item in each target situation awareness data sample, if the data value of the data item is digital, determining the text corresponding to the data value of the data item according to the corresponding relation between the preset data value corresponding to the data item and the text;
And replacing the data value of the data item with the text to obtain an updated target situation awareness data sample.
Further, the acquisition module 71 is further configured to acquire, for each scoring dimension, a situation awareness model corresponding to the scoring dimension from a situation awareness system, where the situation awareness model is a model that determines whether to raise an alarm according to situation awareness data.
Further, the processing module 72 is further configured to:
Carrying out feature importance analysis on the situation awareness model to obtain importance scores of each data item in the situation awareness data;
Screening the data items in the situation awareness data according to the importance scores of the data items to obtain target data items;
Constructing a plurality of situation awareness data samples corresponding to the scoring dimension according to the target data item, wherein the number of the situation awareness data samples is larger than that of the samples in the preset large language model prompt word template;
acquiring a plurality of importance scores of each situation awareness data sample in the score dimension;
and screening the situation awareness data samples according to importance scores of the situation awareness data samples in the scoring dimension to obtain target situation awareness data samples corresponding to the scoring dimension, and importance reference scores of each target situation awareness data sample in the scoring dimension.
Further, the processing module 72 is further configured to:
In descending order of importance score, for each data item in the situation awareness data, judging whether the sum of the number of characters of the data item and the number of characters of the determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is less than a preset threshold, wherein the preset threshold is the difference between the context length of the preset large language model and the number of characters of the situation awareness data to be scored;
if the sum of the number of characters of the data item and the number of characters of the determined target data item is multiplied by the number of samples in the preset large language model prompt word template, and is larger than or equal to the preset threshold value, discarding the data item;
If the sum of the number of characters of the data item and the number of characters of the determined target data item is multiplied by the number of samples in the preset large language model prompt word template and is smaller than the preset threshold, judging whether the number of the determined target data items is smaller than a preset number threshold;
if the number of the determined target data items is smaller than the preset number threshold, determining the data items as target data items;
And discarding the data items if the number of the determined target data items is equal to the preset number threshold.
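As a non-limiting illustration, the screening of data items against the context length can be sketched as follows. The tuple layout (name, importance score, number of characters), the parameter names and the function select_target_items are assumptions of this sketch.

```python
def select_target_items(items, template_samples, context_length,
                        to_score_chars, max_items):
    """Pick target data items in descending importance within a character budget.

    items: list of (name, importance_score, n_chars) tuples (assumed layout).
    template_samples: number of samples in the prompt word template.
    context_length: context length of the large language model, in characters.
    to_score_chars: number of characters of the data to be scored.
    max_items: preset number threshold for target data items.
    """
    # Preset threshold: what remains of the context after the data to be scored.
    threshold = context_length - to_score_chars
    selected, used_chars = [], 0
    for name, _score, n_chars in sorted(items, key=lambda t: t[1], reverse=True):
        # Each selected item appears once per sample in the template.
        if (n_chars + used_chars) * template_samples >= threshold:
            continue  # discard: the item would not fit
        if len(selected) < max_items:
            selected.append(name)
            used_chars += n_chars
        # Otherwise the number threshold has been reached and the item is discarded.
    return selected


items = [
    ("speed", 0.50, 10),
    ("heading", 0.30, 40),
    ("draught", 0.15, 8),
    ("remarks", 0.05, 6),
]
targets = select_target_items(items, template_samples=5, context_length=300,
                              to_score_chars=100, max_items=2)
```

In this example "heading" is discarded because it would exceed the character budget, and "remarks" is discarded because the number threshold of two target data items has already been reached.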
Further, the processing module 72 is further configured to:
for each situation awareness data sample, calculating a scoring standard deviation of the situation awareness data sample according to a plurality of importance scores of the situation awareness data sample in the scoring dimension;
taking a situation awareness data sample with the scoring standard deviation smaller than a preset standard deviation threshold as a first situation awareness data sample;
Judging whether the number of the first situation awareness data samples is equal to the number of the samples in the preset large language model prompt word template or not;
If the number of the first situation awareness data samples is equal to the number of the samples in the preset large language model prompt word template, taking each first situation awareness data sample as a target situation awareness data sample corresponding to the scoring dimension;
if the number of the first situation awareness data samples is smaller than the number of the samples in the preset large language model prompt word template, selecting a first number of situation awareness data samples from the plurality of situation awareness data samples except the first situation awareness data, wherein the first number is the difference between the number of the samples in the preset large language model prompt word template and the number of the first situation awareness data samples;
Taking each first situation awareness data sample and the first number of situation awareness data samples selected as target situation awareness data samples corresponding to the scoring dimension;
If the number of the first situation awareness data samples is larger than the number of the samples in the preset large language model prompt word template, removing a second number of data from the first situation awareness data to obtain second situation awareness data, wherein the second number is the difference between the number of the first situation awareness data samples and the number of the samples in the preset large language model prompt word template;
taking each second situation awareness data sample as a target situation awareness data sample corresponding to the grading dimension;
And for each target situation awareness data sample, taking the average value of a plurality of importance scores of the target situation awareness data sample in the scoring dimension as an importance reference score of the target situation awareness data sample in the scoring dimension.
Further, the acquisition module 71 is further configured to acquire a plurality of reference situation awareness data.
Further, the processing module 72 is further configured to:
for each data item whose data value is numeric in the reference situation awareness data, acquiring a plurality of data values corresponding to the data item from the plurality of reference situation awareness data;
Clustering the plurality of data values with a clustering algorithm to obtain a plurality of data classes corresponding to the data item and the number of the data classes, wherein the number of the data classes is determined by applying the elbow method to the clustering algorithm;
For each data class, determining a classification value range corresponding to the data class according to the size of a data value in the data class;
acquiring characters corresponding to each classified value range according to the number of the data classes;
and taking each classification value range and the corresponding text as the corresponding relation between the preset classification value range and the text corresponding to the data item.
Further, the processing module 72 is further configured to sort the situation awareness data to be scored in descending order of importance score to obtain a situation awareness data sequence;
the display module 74 is configured to display the first preset number of situation awareness data to be scored in the situation awareness data sequence.
The importance scoring device for situation awareness data provided in this embodiment is configured to execute the technical scheme in any one of the foregoing method embodiments, and its implementation principle and technical effect are similar, and are not described herein again.
Fig. 8 is a schematic structural diagram of an electronic device according to the present application. As shown in fig. 8, the electronic device 80 includes:
a processor 81, a memory 82, a communication interface 83, and a display 84;
The memory 82 is used for storing executable instructions of the processor 81;
Wherein the processor 81 is configured to perform the technical solution of any of the method embodiments described above via execution of the executable instructions.
Optionally, the memory 82 may be separate from or integrated with the processor 81.
Optionally, when the memory 82 is a device separate from the processor 81, the electronic device 80 may further include:
a bus 85, wherein the display 84, the memory 82 and the communication interface 83 are connected to the processor 81 through the bus 85 and communicate with one another, and the communication interface 83 is used for communication with other devices.
Optionally, the communication interface 83 may be implemented by a transceiver. The communication interface is used to enable communication between the database access apparatus and other devices (e.g., clients, read-write libraries, and read-only libraries). The memory may include random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory.
The bus 85 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the bus is represented in the figure by only one thick line, but this does not mean that there is only one bus or one type of bus.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The electronic device is configured to execute the technical scheme in any of the foregoing method embodiments, and its implementation principle and technical effects are similar, and are not described herein again.
The embodiment of the application also provides a readable storage medium, on which a computer program is stored, which when executed by a processor implements the technical solution provided by any of the foregoing embodiments.
The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program is used for realizing the technical scheme provided by any one of the method embodiments when being executed by a processor.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features can be replaced equivalently; such modifications and substitutions do not depart from the spirit of the application.

Claims (9)

1. A method for scoring importance of situational awareness data, comprising:
obtaining scoring dimension selected by a user and a plurality of situation awareness data to be scored;
Acquiring a plurality of target situation awareness data samples corresponding to the scoring dimension from a database, and importance reference scores of each target situation awareness data sample in the scoring dimension;
Performing text conversion on the data values of the data items in the multiple to-be-scored situation awareness data and each target situation awareness data sample to obtain updated to-be-scored situation awareness data and updated target situation awareness data;
Filling the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference score into a preset large language model prompt word template to obtain a large language model prompt word;
Inputting the large language model prompt words into a preset large language model to obtain importance scores of each situation awareness data to be scored in the scoring dimension;
before the scoring dimension selected by the user and the multiple pieces of situation awareness data to be scored are obtained, the method further includes:
For each scoring dimension, acquiring a situation awareness model corresponding to the scoring dimension from a situation awareness system, wherein the situation awareness model is a model for determining whether an alarm is given according to situation awareness data;
Carrying out feature importance analysis on the situation awareness model to obtain importance scores of each data item in the situation awareness data;
Screening the data items in the situation awareness data according to the importance scores of the data items to obtain target data items;
Constructing a plurality of situation awareness data samples corresponding to the scoring dimension according to the target data item, wherein the number of the situation awareness data samples is larger than that of the samples in the preset large language model prompt word template;
acquiring a plurality of importance scores of each situation awareness data sample in the score dimension;
and screening the situation awareness data samples according to importance scores of the situation awareness data samples in the scoring dimension to obtain target situation awareness data samples corresponding to the scoring dimension, and importance reference scores of each target situation awareness data sample in the scoring dimension.
2. The method of claim 1, wherein performing text conversion on the data values of the data items in the plurality of situation awareness data to be scored and in each target situation awareness data sample to obtain updated situation awareness data to be scored and updated target situation awareness data comprises:
for each data item in the situation awareness data to be scored, if the data value of the data item is numeric, determining the text corresponding to the data value of the data item according to the preset correspondence, for that data item, between classification value ranges and text;
replacing the data value of the data item with the text to obtain updated situation awareness data to be scored;
for each data item in each target situation awareness data sample, if the data value of the data item is numeric, determining the text corresponding to the data value of the data item according to the preset correspondence, for that data item, between classification value ranges and text;
and replacing the data value of the data item with the text to obtain an updated target situation awareness data sample.
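The text conversion in this claim can be sketched as a lookup from a numeric value to the text of the classification value range that contains it. The data item name `speed_knots`, the range boundaries and the labels below are hypothetical, chosen only to make the example concrete.

```python
from bisect import bisect_right

# Hypothetical correspondence for one data item: sorted boundaries between
# classification value ranges, and one text label per range.
SPEED_BOUNDS = [5.0, 15.0]            # ranges: below 5, 5 to below 15, 15 and above
SPEED_TEXTS = ["low", "medium", "high"]

RANGE_TABLE = {"speed_knots": (SPEED_BOUNDS, SPEED_TEXTS)}


def value_to_text(item_name, value, table=RANGE_TABLE):
    """Return the text for a numeric value; leave other values unchanged."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if not is_number or item_name not in table:
        return value
    bounds, texts = table[item_name]
    return texts[bisect_right(bounds, value)]


def convert_record(record, table=RANGE_TABLE):
    """Replace numeric data values in one record with their text."""
    return {name: value_to_text(name, value, table)
            for name, value in record.items()}


converted = convert_record({"speed_knots": 12.3, "vessel_type": "cargo"})
```

The same function would be applied both to the data to be scored and to each target sample, so that the model sees the two in the same vocabulary.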
3. The method of claim 1, wherein screening the data items in the situation awareness data according to the importance scores of the data items to obtain target data items comprises:
for each data item in the situation awareness data, taken in descending order of importance score, judging whether the sum of the number of characters of the data item and the number of characters of the already determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is smaller than a preset threshold, wherein the preset threshold is the difference between the context length of the preset large language model and the number of characters of the situation awareness data to be scored;
if the sum of the number of characters of the data item and the number of characters of the already determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is greater than or equal to the preset threshold, discarding the data item;
if the sum of the number of characters of the data item and the number of characters of the already determined target data items, multiplied by the number of samples in the preset large language model prompt word template, is smaller than the preset threshold, judging whether the number of already determined target data items is smaller than a preset number threshold;
if the number of already determined target data items is smaller than the preset number threshold, determining the data item as a target data item;
and if the number of already determined target data items is equal to the preset number threshold, discarding the data item.
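The screening steps above amount to a greedy selection under a character budget. The following is a minimal sketch, assuming each data item is represented only by its character count; the function and parameter names are illustrative, not taken from the patent.

```python
def select_target_items(item_chars, importance, n_prompt_samples,
                        context_length, to_score_chars, max_items):
    """Greedy selection of data items in descending order of importance.

    item_chars maps a data item name to its number of characters.
    """
    # Preset threshold: context length minus the characters of the data to be scored.
    threshold = context_length - to_score_chars
    selected = []
    used_chars = 0
    for name in sorted(item_chars, key=lambda n: importance[n], reverse=True):
        chars = item_chars[name]
        # Every prompt sample repeats the selected items, hence the multiplication.
        if (chars + used_chars) * n_prompt_samples >= threshold:
            continue  # would not fit in the context: discard
        if len(selected) < max_items:
            selected.append(name)
            used_chars += chars
        # otherwise the preset number threshold is reached: discard
    return selected


chosen = select_target_items(
    item_chars={"a": 10, "b": 50, "c": 5, "d": 5},
    importance={"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1},
    n_prompt_samples=3,
    context_length=200,
    to_score_chars=80,
    max_items=2,
)
```

In this example the threshold is 120 characters. Item `b` is dropped because (50 + 10) × 3 = 180 exceeds it, and item `d` is dropped because two items have already been chosen.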
4. The method according to claim 1, wherein screening the plurality of situation awareness data samples according to their importance scores in the scoring dimension to obtain the target situation awareness data samples corresponding to the scoring dimension and the importance reference score of each target situation awareness data sample in the scoring dimension comprises:
for each situation awareness data sample, calculating a scoring standard deviation of the situation awareness data sample according to the plurality of importance scores of the situation awareness data sample in the scoring dimension;
taking each situation awareness data sample whose scoring standard deviation is smaller than a preset standard deviation threshold as a first situation awareness data sample;
judging whether the number of first situation awareness data samples is equal to the number of samples in the preset large language model prompt word template;
if the number of first situation awareness data samples is equal to the number of samples in the preset large language model prompt word template, taking each first situation awareness data sample as a target situation awareness data sample corresponding to the scoring dimension;
if the number of first situation awareness data samples is smaller than the number of samples in the preset large language model prompt word template, selecting a first number of situation awareness data samples from the plurality of situation awareness data samples other than the first situation awareness data samples, wherein the first number is the difference between the number of samples in the preset large language model prompt word template and the number of first situation awareness data samples;
taking each first situation awareness data sample and the selected first number of situation awareness data samples as the target situation awareness data samples corresponding to the scoring dimension;
if the number of first situation awareness data samples is greater than the number of samples in the preset large language model prompt word template, removing a second number of samples from the first situation awareness data samples to obtain second situation awareness data samples, wherein the second number is the difference between the number of first situation awareness data samples and the number of samples in the preset large language model prompt word template;
taking each second situation awareness data sample as a target situation awareness data sample corresponding to the scoring dimension;
and for each target situation awareness data sample, taking the average of the plurality of importance scores of the target situation awareness data sample in the scoring dimension as the importance reference score of the target situation awareness data sample in the scoring dimension.
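The sample screening in this claim can be sketched as follows. Two points are assumptions of this example rather than statements of the claim: the population standard deviation is used, and when samples must be added or removed, the choice is made by standard deviation (the claim does not say which samples to pick). All names are illustrative.

```python
from statistics import mean, pstdev


def choose_reference_samples(scores_by_sample, n_prompt_samples, std_threshold):
    """Pick target samples and return their importance reference scores."""
    spread = {sid: pstdev(scores) for sid, scores in scores_by_sample.items()}
    # "First" samples: raters agree closely on the score.
    stable = [sid for sid in scores_by_sample if spread[sid] < std_threshold]

    if len(stable) < n_prompt_samples:
        # Assumption: top up with the most consistent of the remaining samples.
        rest = sorted((sid for sid in scores_by_sample if sid not in stable),
                      key=lambda s: spread[s])
        chosen = stable + rest[: n_prompt_samples - len(stable)]
    elif len(stable) > n_prompt_samples:
        # Assumption: drop the least consistent of the stable samples.
        chosen = sorted(stable, key=lambda s: spread[s])[:n_prompt_samples]
    else:
        chosen = stable

    # The reference score is the average of each chosen sample's scores.
    return {sid: mean(scores_by_sample[sid]) for sid in chosen}


ratings = {
    "s1": [8, 8, 8],
    "s2": [2, 9, 5],
    "s3": [6, 7, 6],
    "s4": [1, 1, 3],
}
reference = choose_reference_samples(ratings, n_prompt_samples=2, std_threshold=0.6)
```

With a threshold of 0.6, only `s1` and `s3` count as first samples, which matches the two slots in the hypothetical template, so no topping up or trimming is needed.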
5. The method of claim 1, wherein before obtaining the scoring dimension selected by the user and the plurality of situation awareness data to be scored, the method further comprises:
acquiring a plurality of reference situation awareness data;
for each data item with a numeric data value in the reference situation awareness data, acquiring a plurality of data values corresponding to the data item from the plurality of reference situation awareness data;
clustering the plurality of data values with a clustering algorithm to obtain a plurality of data classes corresponding to the data item and the number of data classes, wherein the clustering algorithm determines the number of data classes using the elbow method;
for each data class, determining a classification value range corresponding to the data class according to the magnitudes of the data values in the data class;
acquiring the text corresponding to each classification value range according to the number of data classes;
and taking each classification value range and its corresponding text as the preset correspondence, for the data item, between classification value ranges and text.
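The construction of the range-to-text correspondence can be sketched with a small one-dimensional k-means. The claim does not fix the clustering algorithm or how the elbow is detected; here the elbow is taken as the smallest class count after which adding one more class reduces the within-class squared error by less than a fraction `tol` of the total, which is only one simple heuristic. The label sets and all names are assumptions of this example.

```python
def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means; returns the groups and their within-group squared error."""
    data = sorted(values)
    # Spread the initial centers evenly over the sorted values.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    sse = 0.0
    for g in groups:
        if g:
            m = sum(g) / len(g)
            sse += sum((v - m) ** 2 for v in g)
    return groups, sse


# Hypothetical text labels, chosen by the number of data classes.
LABELS_BY_CLASS_COUNT = {
    1: ["normal"],
    2: ["low", "high"],
    3: ["low", "medium", "high"],
    4: ["very low", "low", "high", "very high"],
}


def build_range_table(values, k_max=4, tol=0.05):
    """Cluster values, pick the class count at the elbow, and label each range."""
    results = {k: kmeans_1d(values, k) for k in range(1, k_max + 1)}
    total = results[1][1] or 1.0
    best_k = k_max
    for k in range(1, k_max):
        gain = (results[k][1] - results[k + 1][1]) / total
        if gain < tol:  # one more class barely helps: this is the elbow
            best_k = k
            break
    groups = sorted((g for g in results[best_k][0] if g), key=min)
    labels = LABELS_BY_CLASS_COUNT[len(groups)]
    return [((min(g), max(g)), label) for g, label in zip(groups, labels)]


table = build_range_table([1, 2, 3, 11, 12, 13, 21, 22, 23])
```

For the nine sample values, going from three to four classes barely reduces the error, so three classes are kept and labelled low, medium and high.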
6. The method according to claim 1, wherein the method further comprises:
sorting the situation awareness data to be scored in descending order of importance score to obtain a situation awareness data sequence;
and displaying the first preset number of situation awareness data to be scored in the situation awareness data sequence.
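The ranking and display step is a descending sort followed by a cut-off. A minimal sketch, with illustrative names:

```python
def top_scored(records, scores, display_count):
    """Return the display_count records with the highest importance scores."""
    order = sorted(range(len(records)), key=lambda i: scores[i], reverse=True)
    return [records[i] for i in order[:display_count]]


shown = top_scored(["record A", "record B", "record C"], [3, 9, 5], display_count=2)
```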
7. An importance scoring device for situation awareness data, comprising:
an acquisition module for:
obtaining a scoring dimension selected by a user and a plurality of situation awareness data to be scored;
acquiring, from a database, a plurality of target situation awareness data samples corresponding to the scoring dimension and an importance reference score of each target situation awareness data sample in the scoring dimension;
a processing module for:
performing text conversion on the data values of the data items in the plurality of situation awareness data to be scored and in each target situation awareness data sample to obtain updated situation awareness data to be scored and updated target situation awareness data;
filling the updated situation awareness data to be scored, the updated target situation awareness data and the importance reference scores into a preset large language model prompt word template to obtain a large language model prompt word;
a scoring module for inputting the large language model prompt word into a preset large language model to obtain an importance score of each situation awareness data to be scored in the scoring dimension;
wherein the processing module is further configured to, before the scoring dimension selected by the user and the plurality of situation awareness data to be scored are obtained: for each scoring dimension, acquire a situation awareness model corresponding to the scoring dimension from a situation awareness system, wherein the situation awareness model is a model that determines whether to raise an alarm according to situation awareness data;
perform feature importance analysis on the situation awareness model to obtain an importance score for each data item in the situation awareness data;
screen the data items in the situation awareness data according to the importance scores of the data items to obtain target data items;
construct, according to the target data items, a plurality of situation awareness data samples corresponding to the scoring dimension, wherein the number of situation awareness data samples is greater than the number of samples in the preset large language model prompt word template;
acquire a plurality of importance scores of each situation awareness data sample in the scoring dimension;
and screen the plurality of situation awareness data samples according to their importance scores in the scoring dimension to obtain the target situation awareness data samples corresponding to the scoring dimension and the importance reference score of each target situation awareness data sample in the scoring dimension.
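The prompt-filling step performed by the processing module can be sketched as plain string templating. The template wording, the 0 to 10 scale and all names below are assumptions of this example; the patent does not disclose the actual template text in this section.

```python
# Hypothetical prompt word template with three slots.
PROMPT_TEMPLATE = (
    "Scoring dimension: {dimension}\n"
    "Reference samples with importance scores:\n{examples}\n"
    "Score each record below from 0 to 10 for this dimension:\n{records}\n"
)


def render(record):
    """Render one text-converted record as 'name=value' pairs."""
    return ", ".join(f"{name}={value}" for name, value in record.items())


def build_prompt(dimension, samples, reference_scores, records):
    """Fill the template with reference samples, their scores and the data to score."""
    examples = "\n".join(f"- {render(s)} -> {score}"
                         for s, score in zip(samples, reference_scores))
    to_score = "\n".join(f"{i}. {render(r)}" for i, r in enumerate(records, 1))
    return PROMPT_TEMPLATE.format(dimension=dimension, examples=examples,
                                  records=to_score)


prompt = build_prompt(
    "collision risk",
    samples=[{"speed": "high"}],
    reference_scores=[8.5],
    records=[{"speed": "low"}],
)
```

The resulting string would then be passed to whichever large language model the scoring module is configured to use; that call is omitted here because the patent does not name a model or client library.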
8. An electronic device, comprising:
A processor, a memory, a communication interface;
the memory is used for storing executable instructions of the processor;
wherein the processor is configured to perform the situation awareness data importance scoring method of any one of claims 1 to 6 via execution of the executable instructions.
9. A readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the situation awareness data importance scoring method of any one of claims 1 to 6.
CN202410219180.6A 2024-02-28 2024-02-28 Importance scoring method, device, equipment and medium for situation awareness data Active CN118037121B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410219180.6A CN118037121B (en) 2024-02-28 2024-02-28 Importance scoring method, device, equipment and medium for situation awareness data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410219180.6A CN118037121B (en) 2024-02-28 2024-02-28 Importance scoring method, device, equipment and medium for situation awareness data

Publications (2)

Publication Number Publication Date
CN118037121A CN118037121A (en) 2024-05-14
CN118037121B CN118037121B (en) 2024-06-21

Family

ID=91000107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410219180.6A Active CN118037121B (en) 2024-02-28 2024-02-28 Importance scoring method, device, equipment and medium for situation awareness data

Country Status (1)

Country Link
CN (1) CN118037121B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110324336A (en) * 2019-07-02 2019-10-11 成都信息工程大学 A kind of car networking data Situation Awareness method based on network security
CN115037559A (en) * 2022-08-10 2022-09-09 中国信息通信研究院 Data safety monitoring system based on flow, electronic equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2356640A4 (en) * 2008-11-13 2012-11-14 Aser Rich Ltd System and method for improved vehicle safety through enhanced situation awareness of a driver of a vehicle
US20230229785A1 (en) * 2022-01-20 2023-07-20 Capital One Services, Llc Systems and methods for analyzing cybersecurity threat severity using machine learning
CN115277229A (en) * 2022-07-30 2022-11-01 北京冠程科技有限公司 Network security situation perception method and system
CN116279572B (en) * 2023-02-08 2024-01-30 中南大学 Vehicle safety situation assessment and steady driving mode switching method and system
CN116137098A (en) * 2023-02-23 2023-05-19 东风汽车集团股份有限公司 Vehicle situation awareness method and system based on experience synchronization
CN116781358B (en) * 2023-06-27 2024-06-07 广东为辰信息科技有限公司 Vehicle security situation layered evaluation method based on mathematical model

Also Published As

Publication number Publication date
CN118037121A (en) 2024-05-14

Similar Documents

Publication Publication Date Title
CN107016107B (en) Public opinion analysis method and system
CN109857938B (en) Searching method and searching device based on enterprise information and computer storage medium
CN112711983B (en) Nuclear analysis system, method, electronic device, and readable storage medium
CN113836128A (en) Abnormal data identification method, system, equipment and storage medium
CN105989066A (en) Information processing method and device
CN112434238A (en) Webpage quality detection method and device, electronic equipment and storage medium
CN110378739B (en) Data traffic matching method and device
CN115576834A (en) Software test multiplexing method, system, terminal and medium for supporting fault recovery
CN111178701A (en) Risk control method and device based on feature derivation technology and electronic equipment
CN118037121B (en) Importance scoring method, device, equipment and medium for situation awareness data
JPWO2016147220A1 (en) Text visualization system, text visualization method, and program
CN112132111A (en) Parking typical scene extraction method, device, storage medium and device
CN110781211B (en) Data analysis method and device
CN113824580A (en) Network index early warning method and system
KR101462858B1 (en) Methods for competency assessment of corporation for global business
CN113987240B (en) Customs inspection sample tracing method and system based on knowledge graph
CN116049644A (en) Feature screening and clustering and binning method and device, electronic equipment and storage medium
CN115329144A (en) Root cause determination method and device for product defects
CN113962558A (en) Industrial internet platform evaluation method and system based on production data management
CN114048148A (en) Crowdsourcing test report recommendation method and device and electronic equipment
CN113268494B (en) Method and device for processing database statement to be optimized
CN111784069A (en) User preference prediction method, device, equipment and storage medium
CN113554126B (en) Sample evaluation method, device, equipment and computer readable storage medium
CN106528577B (en) Method and device for setting file to be cleaned
JP2019053763A (en) Text visualization system, text visualization method and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant