CN113673421A - Loss assessment method, device and equipment based on video stream and storage medium - Google Patents


Info

Publication number: CN113673421A
Application number: CN202110954663.7A
Authority: CN (China)
Prior art keywords: target, conversation, preset, video, video stream
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventor: 程吉安
Current Assignee: Shenzhen Ping An Medical Health Technology Service Co Ltd (the listed assignees may be inaccurate)
Original Assignee: Ping An Medical and Healthcare Management Co Ltd
Application filed by Ping An Medical and Healthcare Management Co Ltd
Priority to CN202110954663.7A
Publication of CN113673421A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q 40/08: Insurance
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/26: Speech to text systems
    • G10L 25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48: Speech or voice analysis techniques specially adapted for particular use
    • G10L 25/51: Speech or voice analysis techniques specially adapted for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Signal Processing (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of artificial intelligence, and discloses a loss assessment method, device, equipment and storage medium based on video streams, which are used for improving the accuracy of claim settlement and loss assessment. The video stream-based loss assessment method comprises the following steps: performing road condition and weather identification on a plurality of video frames to obtain target road condition information and target weather information; performing accident vehicle identification on the plurality of video frames to obtain a target accident vehicle; identifying wounded persons from the plurality of video frames to obtain a target wounded person; performing audio extraction on the site survey video stream to obtain a target audio, and performing voice analysis on the target audio to obtain the conversation fluency and the conversation reasonableness; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a claim amount prediction model to predict a claim amount, so as to obtain a target claim amount. In addition, the invention also relates to blockchain technology, and the target claim amount can be stored in a blockchain node.

Description

Loss assessment method, device and equipment based on video stream and storage medium
Technical Field
The present invention relates to the field of machine learning, and in particular, to a loss assessment method, apparatus, device, and storage medium based on video streams.
Background
As the economy develops, the world becomes more closely connected, and any change may cause economic fluctuations. Individual events drive economic development, and each occurrence of an event can be an opportunity. Because of the enormous number of vehicles in the automobile market, traffic accidents occur on roads every day. These accidents strongly affect the reserves prepared for claim settlement; if a sufficient claim amount is not prepared in time after a major accident, the consequences can be severe. Rapid loss assessment and claim settlement for vehicles involved in traffic accidents are therefore very important.
In the conventional scheme, the input variables of the claims model are secondary factors, and these secondary factors carry the largest weights when the claim amount is generated. The large weight of the secondary factors causes a large deviation in the claim amount generated by the model, so the accuracy of claim settlement is low.
Disclosure of Invention
The invention provides a loss assessment method, device and equipment based on video streams and a storage medium, which are used for improving the accuracy of claim settlement and loss assessment.
The invention provides a loss assessment method based on video stream in a first aspect, which comprises the following steps: acquiring a field survey video stream to be processed, and framing the field survey video stream to obtain a plurality of video frames; calling a preset semantic segmentation model to identify road conditions and weather of the plurality of video frames to obtain target road condition information and target weather information; calling a preset target detection model to identify the accident vehicles in the plurality of video frames to obtain target accident vehicles; carrying out face recognition on the plurality of video frames to obtain a plurality of face data, and carrying out wounded person recognition on the plurality of face data to obtain a target wounded person; carrying out audio extraction on the site survey video stream to obtain a target audio, and carrying out voice analysis on the target audio to obtain the corresponding conversation fluency and conversation reasonableness of the target wounded person; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount so as to obtain a target claim settlement amount.
Optionally, in a first implementation manner of the first aspect of the present invention, the invoking a preset semantic segmentation model to identify the road condition and the weather of the plurality of video frames to obtain target road condition information and target weather information includes: extracting the characteristics of the plurality of video frames through an encoder in a preset semantic segmentation model to obtain a characteristic high-dimensional array corresponding to each video frame; performing pixel point type prediction on the characteristic high-dimensional array corresponding to each video frame through a decoder in the semantic segmentation model to obtain target road condition information; and acquiring weather information to be processed, and inputting the weather information into the semantic segmentation model for weather identification to obtain target weather information.
Optionally, in a second implementation manner of the first aspect of the present invention, the invoking a preset target detection model to perform accident vehicle identification on the plurality of video frames to obtain a target accident vehicle includes: inputting the video frames into a preset target detection model for accident vehicle identification to obtain a plurality of candidate accident vehicles; judging the static state of the candidate accident vehicles to obtain the relative sizes of the candidate accident vehicles; and carrying out equal ratio calculation on the relative sizes of the candidate accident vehicles according to a preset pixel ratio to obtain the target accident vehicle.
Optionally, in a third implementation manner of the first aspect of the present invention, the performing face recognition on the plurality of video frames to obtain a plurality of face data, and performing wounded person recognition on the plurality of face data to obtain a target wounded person includes: carrying out face recognition on the plurality of video frames to obtain a plurality of face data; inputting the plurality of face data into a preset character expression recognition model for character annotation to obtain a plurality of character labels; and assigning values to the plurality of character tags to obtain a score rank, and determining the character tag with the highest score in the score rank as a target wounded person.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the performing audio extraction on the site survey video stream to obtain a target audio, and performing voice analysis on the target audio to obtain the conversation fluency and the conversation reasonableness corresponding to the target wounded person includes: performing audio extraction on the site survey video stream to obtain a target audio, and converting the target audio into a text to obtain a target text; performing word segmentation on the target text through a preset natural language processing model to obtain a word segmentation result, and performing fluency analysis on the word segmentation result and the target audio to obtain the conversation fluency corresponding to the target wounded person; and analyzing the reasonableness of the target audio based on the conversation fluency and a preset standard audio to obtain the conversation reasonableness corresponding to the target wounded person.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency, and the conversation reasonableness into a preset claim settlement amount prediction model to perform claim settlement amount prediction to obtain the target claim settlement amount includes: performing vector conversion on the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness to obtain a target vector; and inputting the target vector into a preset claim amount prediction model, and performing logistic regression operation on the target vector through the claim amount prediction model to obtain a target claim amount.
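As a rough sketch of this implementation, the six inputs can be encoded as a numeric target vector and passed through a logistic regression. Mapping the sigmoid output to an amount by scaling against a maximum payout is an assumption for illustration only; the patent names only the logistic regression operation, and all feature encodings, weights and the maximum payout below are hypothetical.

```python
import math

def predict_claim_amount(features, weights, bias, max_amount=100_000.0):
    """features: the target vector converted from road condition, weather,
    accident vehicle, wounded person, conversation fluency and reasonableness;
    weights and bias are the trained model parameters (all values here are
    illustrative assumptions, not the patent's actual model)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return max_amount / (1.0 + math.exp(-z))  # sigmoid output scaled to an amount

# With zero weights the sigmoid is 0.5, i.e. half of the assumed maximum payout.
amount = predict_claim_amount([1.0, 0.0, 1.0], [0.0, 0.0, 0.0], 0.0)
```

A trained model would supply real weights; the point of the sketch is only the shape of the vector-in, amount-out operation.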
Optionally, in a sixth implementation manner of the first aspect of the present invention, after the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency, and the conversation reasonableness are input into a preset claim amount prediction model to perform claim amount prediction, so as to obtain a target claim amount, the method for setting loss based on video stream further includes: acquiring expression data and dialogue data of the target wounded person; generating a target satisfaction degree based on the expression data and the dialogue data, wherein the target satisfaction degree is used for indicating the satisfaction degree of the target wounded on the target claim settlement amount; and constructing a linear relation between the target satisfaction degree and the target claim amount, and adjusting the target claim amount based on the linear relation and a preset amount adjustment range.
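The linear relation and bounded adjustment in this implementation might look like the following sketch. The neutral satisfaction point, the slope, and the ±10% adjustment range are assumed values not fixed by the patent.

```python
def adjust_claim_amount(amount, satisfaction, adjust_range=0.10, slope=0.2):
    """satisfaction in [0, 1], with 0.5 treated as neutral (an assumption).
    Lower satisfaction nudges the amount upward along a linear relation,
    clamped to the preset adjustment range around the original amount."""
    delta = slope * (0.5 - satisfaction) * amount
    low, high = amount * (1 - adjust_range), amount * (1 + adjust_range)
    return min(max(amount + delta, low), high)

neutral = adjust_claim_amount(10_000.0, 0.5)   # neutral satisfaction: unchanged
raised = adjust_claim_amount(10_000.0, 0.0)    # very low satisfaction: clamped to +10%
```

The clamp is what makes the preset amount adjustment range binding regardless of how extreme the satisfaction signal is.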
A second aspect of the present invention provides a video stream-based loss assessment apparatus, comprising: an acquisition module, used for acquiring a field survey video stream to be processed and framing the field survey video stream to obtain a plurality of video frames; an information identification module, used for calling a preset semantic segmentation model to identify the road condition and weather of the video frames to obtain target road condition information and target weather information; a vehicle identification module, used for calling a preset target detection model to identify accident vehicles in the plurality of video frames to obtain a target accident vehicle; a wounded person identification module, used for performing face recognition on the video frames to obtain a plurality of face data and performing wounded person identification on the face data to obtain a target wounded person; a voice analysis module, used for performing audio extraction on the site survey video stream to obtain a target audio and performing voice analysis on the target audio to obtain the conversation fluency and conversation reasonableness corresponding to the target wounded person; and a prediction module, used for inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount so as to obtain a target claim settlement amount.
Optionally, in a first implementation manner of the second aspect of the present invention, the information identifying module is specifically configured to: extracting the characteristics of the plurality of video frames through an encoder in a preset semantic segmentation model to obtain a characteristic high-dimensional array corresponding to each video frame; performing pixel point type prediction on the characteristic high-dimensional array corresponding to each video frame through a decoder in the semantic segmentation model to obtain target road condition information; and acquiring weather information to be processed, and inputting the weather information into the semantic segmentation model for weather identification to obtain target weather information.
Optionally, in a second implementation manner of the second aspect of the present invention, the vehicle identification module is specifically configured to: inputting the video frames into a preset target detection model for accident vehicle identification to obtain a plurality of candidate accident vehicles; judging the static state of the candidate accident vehicles to obtain the relative sizes of the candidate accident vehicles; and carrying out equal ratio calculation on the relative sizes of the candidate accident vehicles according to a preset pixel ratio to obtain the target accident vehicle.
Optionally, in a third implementation manner of the second aspect of the present invention, the wounded person identification module is specifically configured to: carrying out face recognition on the plurality of video frames to obtain a plurality of face data; inputting the plurality of face data into a preset character expression recognition model for character annotation to obtain a plurality of character labels; and assigning values to the plurality of character tags to obtain a score rank, and determining the character tag with the highest score in the score rank as a target wounded person.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the speech analysis module is specifically configured to: carrying out audio extraction on the site survey video stream to obtain a target audio, and converting the target audio into a text to obtain a target text; performing word segmentation on the target text through a preset natural language processing model to obtain a word segmentation result, and performing fluency analysis on the word segmentation result and the target audio to obtain the corresponding conversation fluency of the target wounded person; and analyzing the reasonability of the target audio based on the conversation fluency and a preset standard audio to obtain the conversation reasonability corresponding to the target wounded person.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the prediction module is specifically configured to: performing vector conversion on the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness to obtain a target vector; and inputting the target vector into a preset claim amount prediction model, and performing logistic regression operation on the target vector through the claim amount prediction model to obtain a target claim amount.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the video stream-based loss assessment apparatus further includes: an adjusting module, used for acquiring expression data and conversation data of the target wounded person; generating a target satisfaction degree based on the expression data and the conversation data, wherein the target satisfaction degree is used for indicating the satisfaction of the target wounded person with the target claim settlement amount; and constructing a linear relation between the target satisfaction degree and the target claim amount, and adjusting the target claim amount based on the linear relation and a preset amount adjustment range.
A third aspect of the present invention provides loss assessment equipment based on a video stream, comprising: a memory and at least one processor, the memory having a computer program stored therein; the at least one processor invokes the computer program in the memory to cause the video stream-based loss assessment equipment to perform the video stream-based loss assessment method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored therein which, when run on a computer, causes the computer to perform the above-mentioned video stream-based loss assessment method.
According to the technical scheme provided by the invention, a to-be-processed field survey video stream is obtained, and the field survey video stream is framed to obtain a plurality of video frames; calling a preset semantic segmentation model to identify road conditions and weather of a plurality of video frames to obtain target road condition information and target weather information; calling a preset target detection model to identify accident vehicles in a plurality of video frames to obtain target accident vehicles; carrying out face recognition on the plurality of video frames to obtain a plurality of face data, and carrying out wounded person recognition on the plurality of face data to obtain a target wounded person; carrying out audio extraction on the site survey video stream to obtain a target audio, and carrying out voice analysis on the target audio to obtain the corresponding conversation fluency and conversation reasonableness of the target wounded; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount, so as to obtain the target claim settlement amount. 
In the embodiment of the invention, a plurality of video frames are obtained by performing video framing on a site survey video stream; respectively identifying road conditions, weather, accident vehicles and wounded persons for a plurality of video frames to obtain target road condition information, target weather information, target accident vehicles and target wounded persons; carrying out voice analysis on the site survey video stream to obtain the conversation fluency and the conversation reasonableness of the target wounded person; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a claim amount prediction model to predict the claim amount to obtain the target claim amount.
Drawings
Fig. 1 is a schematic diagram of an embodiment of a loss assessment method based on a video stream according to an embodiment of the present invention;
fig. 2 is a schematic diagram of another embodiment of the loss assessment method based on a video stream according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an embodiment of a loss assessment apparatus based on a video stream according to an embodiment of the present invention;
fig. 4 is a schematic diagram of another embodiment of the loss assessment apparatus based on a video stream according to an embodiment of the present invention;
fig. 5 is a schematic diagram of an embodiment of loss assessment equipment based on a video stream according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a loss assessment method, device, equipment and storage medium based on video streaming, which take multiple types of information as inputs to a claim amount prediction model so as to improve the accuracy of claim settlement and loss assessment.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For the sake of understanding, the following describes a specific flow of an embodiment of the present invention, and referring to fig. 1, a first embodiment of a loss assessment method based on a video stream according to an embodiment of the present invention includes:
101. acquiring a field survey video stream to be processed, and framing the field survey video stream to obtain a plurality of video frames;
It is to be understood that the execution subject of the present invention may be a video stream-based loss assessment apparatus, a terminal, or a server, which is not limited herein. The embodiment of the present invention is described by taking a server as the execution subject.
Specifically, the server reads a video stream recorded by a preset survey instrument through a preset decision system to obtain a to-be-processed live survey video stream, and the server decomposes the live survey video stream into a plurality of pictures through a preset video decoding algorithm to obtain a plurality of video frames, where the video decoding algorithm may be a high efficiency video coding algorithm (e.g., HEVC/h.265), and is not limited specifically here.
102. Calling a preset semantic segmentation model to identify road conditions and weather of a plurality of video frames to obtain target road condition information and target weather information;
Specifically, the server performs road condition recognition on the plurality of video frames through a preset semantic segmentation model to obtain target road condition information; the semantic segmentation model may be a residual learning model, which is not specifically limited herein. It should be noted that the principle of the semantic segmentation model is as follows: the model comprises an encoder and a decoder. The server extracts the features of the video frames through the encoder, decomposing each video frame into a corresponding high-dimensional feature array, and then processes the high-dimensional feature array corresponding to each video frame through the decoder so as to predict the classification type of each pixel point in the video frame, obtaining the target road condition information. It should also be noted that the model training process of the semantic segmentation model includes: the server collects training images carrying road condition annotation information, inputs the training images into a preset training model, extracts the features of the training images through a neural network in the training model to obtain prediction information, and calculates the loss value between the prediction information and the annotation information. When the loss value is smaller than a preset threshold, model training is completed, and the trained model is used as the semantic segmentation model.
Further, when the server performs semantic segmentation on the plurality of video frames, because the frames form a sequence with correlated information before and after each frame, the server adds the neighboring frames (for example, the (n-1)-th and (n+1)-th frames) around a given frame (the n-th frame) as supplementary information to increase recognition accuracy. The server can also splice multiple video frames into a more complete large-size picture (i.e., a large-size video frame) using additional position information, and then perform semantic segmentation and identification on the large picture to obtain a more accurate result. For example: if the server cannot obtain accurate road condition information from one part of the road, extending the video frame to the road segments before and after it allows the road condition to be judged more accurately. The server also acquires weather information to be processed from a preset weather forecast website through a preset crawler tool, and inputs the weather information into the preset semantic segmentation model for processing to obtain the target weather information.
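The decoder's per-pixel classification described in step 102 amounts to taking, for each pixel, the class with the highest score in the feature array. A minimal plain-Python sketch follows; the array layout and class meanings are illustrative assumptions, not the patent's actual model.

```python
def predict_pixel_classes(logits):
    """logits: an H x W x C nested list of per-class scores produced by the
    decoder for one video frame (a hypothetical simplified layout).
    Returns an H x W map holding the argmax class index for each pixel,
    e.g. 0 = road, 1 = vehicle -- the class indices are assumptions."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in logits
    ]

# One 1x2 "frame" with two class scores per pixel: class 1 wins at the
# first pixel, class 0 at the second.
segmentation_map = predict_pixel_classes([[[0.1, 0.9], [0.8, 0.2]]])
```

A real decoder would produce the score array from the encoder's high-dimensional features; only the final argmax step is sketched here.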
103. Calling a preset target detection model to identify accident vehicles in a plurality of video frames to obtain target accident vehicles;
Specifically, the server performs accident vehicle identification on the plurality of video frames through a preset target detection model to obtain the target accident vehicle; the target detection model may be a continuous convolution network model, which is not specifically limited herein. Further, the continuous convolution network model is composed of a continuous convolution network, which extracts different features of an input image through convolution operations: a single convolution layer can extract only low-level features such as edges, lines and corners, while stacked convolution layers iteratively extract more complex features from these low-level features. The server trains the continuous convolution network on a number of sample images to obtain the continuous convolution network model (i.e., the target detection model), where the sample images are accident vehicle images carrying vehicle annotation information. The server processes the input image through the continuous convolution network in the target detection model, identifies the target objects in the image, and marks them. Further, the server assigns each identified vehicle entity (i.e., each candidate accident vehicle) a temporary number through the target detection model; since the target accident vehicle in a vehicle insurance investigation does not move, the server achieves temporary registration of the target accident vehicle through target identification on the large-size picture.
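The static-state judgment and equal-ratio size calculation of step 103 can be sketched as follows. The selection rule (keep stationary candidates, scale their bounding boxes by a preset meters-per-pixel ratio, pick the largest) is a hypothetical reading; the patent does not fix these details, and the field names and 0.05 m/pixel ratio are assumptions.

```python
def select_target_vehicle(candidates, meters_per_pixel=0.05):
    """candidates: dicts with 'id', 'bbox_w'/'bbox_h' in pixels, and a
    'stationary' flag from the static-state judgment. Returns the id of the
    stationary candidate with the largest equal-ratio (real-world) size,
    or None if no candidate is stationary."""
    stationary = [c for c in candidates if c["stationary"]]
    if not stationary:
        return None

    def real_world_area(c):
        # Equal-ratio calculation: scale pixel sizes by the preset ratio.
        return (c["bbox_w"] * meters_per_pixel) * (c["bbox_h"] * meters_per_pixel)

    return max(stationary, key=real_world_area)["id"]

target = select_target_vehicle([
    {"id": 1, "bbox_w": 100, "bbox_h": 50, "stationary": True},
    {"id": 2, "bbox_w": 200, "bbox_h": 100, "stationary": False},  # moving: excluded
    {"id": 3, "bbox_w": 120, "bbox_h": 60, "stationary": True},
])
```

The moving candidate is excluded even though its box is the largest, mirroring the patent's premise that the target accident vehicle does not move.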
104. Carrying out face recognition on the plurality of video frames to obtain a plurality of face data, and carrying out wounded person recognition on the plurality of face data to obtain a target wounded person;
Specifically, the server performs face recognition on the plurality of video frames and temporarily registers the existing persons, i.e., all persons appearing in the video frames, to obtain a plurality of face data. The server then labels each registered person through a person expression recognition model according to the person's posture, the face data and the person's distance from the target accident vehicle, obtaining a plurality of person labels. The server assigns a value to each person label and determines the person corresponding to the label with the highest total score after assignment as the target wounded person. For example: the postures of persons A, B and C are standing, sitting and lying respectively, so A, B and C are ranked 1, 2 and 3, where the ranking orders the persons by distance from the target accident vehicle from large to small; values are then assigned in natural-number order, so A is assigned 1, B is assigned 2 and C is assigned 3. Further, the person expression recognition model grades the degree of pain into 10 levels from light to severe: the more pained the expression, the higher the score. Consciousness is classified through the person expression recognition model by whether the eyes and mouth are open: awake (eyes open, mouth open) is assigned 0, unclear (other states) is assigned 5, and coma (eyes closed, mouth closed) is assigned 10. The server adds up the assigned values and determines the person with the highest total score as the target wounded person.
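The assignment-and-ranking procedure above follows directly from the worked example (distance rank, pain grade 0-10, consciousness valued 0/5/10); a minimal sketch, with field names as assumptions:

```python
def pick_target_victim(people):
    """people: dicts with 'name', 'distance_rank' (assigned from the
    large-to-small distance ordering), 'pain_grade' (0-10 from the
    expression model) and 'consciousness' in {'awake','unclear','coma'}.
    Sums the assigned values and returns the highest-scoring person."""
    consciousness_value = {"awake": 0, "unclear": 5, "coma": 10}

    def total_score(p):
        return p["distance_rank"] + p["pain_grade"] + consciousness_value[p["consciousness"]]

    return max(people, key=total_score)["name"]

# Persons A, B, C from the example: standing/sitting/lying, increasingly severe.
victim = pick_target_victim([
    {"name": "A", "distance_rank": 1, "pain_grade": 2, "consciousness": "awake"},
    {"name": "B", "distance_rank": 2, "pain_grade": 5, "consciousness": "unclear"},
    {"name": "C", "distance_rank": 3, "pain_grade": 8, "consciousness": "coma"},
])
```

Person C (lying, high pain, comatose) accumulates the highest total and is selected, matching the intent of the worked example.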
105. Carrying out audio extraction on the site survey video stream to obtain a target audio, and carrying out voice analysis on the target audio to obtain the corresponding conversation fluency and conversation reasonableness of the target wounded;
specifically, the server imports the site survey video stream into a preset video editor for audio extraction to obtain target audio, and performs voice analysis on the target audio to obtain the conversation fluency corresponding to the target wounded person, where the voice analysis classifies the emotion of the target wounded person to obtain the conversation fluency and conversation reasonableness. Further, the server converts the target audio into text to obtain target text, performs word segmentation on the target text through a natural language processing model to obtain a word segmentation result, and compares the differences between the speech pauses of the target wounded person and the word segmentation result. The server then identifies whether the speech speed of the target wounded person slows abnormally, obtaining a recognition result that is either "abnormally slowed" or "not abnormally slowed". According to the recognition result, the server judges whether the target wounded person has a breathing disorder: when the speech speed slows abnormally, the server determines that the target wounded person has a breathing disorder and that the conversation fluency is not smooth; when the speech speed does not slow abnormally, the server determines that the target wounded person has no breathing disorder and that the conversation fluency is smooth.
After performing speech recognition on the conversation of the target wounded person, the server judges the conversation reasonableness using the conversation fluency and a preset standard audio, where the conversation reasonableness is used to indicate the consciousness condition of the target wounded person. The server compares the conversation of a target wounded person whose conversation fluency is smooth against the preset standard audio, obtaining a conversation reasonableness that is either reasonable or unreasonable. For example: when the conversation of the target wounded person is "I feel very bad now" and the preset standard audio is "I feel not good", the server compares the two and determines the conversation reasonableness of the target wounded person to be reasonable.
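The source does not specify how the comparison against the preset standard audio is performed. A minimal stand-in, under the assumption that both utterances have already been transcribed, is a word-overlap check; the threshold value is an assumption for illustration only.

```python
# Hypothetical stand-in for the "compare against preset standard
# audio" step: after speech recognition, judge the reply reasonable
# when it shares enough words with a standard expected utterance.
# The 0.3 overlap threshold is an assumed value, not from the source.

def dialog_reasonableness(recognized_text, standard_text, threshold=0.3):
    reply = set(recognized_text.lower().split())
    standard = set(standard_text.lower().split())
    if not standard:
        return "unreasonable"
    overlap = len(reply & standard) / len(standard)
    return "reasonable" if overlap >= threshold else "unreasonable"

# "i" and "feel" overlap with the standard utterance (2 of 4 words).
print(dialog_reasonableness("i feel very bad now", "i feel not good"))  # reasonable
```

A production system would more plausibly use semantic similarity rather than literal word overlap, but the control flow is the same.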
106. And inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount, so as to obtain the target claim settlement amount.
Specifically, the server encodes the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness to obtain a target vector. For example: the target vector may be [0.2, 40, 0.7, 4, 2, 101, 5], where [0.2, 40] is the conversation fluency, [0.7] is the conversation reasonableness, [4] is the target road condition information, [2] is the target weather information, [101] is the target accident vehicle and [5] is the target wounded person. The server takes the target vector as the input of the claim amount prediction model and predicts the claim amount to obtain the target claim amount. It should be noted that the claim amount prediction model is a logistic regression analysis model, whose principle includes: the independent variables may be continuous or categorical; logistic regression analysis processes the independent variables to obtain their weights, and the claim amount is generated according to those weights. The training process of the claim amount prediction model includes: obtaining a sample vector, inputting the sample vector into a logistic regression analysis model to predict the claim amount and obtain a training amount, calculating the loss value of the training amount, and adjusting the parameters of the logistic regression analysis model according to the loss value until the model converges, obtaining the claim amount prediction model.
Further, the server stores the target claim amount in a blockchain database, which is not limited herein.
In the embodiment of the invention, a plurality of video frames are obtained by performing video framing on a site survey video stream; respectively identifying road conditions, weather, accident vehicles and wounded persons for a plurality of video frames to obtain target road condition information, target weather information, target accident vehicles and target wounded persons; carrying out voice analysis on the site survey video stream to obtain the conversation fluency and the conversation reasonableness of the target wounded person; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a claim amount prediction model to predict the claim amount to obtain the target claim amount.
Referring to fig. 2, a second embodiment of the loss assessment method based on video stream according to the embodiment of the present invention includes:
201. acquiring a field survey video stream to be processed, and framing the field survey video stream to obtain a plurality of video frames;
specifically, the server receives the service loss assessment request, extracts the to-be-processed field survey video stream from the request, and frames the field survey video stream through a preset video decoding algorithm to obtain a plurality of video frames, where the video decoding algorithm may be a high efficiency video coding algorithm (HEVC, i.e. H.265), and is not limited herein.
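After decoding, not every frame need be kept. A simple sketch of selecting frames at a fixed sampling rate is shown below; the 2 frames-per-second rate is an assumed value, and the function name is hypothetical.

```python
# Sketch of choosing which decoded frames to keep: sample the survey
# video at a fixed rate given the stream's native frame rate.

def frame_indices(total_frames, native_fps, sample_fps=2.0):
    """Return the indices of frames to retain from a decoded stream."""
    step = max(1, round(native_fps / sample_fps))
    return list(range(0, total_frames, step))

# A 3-second clip at 30 fps sampled at 2 fps keeps every 15th frame.
print(frame_indices(90, 30.0))  # [0, 15, 30, 45, 60, 75]
```

In practice the decoding itself would be delegated to a library such as FFmpeg or OpenCV rather than implemented by hand.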
202. Calling a preset semantic segmentation model to identify road conditions and weather of a plurality of video frames to obtain target road condition information and target weather information;
specifically, the server adds additional position information to the current video frame (any one of the plurality of video frames); that is, the server stitches the plurality of video frames into a more complete large-size picture and performs semantic segmentation recognition on the large-size picture to obtain a more accurate recognition result.
Optionally, the server extracts features from the plurality of video frames through an encoder in a preset semantic segmentation model, obtaining a high-dimensional feature array for each video frame. Further, because consecutive video frames are ordered and their information is correlated, the server improves recognition accuracy during semantic segmentation by overlaying the frames before and after each video frame as supplementary information. The server performs pixel-class prediction on each frame's high-dimensional feature array through a decoder in the semantic segmentation model to obtain the target road condition information. The server also acquires weather information to be processed from a preset weather forecast website through a preset crawler tool, encodes the acquired weather information into a two-dimensional array, and inputs the two-dimensional array into the semantic segmentation model as one channel of the plurality of video frames for weather identification, obtaining the target weather information. For example: when the acquired weather information is "sunny", the server encodes "sunny" into a two-dimensional array and inputs it into the semantic segmentation model for weather identification, obtaining target weather information of "sunny", where the encoding of the weather information is implemented by a neural network layer of the semantic segmentation model.
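The "weather as an extra input channel" idea can be sketched as below. The label-to-value mapping is a placeholder (the source says the encoding is learned by a neural network layer), and the function names are hypothetical.

```python
# Sketch: a crawled weather label is encoded as a constant 2-D array
# the size of the frame and stacked onto the image channels, so the
# segmentation model receives weather as one extra channel.

WEATHER_CODE = {"sunny": 0.0, "cloudy": 0.5, "rainy": 1.0}  # assumed map

def weather_channel(label, height, width):
    value = WEATHER_CODE[label]
    return [[value] * width for _ in range(height)]

def stack_channels(image_channels, label):
    h = len(image_channels[0])
    w = len(image_channels[0][0])
    return image_channels + [weather_channel(label, h, w)]

frame = [[[0.1] * 4 for _ in range(3)] for _ in range(3)]  # 3 channels, 3x4
stacked = stack_channels(frame, "sunny")
print(len(stacked))  # 4 channels: RGB plus the weather channel
```

With an array library the same operation would be a single concatenation along the channel axis.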
203. Calling a preset target detection model to identify accident vehicles in a plurality of video frames to obtain target accident vehicles;
specifically, the server assigns each vehicle entity identified by the target detection model a random number as a temporary identifier. Because the target accident vehicle in a vehicle insurance survey does not move, the server registers the vehicle through target recognition on the large-size picture.
Optionally, the server inputs the plurality of video frames into a preset target detection model for accident vehicle identification, obtaining a plurality of candidate accident vehicles. Further, the server creates a large-size picture through a preset acceleration sensor and a preset gyroscope; two minutes after start-up, using preset compass information, the server establishes a polar coordinate system for the large-size picture centered on its own position with the east-west direction as the horizontal axis, and marks the positions of the vehicles identified in the picture to obtain the plurality of candidate accident vehicles, where the angle of a polar coordinate is the angle of the line connecting the surveying instrument and the ground projection of the observed target's center, and the distance is expressed as a relative distance. The server then judges whether each candidate accident vehicle is static and obtains the relative sizes of the candidate accident vehicles: the static judgment is based on the degree of change of each candidate vehicle's coordinates (calculated from the moving distance) and the degree of change of the pixel values in an image synthesized by averaging multiple superimposed frames (measured by the pixel variance of the identified region over time), and the relative size is calculated proportionally from the wheel height in the image through the pixel ratio. Finally, the server performs an equal-ratio calculation on the relative sizes of the candidate accident vehicles through a preset pixel ratio to obtain the target accident vehicle.
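The static-vehicle test above can be sketched as a two-condition check. The threshold values and function names are assumptions for illustration.

```python
# Sketch of the static-vehicle test: a candidate counts as stationary
# when both its coordinate displacement and the pixel variance of its
# region over time stay under thresholds (threshold values assumed).

def displacement(p0, p1):
    return ((p1[0] - p0[0]) ** 2 + (p1[1] - p0[1]) ** 2) ** 0.5

def pixel_variance(samples):
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def is_static(track, region_means, max_move=1.0, max_var=4.0):
    """track: list of (x, y) centers over time; region_means: mean
    pixel value of the vehicle's identified region in each frame."""
    moved = displacement(track[0], track[-1])
    return moved <= max_move and pixel_variance(region_means) <= max_var

print(is_static([(10, 10), (10.2, 10.1)], [120, 121, 119, 120]))  # True
```

A vehicle that has visibly moved between the first and last frame, or whose region flickers strongly, fails the test and is dropped from the candidates.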
204. Carrying out face recognition on the plurality of video frames to obtain a plurality of face data, and carrying out wounded person recognition on the plurality of face data to obtain a target wounded person;
specifically, the server generates a label of each registered person through a person expression recognition model to obtain a plurality of person labels, assigns values to the plurality of person labels, and determines the person corresponding to the person label with the highest score after assignment as the target wounded person.
Optionally, the server performs face recognition on the plurality of video frames to obtain a plurality of face data; the server inputs the plurality of face data into a preset person expression recognition model for person annotation to obtain a plurality of person labels; the server assigns scores to the plurality of person labels to obtain a score ranking and determines the person label with the highest score as the target wounded person. Different postures are assigned scores in natural-number order following the sequence standing, sitting, squatting, lying. For example: if the postures of persons A, B and C are standing, sitting and lying respectively, they are assigned 1, 2 and 3. The distances between the persons and the target accident vehicle are likewise sorted from largest to smallest and assigned scores in natural-number order. Further, the server grades the degree of pain from light to heavy on a 10-level scale through the person expression recognition model, with a more pained expression yielding a higher score. Consciousness is classified through the person expression recognition model according to whether the eyes and mouth are open: awake (eyes open, mouth open) is assigned 0, unclear (any other state) is assigned 5 and coma (eyes closed, mouth closed) is assigned 10. The server determines the person whose summed score is highest as the target wounded person.
205. Carrying out audio extraction on the site survey video stream to obtain a target audio, and carrying out voice analysis on the target audio to obtain the corresponding conversation fluency and conversation reasonableness of the target wounded;
specifically, the server judges the conversation reasonableness of the target wounded person through speech recognition, the conversation fluency and a preset standard audio, where the conversation reasonableness is used to indicate the consciousness condition of the target wounded person.
Optionally, the server performs audio extraction on the site survey video stream to obtain target audio and converts the target audio into text to obtain target text. The server performs word segmentation on the target text through a preset natural language processing (NLP) model to obtain a word segmentation result. For example: when the target text is "often intentionally split", the server calculates the word probabilities in the target text with the Viterbi algorithm in the NLP model, obtaining: "often" -2.3, "warp" -3, "with" -2.3, "with" -1.6, "diverge" -1.6, "see" -3, "mean" -3, "see diverge" -3, "divide" -2.3; the words with the highest probabilities are taken as the segmentation result, yielding: "often", "having", "meaning", "seeing", "diverging". The server analyzes the fluency of the word segmentation result and the target audio to obtain the conversation fluency corresponding to the target wounded person, and analyzes the reasonableness of the target audio based on the conversation fluency and the preset standard audio to obtain the conversation reasonableness corresponding to the target wounded person. Further, through a preset speech recognition algorithm, the server identifies the words in the target speech and the start and end time of each sentence, and calculates the speech rate of each sentence in words per minute, marking the sentences whose rate is below 200 words/minute. The server also calculates the coefficient of variation of the target wounded person's speech rate, i.e. the standard deviation of the per-sentence speech rates divided by their mean, and returns the marked-sentence ratio and the coefficient of variation as one group of data. During the judgment of conversation fluency, the server may prompt some commonly used simple questions to obtain relatively standardized answers and assist in judging reasonableness.
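The per-sentence speech-rate statistics can be sketched as follows; the function names are hypothetical, while the 200 words/minute threshold and the standard-deviation-over-mean definition follow the description above.

```python
# Sketch of the speech-rate statistics: rate in words per minute from
# each sentence's word count and duration, the share of sentences under
# 200 wpm, and the coefficient of variation (std dev divided by mean).

def speech_rates(sentences):
    """sentences: list of (word_count, duration_seconds)."""
    return [60.0 * words / seconds for words, seconds in sentences]

def rate_stats(sentences, slow_threshold=200.0):
    rates = speech_rates(sentences)
    mean = sum(rates) / len(rates)
    std = (sum((r - mean) ** 2 for r in rates) / len(rates)) ** 0.5
    slow_ratio = sum(r < slow_threshold for r in rates) / len(rates)
    return slow_ratio, std / mean  # (slow-sentence ratio, CV)

# Rates: 150, 100 and 225 wpm; two of three sentences are below 200 wpm.
slow_ratio, cv = rate_stats([(20, 8.0), (10, 6.0), (30, 8.0)])
print(round(slow_ratio, 3))  # 0.667
```

A high slow-sentence ratio together with a high coefficient of variation would mark the conversation fluency as not smooth.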
206. Carrying out vector conversion on target road condition information, target weather information, target accident vehicles, target wounded persons, conversation fluency and conversation reasonableness to obtain a target vector;
specifically, the server performs vector conversion on the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness to obtain a target vector, and inputs the target vector as predictors into a regression model (e.g., logistic regression; linear regression, ridge regression and the like may also be used). For example: the target vector may be [0.1, 30, 0.8, 5, 1, 100, 3], where [0.1, 30] is the conversation fluency, [0.8] is the conversation reasonableness, [5] is the target road condition information, [1] is the target weather information, [100] is the target accident vehicle and [3] is the target wounded person.
207. Inputting the target vector into a preset claim amount prediction model, and performing logistic regression operation on the target vector through the claim amount prediction model to obtain a target claim amount;
specifically, the server inputs the target vector into the claim amount prediction model and performs a logistic regression operation on the target vector through the model to obtain the target claim amount. Further, in the logistic regression operation the server creates a linear regression task for each column vector in the target vector, where each column vector contains the values of different objects for the same feature. Creating a linear regression task for each column vector specifically includes: the server corrects each column vector to obtain a regression-value vector for it; the server then determines the collinearity parameters of each column vector from its regression-value vector and a preset predicted-value vector, and returns the predicted value to obtain the target claim amount.
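The per-column step is only loosely specified. One plausible reading, shown here as a hedged sketch rather than a confirmed implementation, is to relate each feature column to the preset predicted-value vector and use the Pearson correlation as the collinearity indicator.

```python
# Hedged sketch of the per-column collinearity check: for each feature
# column, measure how linearly it tracks the preset predicted-value
# vector via the Pearson correlation coefficient. This is one plausible
# interpretation of the description, not a confirmed implementation.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

column = [1.0, 2.0, 3.0, 4.0]          # one feature across objects
predicted = [2.1, 3.9, 6.2, 7.8]       # preset predicted-value vector
print(round(pearson(column, predicted), 3))  # close to 1: near-collinear
```

Columns that are nearly collinear with the prediction carry most of the weight in the final regression.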
208. Acquiring expression data and dialogue data of a target wounded person;
specifically, the server collects the expression data of the target wounded person through a preset camera, and the server collects the dialogue data of the target wounded person through a preset voice collector.
209. Generating a target satisfaction degree based on the expression data and the dialogue data, wherein the target satisfaction degree is used for indicating the satisfaction degree of the target wounded on the target claim settlement amount;
specifically, the server acquires the expression data and dialogue data of the target wounded person. Further, the server takes 80% of the target claim amount as the initial offer and raises the offer step by step in increments of 5% of the target claim amount. During the offer process, the server collects the facial emotion of the target wounded person to obtain the expression data, and subtracts the probability of displeasure from the probability of pleasure in the emotion of the expression data to obtain a probability difference corresponding to the expression data: when the probability difference is positive the expression data is considered satisfied, and when it is negative the expression data is considered unsatisfied. The server then generates the target satisfaction based on the probability difference corresponding to the expression data and on the dialogue data. Specifically, the server recognizes the dialogue data, and when the recognized dialogue data includes words such as "satisfied" or "good", the server generates the target satisfaction according to the probability difference corresponding to the expression data. Further, when the dialogue data includes such words and the probability difference is positive, the server takes the probability difference as the target satisfaction corresponding to the target wounded person, where the target satisfaction is used to indicate the satisfaction of the target wounded person with the target claim amount.
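The satisfaction rule above can be sketched as a small function. The keyword list and the return convention for the unsatisfied case are assumptions for illustration.

```python
# Sketch of the satisfaction rule: the emotion-probability difference
# (pleasure minus displeasure) combined with a keyword check on the
# recognized dialogue. Keyword set and None-on-failure are assumed.

SATISFIED_WORDS = {"satisfied", "good", "ok"}  # assumed keyword list

def target_satisfaction(p_pleased, p_displeased, dialogue):
    diff = p_pleased - p_displeased
    words = set(dialogue.lower().split())
    if diff > 0 and words & SATISFIED_WORDS:
        return diff  # positive difference used as the satisfaction
    return None  # not judged satisfied at this offer step

print(target_satisfaction(0.7, 0.2, "that sounds good to me"))
```

During the stepwise offer process, this check would be re-run at each 5% increment until a satisfaction value is produced.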
210. And constructing a linear relation between the target satisfaction degree and the target claim amount, and adjusting the target claim amount based on the linear relation and a preset amount adjustment range.
Specifically, the server fits a regression model of price against the probability difference through linear regression to obtain the target satisfaction; the server then constructs a linear relation between the target satisfaction and the target claim amount, and adjusts the target claim amount through the linear relation and a preset amount adjustment range.
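The final adjustment can be sketched as a linear scaling clamped to a preset range. The slope and the 0.8x–1.2x range are assumed values, not taken from the source.

```python
# Sketch of the final adjustment: scale the claim amount linearly with
# the satisfaction score, then clamp to a preset adjustment range.
# Slope and range bounds are assumed values for illustration.

def adjust_claim_amount(amount, satisfaction, slope=0.1,
                        min_ratio=0.8, max_ratio=1.2):
    adjusted = amount * (1.0 + slope * satisfaction)
    low, high = amount * min_ratio, amount * max_ratio
    return min(max(adjusted, low), high)

print(adjust_claim_amount(10000.0, 0.5))  # a 5% increase, about 10500
print(adjust_claim_amount(10000.0, 5.0))  # clamped at the 1.2x cap
```

The clamp guarantees that no satisfaction score, however extreme, can move the payout outside the preset adjustment range.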
Further, the server stores the target claim amount in a blockchain database, which is not limited herein.
In the embodiment of the invention, a plurality of video frames are obtained by performing video framing on a site survey video stream; respectively identifying road conditions, weather, accident vehicles and wounded persons for a plurality of video frames to obtain target road condition information, target weather information, target accident vehicles and target wounded persons; carrying out voice analysis on the site survey video stream to obtain the conversation fluency and the conversation reasonableness of the target wounded person; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a claim amount prediction model to predict the claim amount to obtain the target claim amount.
With reference to fig. 3, the loss assessment method based on video stream in the embodiment of the present invention is described above, and a loss assessment apparatus based on video stream in the embodiment of the present invention is described below, where a first embodiment of the loss assessment apparatus based on video stream in the embodiment of the present invention includes:
an obtaining module 301, configured to obtain a field survey video stream to be processed, and perform framing on the field survey video stream to obtain multiple video frames;
the information identification module 302 is configured to call a preset semantic segmentation model to identify the road condition and the weather of the plurality of video frames, so as to obtain target road condition information and target weather information;
the vehicle identification module 303 is configured to call a preset target detection model to perform accident vehicle identification on the plurality of video frames to obtain a target accident vehicle;
the wounded person identification module 304 is configured to perform face identification on the multiple video frames to obtain multiple face data, and perform wounded person identification on the multiple face data to obtain a target wounded person;
a voice analysis module 305, configured to perform audio extraction on the site survey video stream to obtain a target audio, and perform voice analysis on the target audio to obtain the conversation fluency and conversation reasonableness corresponding to the target wounded person;
the prediction module 306 is configured to input the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount, so as to obtain a target claim settlement amount.
Further, the target claim amount is stored in the blockchain database, which is not limited herein.
In the embodiment of the invention, a plurality of video frames are obtained by performing video framing on a site survey video stream; respectively identifying road conditions, weather, accident vehicles and wounded persons for a plurality of video frames to obtain target road condition information, target weather information, target accident vehicles and target wounded persons; carrying out voice analysis on the site survey video stream to obtain the conversation fluency and the conversation reasonableness of the target wounded person; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a claim amount prediction model to predict the claim amount to obtain the target claim amount.
Referring to fig. 4, a second embodiment of the apparatus for loss estimation based on video stream according to the present invention comprises:
an obtaining module 301, configured to obtain a field survey video stream to be processed, and perform framing on the field survey video stream to obtain multiple video frames;
the information identification module 302 is configured to call a preset semantic segmentation model to identify the road condition and the weather of the plurality of video frames, so as to obtain target road condition information and target weather information;
the vehicle identification module 303 is configured to call a preset target detection model to perform accident vehicle identification on the plurality of video frames to obtain a target accident vehicle;
the wounded person identification module 304 is configured to perform face identification on the multiple video frames to obtain multiple face data, and perform wounded person identification on the multiple face data to obtain a target wounded person;
a voice analysis module 305, configured to perform audio extraction on the site survey video stream to obtain a target audio, and perform voice analysis on the target audio to obtain the conversation fluency and conversation reasonableness corresponding to the target wounded person;
the prediction module 306 is configured to input the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount, so as to obtain a target claim settlement amount.
Optionally, the information identifying module 302 is specifically configured to: extracting the characteristics of the plurality of video frames through an encoder in a preset semantic segmentation model to obtain a characteristic high-dimensional array corresponding to each video frame; performing pixel point type prediction on the characteristic high-dimensional array corresponding to each video frame through a decoder in the semantic segmentation model to obtain target road condition information; and acquiring weather information to be processed, and inputting the weather information into the semantic segmentation model for weather identification to obtain target weather information.
Optionally, the vehicle identification module 303 is specifically configured to: inputting the video frames into a preset target detection model for accident vehicle identification to obtain a plurality of candidate accident vehicles; judging the static state of the candidate accident vehicles to obtain the relative sizes of the candidate accident vehicles; and carrying out equal ratio calculation on the relative sizes of the candidate accident vehicles according to a preset pixel ratio to obtain the target accident vehicle.
Optionally, the wounded person identification module 304 is specifically configured to: carrying out face recognition on the plurality of video frames to obtain a plurality of face data; inputting the plurality of face data into a preset character expression recognition model for character annotation to obtain a plurality of character labels; and assigning values to the plurality of character tags to obtain a score rank, and determining the character tag with the highest score in the score rank as a target wounded person.
Optionally, the voice analysis module 305 is specifically configured to: carrying out audio extraction on the site survey video stream to obtain a target audio, and converting the target audio into a text to obtain a target text; performing word segmentation on the target text through a preset natural language processing model to obtain a word segmentation result, and performing fluency analysis on the word segmentation result and the target audio to obtain the corresponding conversation fluency of the target wounded person; and analyzing the reasonability of the target audio based on the conversation fluency and a preset standard audio to obtain the conversation reasonability corresponding to the target wounded person.
Optionally, the prediction module 306 is specifically configured to: performing vector conversion on the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness to obtain a target vector; and inputting the target vector into a preset claim amount prediction model, and performing logistic regression operation on the target vector through the claim amount prediction model to obtain a target claim amount.
Optionally, the apparatus for loss assessment based on video stream further includes: the adjusting module 307 is used for acquiring expression data and dialogue data of the target wounded person; generating a target satisfaction degree based on the expression data and the dialogue data, wherein the target satisfaction degree is used for indicating the satisfaction degree of the target wounded on the target claim settlement amount; and constructing a linear relation between the target satisfaction degree and the target claim amount, and adjusting the target claim amount based on the linear relation and a preset amount adjustment range.
Further, the target claim amount is stored in the blockchain database, which is not limited herein.
In the embodiment of the invention, a plurality of video frames are obtained by performing video framing on a site survey video stream; respectively identifying road conditions, weather, accident vehicles and wounded persons for a plurality of video frames to obtain target road condition information, target weather information, target accident vehicles and target wounded persons; carrying out voice analysis on the site survey video stream to obtain the conversation fluency and the conversation reasonableness of the target wounded person; and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a claim amount prediction model to predict the claim amount to obtain the target claim amount.
Fig. 3 and fig. 4 above describe the loss assessment apparatus based on video stream in the embodiment of the present invention in detail from the perspective of the modular functional entity, and the loss assessment apparatus based on video stream in the embodiment of the present invention is described in detail from the perspective of hardware processing.
Fig. 5 is a schematic structural diagram of a video stream-based loss assessment device 500 according to an embodiment of the present invention. The video stream-based loss assessment device 500 may differ considerably depending on configuration or performance, and may include one or more processors (CPUs) 510 (e.g., one or more processors), a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) storing applications 533 or data 532. The memory 520 and the storage media 530 may be transient or persistent storage. A program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a series of computer program instructions for the video stream-based loss assessment device 500. Still further, the processor 510 may be arranged to communicate with the storage medium 530 and to execute the series of computer program instructions from the storage medium 530 on the video stream-based loss assessment device 500.
The video stream-based loss assessment device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD. It will be appreciated by those skilled in the art that the device architecture shown in Fig. 5 does not constitute a limitation on video stream-based loss assessment devices, which may comprise more or fewer components than those shown, combine certain components, or arrange the components differently.
The present invention further provides a video stream-based loss assessment device comprising a memory and a processor, wherein the memory stores a computer-readable computer program which, when executed by the processor, causes the processor to execute the steps of the video stream-based loss assessment method in the embodiments described above.
The present invention also provides a computer-readable storage medium, which may be a non-volatile or a volatile computer-readable storage medium, having stored thereon a computer program which, when run on a computer, causes the computer to perform the steps of the video stream-based loss assessment method.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, each containing a batch of network transaction information used to verify the validity (anti-counterfeiting) of the information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A video stream-based loss assessment method, characterized in that the video stream-based loss assessment method comprises:
acquiring a field survey video stream to be processed, and framing the field survey video stream to obtain a plurality of video frames;
calling a preset semantic segmentation model to identify road conditions and weather of the plurality of video frames to obtain target road condition information and target weather information;
calling a preset target detection model to identify the accident vehicles in the plurality of video frames to obtain target accident vehicles;
carrying out face recognition on the plurality of video frames to obtain a plurality of face data, and carrying out wounded person recognition on the plurality of face data to obtain a target wounded person;
carrying out audio extraction on the site survey video stream to obtain a target audio, and carrying out voice analysis on the target audio to obtain the corresponding conversation fluency and conversation reasonableness of the target wounded person;
and inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount so as to obtain a target claim settlement amount.
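As an illustration of the framing step in claim 1, a decoded frame sequence (in practice read via a video decoder such as OpenCV's `cv2.VideoCapture`) can be downsampled into a manageable set of frames for the downstream recognition steps; the sampling step of 10 is an illustrative choice:

```python
def sample_frames(frame_iter, step=10):
    """Keep every step-th frame from a decoded frame iterator.

    In practice frame_iter would yield frames decoded from the site survey
    video stream; here any iterable stands in for the frames so the
    sampling logic is self-contained."""
    return [frame for i, frame in enumerate(frame_iter) if i % step == 0]
```

Uniform sampling is the simplest framing scheme; scene-change detection could equally serve as the framing criterion.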
2. The video stream-based loss assessment method according to claim 1, wherein the calling a preset semantic segmentation model to perform road condition and weather identification on the plurality of video frames to obtain target road condition information and target weather information comprises:
extracting the characteristics of the plurality of video frames through an encoder in a preset semantic segmentation model to obtain a characteristic high-dimensional array corresponding to each video frame;
performing pixel point type prediction on the characteristic high-dimensional array corresponding to each video frame through a decoder in the semantic segmentation model to obtain target road condition information;
and acquiring weather information to be processed, and inputting the weather information into the semantic segmentation model for weather identification to obtain target weather information.
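The decoder's per-pixel class prediction in claim 2 can be aggregated into target road condition information, for example by a majority vote over pixels. The class names below are illustrative assumptions:

```python
import numpy as np

def road_condition_from_segmentation(class_map, class_names):
    """Sketch of the decoding step in claim 2: the decoder's per-pixel
    class prediction (a 2-D array of class indices) is aggregated into a
    single road-condition label by taking the dominant class. The class
    vocabulary is an illustrative assumption."""
    counts = np.bincount(np.asarray(class_map).ravel(),
                         minlength=len(class_names))
    return class_names[int(np.argmax(counts))]
```

A real encoder-decoder segmentation network would produce the class map; the aggregation rule shown here is one simple way to turn dense predictions into the claimed target road condition information.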
3. The video stream-based loss assessment method according to claim 1, wherein said invoking a preset target detection model to perform accident vehicle identification on said plurality of video frames to obtain a target accident vehicle comprises:
inputting the video frames into a preset target detection model for accident vehicle identification to obtain a plurality of candidate accident vehicles;
performing stationary-state judgment on the plurality of candidate accident vehicles to obtain the relative sizes of the candidate accident vehicles;
and performing a geometric-ratio calculation on the relative sizes of the candidate accident vehicles according to a preset pixel ratio to obtain the target accident vehicle.
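The pixel-ratio calculation in claim 3 can be sketched as converting each candidate's bounding-box size in pixels to an approximate real-world size via a preset pixels-per-metre ratio, then selecting among the candidates. The ratio value and the largest-size selection rule are assumptions for illustration:

```python
def select_target_vehicle(candidates, pixels_per_meter=50.0):
    """Sketch of the size step in claim 3: each candidate's bounding box
    (x, y, w, h) in pixels is scaled by a preset pixel ratio to an
    estimated real-world size, and the candidate with the largest
    estimated size is taken as the target accident vehicle."""
    best, best_size = None, -1.0
    for box in candidates:
        w_m = box[2] / pixels_per_meter   # width in metres
        h_m = box[3] / pixels_per_meter   # height in metres
        size = w_m * h_m
        if size > best_size:
            best, best_size = box, size
    return best, best_size
```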
4. The video stream-based loss assessment method according to claim 1, wherein the performing face recognition on the plurality of video frames to obtain a plurality of face data, and performing wounded person recognition on the plurality of face data to obtain a target wounded person, comprises:
carrying out face recognition on the plurality of video frames to obtain a plurality of face data;
inputting the plurality of face data into a preset character expression recognition model for character annotation to obtain a plurality of character labels;
and assigning values to the plurality of character tags to obtain a score rank, and determining the character tag with the highest score in the score rank as a target wounded person.
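The value-assignment and ranking step in claim 4 can be sketched as scoring the character labels predicted for each detected face and picking the highest-scoring face. The label vocabulary and score table are illustrative assumptions:

```python
def pick_target_victim(labels, label_scores=None):
    """Sketch of the ranking step in claim 4: each predicted character
    label is assigned a score, the scores are accumulated per detected
    face, and the highest-scoring face is returned as the target wounded
    person. The score table is an illustrative assumption."""
    if label_scores is None:
        label_scores = {"pain": 3, "distress": 2, "calm": 0}
    # labels maps a hypothetical face identifier to its predicted labels
    ranking = {
        face: sum(label_scores.get(tag, 1) for tag in tags)
        for face, tags in labels.items()
    }
    return max(ranking, key=ranking.get), ranking
```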
5. The video stream-based loss assessment method according to claim 1, wherein the performing audio extraction on the site survey video stream to obtain a target audio, and performing voice analysis on the target audio to obtain the conversation fluency and conversation reasonableness corresponding to the target wounded person, comprises:
carrying out audio extraction on the site survey video stream to obtain a target audio, and converting the target audio into a text to obtain a target text;
performing word segmentation on the target text through a preset natural language processing model to obtain a word segmentation result, and performing fluency analysis on the word segmentation result and the target audio to obtain the corresponding conversation fluency of the target wounded person;
and analyzing the reasonability of the target audio based on the conversation fluency and a preset standard audio to obtain the conversation reasonability corresponding to the target wounded person.
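One simple instance of the fluency analysis in claim 5 is to compare the word-segmentation result of the transcribed audio against the audio duration, yielding a speech-rate-based score in [0, 1]. The reference rate of 2.5 words per second is an illustrative assumption:

```python
def conversation_fluency(tokens, audio_seconds):
    """Sketch of the fluency step in claim 5: the word-segmentation result
    of the transcribed target audio is compared with the audio duration to
    produce a simple speech-rate-based fluency score in [0, 1]. The
    reference rate is an illustrative assumption."""
    if audio_seconds <= 0:
        return 0.0
    rate = len(tokens) / audio_seconds            # words per second
    reference = 2.5                               # assumed normal rate
    return max(0.0, min(1.0, rate / reference))   # clamp to [0, 1]
```

A full implementation would also weigh pauses and disfluencies from the audio itself; speech rate is the coarsest usable signal.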
6. The video stream-based loss assessment method according to claim 1, wherein the inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model for claim settlement amount prediction to obtain a target claim settlement amount comprises:
performing vector conversion on the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness to obtain a target vector;
and inputting the target vector into a preset claim amount prediction model, and performing logistic regression operation on the target vector through the claim amount prediction model to obtain a target claim amount.
7. The video stream-based loss assessment method according to any one of claims 1 to 6, wherein after the inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model for claim settlement amount prediction to obtain a target claim settlement amount, the video stream-based loss assessment method further comprises:
acquiring expression data and dialogue data of the target wounded person;
generating a target satisfaction degree based on the expression data and the dialogue data, wherein the target satisfaction degree is used for indicating the satisfaction degree of the target wounded on the target claim settlement amount;
and constructing a linear relation between the target satisfaction degree and the target claim amount, and adjusting the target claim amount based on the linear relation and a preset amount adjustment range.
8. A video stream-based loss assessment apparatus, characterized in that the video stream-based loss assessment apparatus comprises:
the system comprises an acquisition module, a processing module and a processing module, wherein the acquisition module is used for acquiring a field survey video stream to be processed and framing the field survey video stream to obtain a plurality of video frames;
the information identification module is used for calling a preset semantic segmentation model to identify the road condition and the weather of the video frames to obtain target road condition information and target weather information;
the vehicle identification module is used for calling a preset target detection model to identify accident vehicles for the plurality of video frames to obtain target accident vehicles;
the wounded person identification module is used for carrying out face identification on the video frames to obtain a plurality of face data and carrying out wounded person identification on the face data to obtain a target wounded person;
the voice analysis module is used for carrying out audio extraction on the site survey video stream to obtain a target audio and carrying out voice analysis on the target audio to obtain the corresponding conversation fluency and the conversation reasonableness of the target wounded person;
and the prediction module is used for inputting the target road condition information, the target weather information, the target accident vehicle, the target wounded person, the conversation fluency and the conversation reasonableness into a preset claim settlement amount prediction model to predict a claim settlement amount so as to obtain a target claim settlement amount.
9. A video stream-based loss assessment device, characterized in that the video stream-based loss assessment device comprises: a memory and at least one processor, the memory having a computer program stored therein;
the at least one processor invokes the computer program in the memory to cause the video stream-based loss assessment device to perform the video stream-based loss assessment method of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the video stream-based loss assessment method according to any one of claims 1 to 7.
CN202110954663.7A 2021-08-19 2021-08-19 Loss assessment method, device and equipment based on video stream and storage medium Pending CN113673421A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110954663.7A CN113673421A (en) 2021-08-19 2021-08-19 Loss assessment method, device and equipment based on video stream and storage medium

Publications (1)

Publication Number Publication Date
CN113673421A true CN113673421A (en) 2021-11-19

Family

ID=78544001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110954663.7A Pending CN113673421A (en) 2021-08-19 2021-08-19 Loss assessment method, device and equipment based on video stream and storage medium

Country Status (1)

Country Link
CN (1) CN113673421A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114864066A (en) * 2022-07-06 2022-08-05 深圳壹家智能锁有限公司 Management method, device, equipment and storage medium of shared accompanying bed

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003078654A (en) * 2001-02-19 2003-03-14 Hitachi Kokusai Electric Inc Emergency information notifying system, and apparatus, method and moving object utilizing the emergency information notifying system
JP2006120137A (en) * 2001-02-19 2006-05-11 Hitachi Kokusai Electric Inc Image information reporting system
US20110058048A1 (en) * 2009-02-27 2011-03-10 Picosmos IL, Ltd. Apparatus, method and system for collecting and utilizing digital evidence
CN108922166A (en) * 2018-08-06 2018-11-30 何沙沙 A kind of traffic accident rescue mode based on big data
CN112085466A (en) * 2020-08-31 2020-12-15 阳光财产保险股份有限公司 Device and method for generating damage assessment result of human injury claim


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220531

Address after: 518000 China Aviation Center 2901, No. 1018, Huafu Road, Huahang community, Huaqiang North Street, Futian District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Ping An medical and Health Technology Service Co.,Ltd.

Address before: Room 12G, Area H, 666 Beijing East Road, Huangpu District, Shanghai 200001

Applicant before: PING AN MEDICAL AND HEALTHCARE MANAGEMENT Co.,Ltd.