TW202030683A - Method and apparatus for extracting claim settlement information, and electronic device


Info

Publication number: TW202030683A
Application number: TW108131514A
Authority: TW (Taiwan)
Prior art keywords: image data, image, classification, settlement, model
Other languages: Chinese (zh)
Other versions: TWI712980B (en)
Inventor: 吳博坤
Original Assignee: 香港商阿里巴巴集團服務有限公司
Application filed by 香港商阿里巴巴集團服務有限公司
Publication of TW202030683A
Application granted
Publication of TWI712980B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Multimedia (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)

Abstract

A method and apparatus for extracting claim settlement information, and an electronic device. The method comprises: obtaining an image data set related to a claim settlement case; inputting image data in the image data set into a first classification model for classification calculation, and classifying the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained based on a plurality of image data samples annotated with image categories; and respectively extracting key information for claim settlement from the image data of each image category obtained by classification.

Description

Method and apparatus for extracting claim settlement information, and electronic device

One or more embodiments of this specification relate to the field of computer application technology, and in particular to a method and apparatus for extracting claim settlement information, and an electronic device.

Today, after a car accident occurs and the vehicle owner files a report, a large number of photos usually need to be collected for claim settlement. When the corresponding claim settlement case is later processed, the person responsible for the case typically has to archive these photos manually and analyze them to obtain the claim settlement information related to the case. This not only consumes substantial human resources but also makes the extraction of claim settlement information inefficient.

This specification proposes a method for extracting claim settlement information, the method comprising: obtaining an image data set related to a claim settlement case; inputting the image data in the image data set into a first classification model for classification calculation, and classifying the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained on a number of image data samples annotated with image categories; and respectively extracting key information for claim settlement from the image data of each image category obtained by classification.

Optionally, the method further comprises: obtaining the extracted key information for claim settlement; and performing claim settlement processing based on the key information for claim settlement.

Optionally, the first classification model is a convolutional neural network (CNN) model.

Optionally, the image categories obtained by classifying the image data in the image data set include one or more of the following: certificate images; document images; scene images; damage images; other images.

Optionally, respectively extracting key information for claim settlement from the image data of each category obtained by classification comprises: if image data of the certificate images is obtained by classification, extracting, based on an optical character recognition (OCR) algorithm, information on the persons and vehicles involved in the claim settlement case from the image data of the certificate images as key information for claim settlement.

Optionally, respectively extracting key information for claim settlement from the image data of each category obtained by classification comprises: if image data of the document images is obtained by classification, extracting, based on an OCR algorithm and a natural language processing (NLP) algorithm, the liability ratios of the persons involved in the claim settlement case from the image data of the document images as key information for claim settlement.

Optionally, respectively extracting key information for claim settlement from the image data of each category obtained by classification comprises: if image data of the scene images is obtained by classification, inputting the image data of the scene images into a second classification model for classification calculation, and determining, based on the classification result, the accident type corresponding to the image data of the scene images, so as to use the accident type as key information for claim settlement, wherein the second classification model is a machine learning model trained on a number of scene image samples annotated with accident types.

Optionally, the second classification model is a CNN model.

This specification also proposes an apparatus for extracting claim settlement information, the apparatus comprising: a first acquisition module configured to obtain an image data set related to a claim settlement case; a classification module configured to input the image data in the image data set into a first classification model for classification calculation and classify the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained on a number of image data samples annotated with image categories; and an extraction module configured to respectively extract key information for claim settlement from the image data of each image category obtained by classification.

Optionally, the apparatus further comprises: a second acquisition module configured to obtain the extracted key information for claim settlement; and a claim settlement module configured to perform claim settlement processing based on the key information for claim settlement.

Optionally, the first classification model is a convolutional neural network (CNN) model, and the image categories obtained by classifying the image data in the image data set include one or more of the following: certificate images; document images; scene images; damage images; other images.

Optionally, the extraction module is specifically configured to: if image data of the certificate images is obtained by classification, extract, based on an optical character recognition (OCR) algorithm, information on the persons and vehicles involved in the claim settlement case from the image data of the certificate images as key information for claim settlement; if image data of the document images is obtained by classification, extract, based on an OCR algorithm and a natural language processing (NLP) algorithm, the liability ratios of the persons involved in the claim settlement case from the image data of the document images as key information for claim settlement; and if image data of the scene images is obtained by classification, input the image data of the scene images into a second classification model for classification calculation and determine, based on the classification result, the accident type corresponding to the image data of the scene images as key information for claim settlement, wherein the second classification model is a machine learning model trained on a number of scene image samples annotated with accident types, and optionally a CNN model.

This specification also proposes an electronic device, comprising: a processor; and a memory for storing machine-executable instructions, wherein by reading and executing the machine-executable instructions stored in the memory that correspond to control logic for extracting claim settlement information, the processor is caused to: obtain an image data set related to a claim settlement case; input the image data in the image data set into a first classification model for classification calculation, and classify the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained on a number of image data samples annotated with image categories; and respectively extract key information for claim settlement from the image data of each image category obtained by classification.

In the above technical solution, for a given claim settlement case, the image data set related to the case can be input into a classification model so that the model classifies the image data in the set, and key information for claim settlement can then be extracted automatically from the image data of each image category obtained by classification. Compared with the common approach of manually classifying and analyzing the image data of claim settlement cases, this improves the efficiency of extracting claim settlement information and reduces the consumption of human resources.

Exemplary embodiments will be described in detail here, with examples shown in the drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of this specification. On the contrary, they are merely examples of apparatuses and methods consistent with some aspects of one or more embodiments of this specification as detailed in the appended claims.

The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this specification. The singular forms "a", "said", and "the" used in this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, and so on may be used in this specification to describe various kinds of information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another.
For example, without departing from the scope of this specification, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".

This specification aims to provide a technical solution that, for a claim settlement case, classifies the image data in an image data set related to the case and respectively extracts key information for claim settlement from the image data of each category obtained by classification.

In a specific implementation, for a given claim settlement case, an image data set related to the case can be obtained first. The image data set may contain image data of at least one image category related to the case, for example: image data of certificate images; image data of document images; and image data of scene images of the car accident.

After the image data set is obtained, the image data in it can be classified based on a classification model. The classification model may be a machine learning model trained on a number of image data samples annotated with classification labels.

Key information for claim settlement can then be respectively extracted from the image data of each image category obtained by classification. For example: information on the persons involved in the claim settlement case can be extracted from the image data of the certificate images; the liability ratios of the persons involved can be extracted from the image data of the document images; and the accident type corresponding to the case can be determined based on the image data of the scene images.

In the above technical solution, for a given claim settlement case, the image data set related to the case can be input into a classification model so that the model classifies the image data in the set, and key information for claim settlement can then be extracted automatically from the image data of each image category obtained by classification. Compared with the common approach of manually classifying and analyzing the image data of claim settlement cases, this improves the efficiency of extracting claim settlement information and reduces the consumption of human resources.

The specification is described below through specific embodiments.

Please refer to Fig. 1, which is a flowchart of a method for extracting claim settlement information according to an exemplary embodiment of this specification. The method can be applied to electronic devices such as servers, mobile phones, tablet devices, notebook computers, and personal digital assistants (PDAs), and includes the following steps:

Step 102: obtain an image data set related to a claim settlement case;

Step 104: input the image data in the image data set into a first classification model for classification calculation, and classify the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained on a number of image data samples annotated with image categories;

Step 106: respectively extract key information for claim settlement from the image data of each image category obtained by classification.
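To make the flow of steps 102 to 106 concrete, the following is a minimal sketch of the overall pipeline in Python. Every name in it (classify_image, EXTRACTORS, the category strings) is a hypothetical placeholder for the models described in the embodiments below, not something prescribed by this specification.

```python
# Minimal sketch of the claim-information extraction pipeline (steps 102-106).
# All helper names are hypothetical stand-ins for the models described
# in this specification, not an actual implementation of it.

from collections import defaultdict
from typing import Callable, Dict, List

CATEGORIES = ["certificate", "document", "scene", "damage", "other"]

def classify_image(image_bytes: bytes) -> str:
    """Stand-in for the first classification model (e.g., a trained CNN)."""
    raise NotImplementedError

# One key-information extractor per image category (OCR, OCR plus NLP,
# the second classifier, ...); categories without an extractor contribute nothing.
EXTRACTORS: Dict[str, Callable[[List[bytes]], dict]] = {}

def extract_claim_info(image_data_set: List[bytes]) -> dict:
    # Step 102: the image data set related to the claim case is the input.
    # Step 104: classify every image in the set with the first model.
    by_category = defaultdict(list)
    for image in image_data_set:
        by_category[classify_image(image)].append(image)
    # Step 106: extract key information per image category.
    key_info = {}
    for category, images in by_category.items():
        extractor = EXTRACTORS.get(category)
        if extractor is not None:
            key_info[category] = extractor(images)
    return key_info
```

The per-category extractors registered in EXTRACTORS would be the OCR, OCR-plus-NLP, and second-classification-model routines described in the embodiments that follow.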
In this embodiment, for a given claim settlement case, an image data set related to the case can be obtained first. The image data set may contain image data of at least one image category related to the case.

In one illustrated implementation, the image categories of the image data contained in the image data set may include one or more of the following: certificate images; document images; scene images; damage images; other images.

For example, suppose the claim settlement case concerns a vehicle damaged in a car accident. The image data set may then contain image data of certificate images (for example, an image of the driving license of the owner of the damaged vehicle, an image of the vehicle license of the damaged vehicle, and so on); image data of document images (for example, image data of the accident liability confirmation letter); image data of scene images of the car accident; image data of damage images of the damaged vehicle; and image data that does not belong to the first four image categories (referred to as image data of other images).

In practical applications, at least one image uploaded by the user can be obtained, taken of valid certificates such as the driving license and the vehicle license, of documents such as the accident liability confirmation letter, and of the damaged parts of the accident vehicle, and the captured images can be used as the image data in the image data set related to the claim settlement case.

Alternatively, cameras deployed near the accident scene corresponding to the claim settlement case can be used to obtain at least one image captured by these cameras, and the captured images can be used as the image data in the image data set related to the case. Or the videos captured by these cameras can be obtained and the image frames in these videos extracted, so that these image frames serve as the image data in the image data set related to the case.
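As a small illustration of the last acquisition path, the following sketch samples frames from a scene-camera video using OpenCV; the sampling interval and file name are assumptions made for the example only.

```python
# Minimal sketch: extracting image frames from scene-camera video so the
# frames can join the claim case's image data set. Uses OpenCV; sampling
# every 30th frame is an illustrative assumption.

import cv2

def frames_from_video(video_path: str, every_n: int = 30):
    """Yield every n-th frame of the video as a BGR numpy array."""
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % every_n == 0:
            yield frame
        index += 1
    cap.release()

# Example: add sampled frames to the image data set for the claim case.
# image_data_set = list(frames_from_video("scene_camera.mp4"))
```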
After the image data set related to the claim settlement case is obtained, the image data in it can be input into a preset classification model (referred to as the first classification model) for classification calculation. The first classification model may be a machine learning model such as a commonly used convolutional neural network (CNN) model.

It should be noted that a preset number of image data can first be obtained from the image data sets corresponding to historical claim settlement cases (that is, claim settlement cases for which claim processing has already been completed), and classification labels can be annotated for this image data. A classification label characterizes the image category to which the image data belongs. For example, among this image data, the image data related to certificates can be annotated with "certificate image" as the classification label; the image data related to documents with "document image"; the image data related to the accident scene with "scene image"; the images related to the damaged parts of the accident vehicle with "damage image"; and the images unrelated to certificates, documents, the accident scene, and the damaged parts of the accident vehicle with "other image".

Subsequently, this image data annotated with image categories can be used as training samples, and training can be performed on these samples by back-propagation based on a preset machine learning algorithm (for example, a CNN algorithm), to obtain the first classification model used to classify the image data in the image data set related to the claim settlement case.

For example, if the preset number of image data samples is 100, then 100 images can be obtained from the image data sets corresponding to historical claim settlement cases and annotated with image categories. These 100 annotated images can then be used as training samples and trained by back-propagation based on the CNN algorithm to obtain the first classification model.

In this way, classification calculation can be performed on the image data in the image data set related to the claim settlement case based on the trained first classification model, so that the image data in the set can be classified based on the classification calculation result, that is, the image category to which the image data in the set belongs is determined. After the image categories of the image data in the image data set are determined, key information for claim settlement can be respectively extracted from the image data of each category.
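The specification does not prescribe a particular CNN or training setup. As one minimal sketch, under the assumptions that PyTorch and torchvision are available and that the annotated samples sit in one folder per image category under samples/ (a common convention, not something the specification fixes), the training loop could look like this:

```python
# Minimal sketch: training the first classification model on labeled samples.
# The data layout ("samples/<category>/...") and the choice of ResNet-18 are
# illustrative assumptions, not the specification's.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

CATEGORIES = ["certificate", "document", "scene", "damage", "other"]

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Labeled image data samples drawn from historical claim cases.
train_set = datasets.ImageFolder("samples/", transform=transform)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

# A small off-the-shelf CNN with its head replaced for the five categories.
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, len(CATEGORIES))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(10):
    for images, labels in loader:  # back-propagation over annotated samples
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```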
In one illustrated implementation, for the image data of certificate images in the image data set related to the claim settlement case, information on the persons and vehicles involved in the case can be extracted from the classified image data of the certificate images based on an OCR (Optical Character Recognition) algorithm, and the extracted information can be used as key information for claim settlement.

Specifically, the classified image data of the certificate images can be input into a character string detection model, so that the detection model obtains, based on the image data, the image regions containing target character strings. For the image data of certificate images, a target character string may be a string characterizing personal information such as a name or an ID number, or a string characterizing vehicle information such as a license plate number. The character string detection model may be a machine learning model such as a commonly used CNN model.

Similarly, image data annotated with the image regions containing target character strings can be used as training samples and trained by back-propagation based on a preset machine learning algorithm (for example, a CNN algorithm), to obtain the character string detection model used to detect the image regions containing target character strings from the image data of the certificate images.

After an image region containing a target character string is obtained, the image region can be further input into a character string recognition model, so that the recognition model recognizes the target character string in the image region to obtain the information on the persons and vehicles involved in the claim settlement case, which is then used as key information for claim settlement.

For example, the name of a person involved in the claim settlement case can be obtained by recognizing the target character string characterizing the name; the ID number of a person involved by recognizing the target character string characterizing the ID number; and the license plate number of a vehicle involved by recognizing the target character string characterizing the license plate number. The personal information such as names and ID numbers and the vehicle information such as license plate numbers obtained for the case can then be used as key information for claim settlement.

The character string recognition model may be a recurrent neural network (RNN) model based on the CTC (Connectionist Temporal Classification) loss function. Similarly, image data containing a character string and annotated with the text content corresponding to that string can be used as training samples and trained by back-propagation based on a preset machine learning algorithm (for example, an RNN algorithm based on the CTC loss function), to obtain the character string recognition model used to recognize the target character strings in the image regions.
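The specification fixes no concrete architecture for the string recognition model beyond an RNN trained with the CTC loss. The following is a minimal sketch of that combination in PyTorch; the vocabulary, feature dimensions, and random stand-in data are all illustrative assumptions.

```python
# Minimal sketch: an RNN string recognizer trained with the CTC loss.
# Shapes and vocabulary are illustrative assumptions, not the specification's.

import torch
import torch.nn as nn

VOCAB = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"  # e.g., plate/ID characters
BLANK = 0  # CTC blank index; real character classes start at 1

class StringRecognizer(nn.Module):
    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, len(VOCAB) + 1)  # +1 for blank

    def forward(self, x):               # x: (T, N, feat_dim) column features
        out, _ = self.rnn(x)
        return self.fc(out).log_softmax(dim=-1)  # (T, N, C) log-probs

model = StringRecognizer()
ctc = nn.CTCLoss(blank=BLANK)

# One illustrative training step on random stand-in data.
T, N, L = 40, 8, 7                      # time steps, batch size, label length
feats = torch.randn(T, N, 64)           # stand-in for image-column features
targets = torch.randint(1, len(VOCAB) + 1, (N, L))
loss = ctc(model(feats), targets,
           input_lengths=torch.full((N,), T, dtype=torch.long),
           target_lengths=torch.full((N,), L, dtype=torch.long))
loss.backward()
```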
In one illustrated implementation, for the image data of document images in the image data set related to the claim settlement case, the liability ratios of the persons involved in the case can be extracted from the classified image data of the document images based on an OCR algorithm and an NLP (Natural Language Processing) algorithm, and the extracted liability ratios can be used as key information for claim settlement.

Specifically, the classified image data of the document images can be input into a character string detection model, so that the detection model obtains, based on the image data, the image regions containing target character strings. For the image data of document images, a target character string may be a string characterizing the liability information of the persons involved in the claim settlement case. Taking an accident liability confirmation letter as an example, a target character string in its image data may be the string corresponding to the text that describes the liability borne by the persons involved in the accident (for example: person A bears primary liability, person B bears secondary liability, and so on). The character string detection model may be a machine learning model such as a commonly used CNN model, trained as described above on image data annotated with the image regions containing target character strings.

After an image region containing a target character string is obtained, the image region can be further input into a character string recognition model, so that the recognition model recognizes the target character string in the image region to obtain the liability information of the persons involved in the claim settlement case. The character string recognition model may likewise be an RNN model based on the CTC loss function, trained as described above on image data annotated with the text content of the contained character strings.

Further, after the liability information of the persons involved in the claim settlement case is obtained, it can be analyzed based on the NLP algorithm to obtain the liability ratios of the persons involved, and the obtained liability ratios can be used as key information for claim settlement. For example, when the recognized liability information includes "person A bears primary liability" and "person B bears secondary liability", this information can be analyzed based on the NLP algorithm to determine that the liability ratio of person A is greater than 50% and the liability ratio of person B is less than 50%.

It should be noted that the character string detection model used for the image data of the certificate images and the one used for the image data of the document images may be the same model or two different models; this specification does not limit this. Likewise, the character string recognition model used for the image regions of the certificate images and the one used for the image regions of the document images may be the same model or two different models; this specification does not limit this either.
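The specification does not pin down the NLP analysis itself. As a deliberately simple sketch, a rule-based mapping from recognized liability phrases to coarse ratio bounds could look like the following; the phrases and bounds are illustrative assumptions.

```python
# Minimal sketch: rule-based analysis of recognized liability descriptions.
# The phrase-to-ratio mapping is an illustrative assumption; a production
# system would use a proper NLP algorithm as the specification describes.

LIABILITY_RULES = [
    ("full responsibility", (1.0, 1.0)),
    ("primary responsibility", (0.5, 1.0)),   # "greater than 50%"
    ("secondary responsibility", (0.0, 0.5)), # "less than 50%"
    ("equal responsibility", (0.5, 0.5)),
    ("no responsibility", (0.0, 0.0)),
]

def liability_ratio(description: str):
    """Map a recognized sentence such as
    'person A assumes primary responsibility'
    to a (lower, upper) bound on that person's liability ratio."""
    text = description.lower()
    for phrase, bounds in LIABILITY_RULES:
        if phrase in text:
            return bounds
    return None  # unrecognized description: leave for manual review

assert liability_ratio("Person A assumes primary responsibility") == (0.5, 1.0)
assert liability_ratio("Person B assumes secondary responsibility") == (0.0, 0.5)
```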
In one illustrated implementation, for the image data of scene images in the image data set related to the claim settlement case, the classified image data of the scene images can be input into a preset classification model (referred to as the second classification model) for classification calculation. The second classification model may be a machine learning model such as a commonly used CNN model.

Similarly, the image data of scene images annotated with accident types can be used as training samples and trained by back-propagation based on a preset machine learning algorithm (for example, a CNN algorithm), to obtain the second classification model used to determine the accident type corresponding to the image data of a scene image. The accident types may include single-vehicle accidents, two-vehicle accidents, multi-vehicle accidents, and so on.

For example, among the scene images used as training samples, those in which the car accident shown involves only one vehicle can be annotated as "single-vehicle accident", those in which it involves two vehicles as "two-vehicle accident", and those in which it involves three or more vehicles as "multi-vehicle accident".

In this way, classification calculation can be performed on the image data of the scene images based on the trained second classification model, so that the image data of the scene images can be classified based on the classification calculation result, that is, the accident type of the accident corresponding to the scene images is determined.
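Once such a second model has been trained (for example, with a loop like the one sketched earlier), determining the accident type of a new scene image reduces to a single forward pass. The following sketch assumes PyTorch and an illustrative preprocessing pipeline; nothing here is mandated by the specification.

```python
# Minimal sketch: using the trained second classification model to determine
# the accident type of a scene image. Preprocessing and the class order are
# illustrative assumptions.

import torch
from torchvision import transforms
from PIL import Image

ACCIDENT_TYPES = ["single-vehicle", "two-vehicle", "multi-vehicle"]

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def accident_type(scene_image_path: str, model: torch.nn.Module) -> str:
    model.eval()
    x = preprocess(Image.open(scene_image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)                      # classification calculation
    return ACCIDENT_TYPES[int(logits.argmax(dim=1))]  # key info for claims
```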
In practical applications, the key information for claim settlement respectively extracted from the image data of each category can also be put to use. Specifically, the extracted key information for claim settlement can be obtained, and claim settlement processing can be performed based on it. For example, after the key information is obtained, it can be input into a claim settlement system installed on the electronic device, so that the system records it as the information of the claim settlement case and performs the subsequent claim settlement processing based on that information. The claim settlement system can also store the image data set related to the claim settlement case by classification result, that is, store the image data in the set according to the image category to which it belongs.

In the above technical solution, for a given claim settlement case, the image data set related to the case can be input into a classification model so that the model classifies the image data in the set, and key information for claim settlement can then be extracted automatically from the image data of each image category obtained by classification. Compared with the common approach of manually classifying and analyzing the image data of claim settlement cases, this improves the efficiency of extracting claim settlement information and reduces the consumption of human resources.

Corresponding to the foregoing embodiments of the method for extracting claim settlement information, this specification also provides embodiments of an apparatus for extracting claim settlement information. The apparatus embodiments can be applied to electronic devices and can be implemented by software, by hardware, or by a combination of software and hardware. Taking software implementation as an example, the apparatus in a logical sense is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from non-volatile memory into internal memory and running them. In terms of hardware, Fig. 2 shows a hardware structure diagram of the electronic device in which the apparatus for extracting claim settlement information is located; besides the processor, internal memory, network interface, and non-volatile memory shown in Fig. 2, the electronic device in the embodiments may also include other hardware according to the actual function of extracting claim settlement information, which is not described further here.

Please refer to Fig. 3, which is a block diagram of an apparatus for extracting claim settlement information according to an exemplary embodiment of this specification. The apparatus 30 can be applied to the electronic device shown in Fig. 2 and comprises: a first acquisition module 301 configured to obtain an image data set related to a claim settlement case; a classification module 302 configured to input the image data in the image data set into a first classification model for classification calculation and classify the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained on a number of image data samples annotated with image categories; and an extraction module 303 configured to respectively extract key information for claim settlement from the image data of each image category obtained by classification.

In this embodiment, the apparatus 30 may further comprise: a second acquisition module 304 configured to obtain the extracted key information for claim settlement; and a claim settlement module 305 configured to perform claim settlement processing based on the key information for claim settlement.

In this embodiment, the first classification model may be a convolutional neural network (CNN) model, and the image categories obtained by classifying the image data in the image data set may include one or more of the following: certificate images; document images; scene images; damage images; other images.

In this embodiment, the extraction module 303 may be specifically configured to: if image data of the certificate images is obtained by classification, extract, based on the OCR algorithm, the information on the persons and vehicles involved in the claim settlement case from the image data of the certificate images as key information for claim settlement; if image data of the document images is obtained by classification, extract, based on the OCR algorithm and the NLP algorithm, the liability ratios of the persons involved in the claim settlement case from the image data of the document images as key information for claim settlement; and if image data of the scene images is obtained by classification, input the image data of the scene images into the second classification model for classification calculation and determine, based on the classification result, the accident type corresponding to the image data of the scene images as key information for claim settlement, wherein the second classification model is a machine learning model trained on a number of scene image samples annotated with accident types and may be a CNN model.
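As a structural illustration only, the three core modules of apparatus 30 could be wired together as follows; the class and method names are hypothetical and merely mirror modules 301 to 303 described above, with the models behind them being the ones this specification describes.

```python
# Minimal structural sketch of apparatus 30 (modules 301-303). Class and
# method names are hypothetical placeholders.

from typing import List


class FirstAcquisitionModule:            # module 301
    def acquire(self, case_id: str) -> List[bytes]:
        """Obtain the image data set related to the claim case."""
        raise NotImplementedError


class ClassificationModule:              # module 302
    def classify(self, images: List[bytes]) -> dict:
        """Group images by category using the first classification model."""
        raise NotImplementedError


class ExtractionModule:                  # module 303
    def extract(self, images_by_category: dict) -> dict:
        """Extract key claim-settlement information per image category."""
        raise NotImplementedError


class ClaimInfoExtractionApparatus:      # apparatus 30
    def __init__(self, acq, cls, ext):
        self.acq, self.cls, self.ext = acq, cls, ext

    def run(self, case_id: str) -> dict:
        images = self.acq.acquire(case_id)
        return self.ext.extract(self.cls.classify(images))
```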
For the implementation processes of the functions and roles of the modules in the above apparatus, refer to the implementation processes of the corresponding steps in the above method, which are not repeated here.

Since the apparatus embodiments basically correspond to the method embodiments, reference can be made to the description of the method embodiments for the relevant parts. The apparatus embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed over multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in this specification, and those of ordinary skill in the art can understand and implement it without creative work.

The system, apparatus, or modules set forth in the above embodiments may be implemented by a computer chip or an entity, or by a product with a certain function. A typical implementation device is a computer, which may take the form of a personal computer, a notebook computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail transceiver device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

Corresponding to the foregoing embodiments of the method for extracting claim settlement information, this specification also provides an embodiment of an electronic device. The electronic device comprises a processor and a memory for storing machine-executable instructions, where the processor and the memory are usually connected to each other through an internal bus. In other possible implementations, the device may also include an external interface to communicate with other devices or components.

In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for extracting claim settlement information, the processor is caused to: obtain an image data set related to a claim settlement case; input the image data in the image data set into a first classification model for classification calculation, and classify the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained on a number of image data samples annotated with image categories; and respectively extract key information for claim settlement from the image data of each image category obtained by classification.
In this embodiment, by reading and executing those machine-executable instructions, the processor is further caused to: obtain the extracted key information for claim settlement; and perform claim settlement processing based on it.

In this embodiment, the first classification model is a convolutional neural network (CNN) model, and the image categories obtained by classifying the image data in the image data set include one or more of the following: certificate images; document images; scene images; damage images; other images.

In this embodiment, the processor is further caused to: if image data of the certificate images is obtained by classification, extract, based on the OCR algorithm, the information on the persons and vehicles involved in the claim settlement case from the image data of the certificate images as key information for claim settlement; if image data of the document images is obtained by classification, extract, based on the OCR algorithm and the NLP algorithm, the liability ratios of the persons involved in the claim settlement case from the image data of the document images as key information for claim settlement; and if image data of the scene images is obtained by classification, input the image data of the scene images into a second classification model for classification calculation and determine, based on the classification result, the accident type corresponding to the image data of the scene images as key information for claim settlement, wherein the second classification model is a machine learning model trained on a number of scene image samples annotated with accident types and is a CNN model.

Those skilled in the art will readily conceive of other embodiments of this specification after considering the specification and practicing the invention disclosed herein. This specification is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the technical field not disclosed herein. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of this specification are indicated by the following claims.

It should be understood that this specification is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of this specification is limited only by the appended claims.
The above descriptions are merely preferred embodiments of one or more embodiments of this specification and are not intended to limit them. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of protection of one or more embodiments of this specification.

102~106: steps
30: apparatus for extracting claim settlement information
301: first acquisition module
302: classification module
303: extraction module
304: second acquisition module
305: claim settlement module

Fig. 1 is a flowchart of a method for extracting claim settlement information according to an exemplary embodiment of this specification;
Fig. 2 is a hardware structure diagram of an electronic device in which an apparatus for extracting claim settlement information is located, according to an exemplary embodiment of this specification;
Fig. 3 is a block diagram of an apparatus for extracting claim settlement information according to an exemplary embodiment of this specification.

Claims (17)

1. A method for extracting claim settlement information, the method comprising: obtaining an image data set related to a claim settlement case; inputting image data in the image data set into a first classification model for classification calculation, and classifying the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained on a number of image data samples annotated with image categories; and respectively extracting key information for claim settlement from the image data of each image category obtained by classification.

2. The method according to claim 1, further comprising: obtaining the extracted key information for claim settlement; and performing claim settlement processing based on the key information for claim settlement.

3. The method according to claim 1, wherein the first classification model is a convolutional neural network (CNN) model.

4. The method according to claim 1, wherein the image categories obtained by classifying the image data in the image data set include one or more of the following: certificate images; document images; scene images; damage images; other images.

5. The method according to claim 4, wherein respectively extracting key information for claim settlement from the image data of each category obtained by classification comprises: if image data of the certificate images is obtained by classification, extracting, based on an optical character recognition (OCR) algorithm, information on the persons and vehicles involved in the claim settlement case from the image data of the certificate images as key information for claim settlement.

6. The method according to claim 4, wherein respectively extracting key information for claim settlement from the image data of each category obtained by classification comprises: if image data of the document images is obtained by classification, extracting, based on an OCR algorithm and a natural language processing (NLP) algorithm, the liability ratios of the persons involved in the claim settlement case from the image data of the document images as key information for claim settlement.
7. The method according to claim 4, wherein respectively extracting key information for claim settlement from the image data of each image category obtained by the classification comprises:
if image data of the scene image category is obtained by the classification, inputting the image data of the scene images into a second classification model for classification calculation, and determining, based on the classification result, the accident type corresponding to the image data of the scene images, so as to use the accident type as key information for claim settlement, wherein the second classification model is a machine learning model trained based on a plurality of scene image samples annotated with accident types.

8. The method according to claim 7, wherein the second classification model is a CNN model.

9. An apparatus for extracting claim settlement information, the apparatus comprising:
a first acquisition module, configured to obtain an image data set related to a claim settlement case;
a classification module, configured to input the image data in the image data set into a first classification model for classification calculation, and to classify the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained based on a plurality of image data samples annotated with image categories; and
an extraction module, configured to respectively extract key information for claim settlement from the image data of each image category obtained by the classification.

10. The apparatus according to claim 9, further comprising:
a second acquisition module, configured to obtain the extracted key information for claim settlement; and
a claim settlement module, configured to perform claim settlement processing based on the key information for claim settlement.

11. The apparatus according to claim 9, wherein the first classification model is a convolutional neural network (CNN) model.

12. The apparatus according to claim 9, wherein the image categories obtained by classifying the image data in the image data set comprise one or more of the following:
certificate images; document images; scene images; damage images; other images.

13. The apparatus according to claim 12, wherein the extraction module is specifically configured to:
if image data of the certificate image category is obtained by the classification, extract, based on an optical character recognition (OCR) algorithm, information about the persons and vehicles involved in the claim settlement case from the image data of the certificate images as key information for claim settlement.
14. The apparatus according to claim 12, wherein the extraction module is specifically configured to:
if image data of the document image category is obtained by the classification, extract, based on an OCR algorithm and a natural language processing (NLP) algorithm, the liability ratios of the persons involved in the claim settlement case from the image data of the document images as key information for claim settlement.

15. The apparatus according to claim 12, wherein the extraction module is specifically configured to:
if image data of the scene image category is obtained by the classification, input the image data of the scene images into a second classification model for classification calculation, and determine, based on the classification result, the accident type corresponding to the image data of the scene images, so as to use the accident type as key information for claim settlement, wherein the second classification model is a machine learning model trained based on a plurality of scene image samples annotated with accident types.

16. The apparatus according to claim 15, wherein the second classification model is a CNN model.

17. An electronic device, comprising:
a processor; and
a memory configured to store machine-executable instructions;
wherein, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for extracting claim settlement information, the processor is caused to:
obtain an image data set related to a claim settlement case;
input the image data in the image data set into a first classification model for classification calculation, and classify the image data in the image data set based on the classification calculation result, wherein the first classification model is a machine learning model trained based on a plurality of image data samples annotated with image categories; and
respectively extract key information for claim settlement from the image data of each image category obtained by the classification.
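To make claims 1 to 4 concrete, the following is a minimal sketch, assuming PyTorch, of what a first classification model and its classification calculation could look like; the network architecture, input size, and category label names are illustrative assumptions and are not specified by the patent.

import torch
import torch.nn as nn

# Illustrative labels matching the five categories of claim 4 (names assumed).
IMAGE_CATEGORIES = ["certificate", "document", "scene", "damage", "other"]

class FirstClassificationModel(nn.Module):
    # A small CNN (claim 3) that maps a claim photo to one of five categories.
    def __init__(self, num_classes: int = len(IMAGE_CATEGORIES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # global average pooling to 32 features
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

def classify_image_set(model: nn.Module, images: torch.Tensor) -> list:
    # Classification calculation of claim 1: a batch of N x 3 x H x W photos
    # in, one predicted image category per photo out.
    model.eval()
    with torch.no_grad():
        logits = model(images)
    return [IMAGE_CATEGORIES[i] for i in logits.argmax(dim=1).tolist()]

A model of this shape would be trained with a standard cross-entropy objective on the annotated image data samples that claim 1 describes.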
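Claims 5 to 7 (and their apparatus counterparts, claims 13 to 15) amount to a dispatch on image category. The sketch below illustrates that routing; run_ocr, extract_liability_ratios, and classify_accident_type are hypothetical placeholders, since the patent does not name concrete OCR, NLP, or model implementations.

def run_ocr(image: bytes) -> str:
    # Hypothetical stand-in for the OCR algorithm of claims 5 and 6.
    raise NotImplementedError

def extract_liability_ratios(text: str) -> dict:
    # Hypothetical stand-in for the NLP step of claim 6
    # (liability ratio per person involved).
    raise NotImplementedError

def classify_accident_type(image: bytes) -> str:
    # Hypothetical stand-in for the second classification model of claim 7.
    raise NotImplementedError

def extract_key_information(category: str, image: bytes) -> dict:
    # Route one classified photo to the extraction logic for its category.
    if category == "certificate":
        # Claim 5: OCR the certificate for person and vehicle information.
        return {"certificate_text": run_ocr(image)}
    if category == "document":
        # Claim 6: OCR first, then NLP to pull out each party's liability ratio.
        return {"liability_ratios": extract_liability_ratios(run_ocr(image))}
    if category == "scene":
        # Claim 7: a second classifier maps the scene photo to an accident type.
        return {"accident_type": classify_accident_type(image)}
    # Damage and other images: claims 5-7 define no key information for these.
    return {}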
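The second classification model of claims 7 and 15 is, per the claims, trained on scene image samples annotated with accident types. A plain supervised loop of the following shape would fit that description; this is a hedged sketch assuming PyTorch, and the accident-type labels, optimizer, and hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn

# Illustrative accident-type labels (assumed; the patent does not enumerate them).
ACCIDENT_TYPES = ["rear_end", "side_impact", "single_vehicle", "multi_vehicle"]

def train_second_model(model: nn.Module,
                       loader: torch.utils.data.DataLoader,
                       epochs: int = 5) -> nn.Module:
    # Standard supervised training: batches of scene photos paired with
    # integer accident-type labels, optimized with cross-entropy.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
    return model

At inference time, the trained model would back the classify_accident_type placeholder above, for example by taking an argmax over ACCIDENT_TYPES.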
TW108131514A 2019-01-31 2019-09-02 Claim information extraction method and device, and electronic equipment TWI712980B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910097463.7A CN109903172A (en) Claim information extraction method and device, and electronic equipment
CN201910097463.7 2019-01-31

Publications (2)

Publication Number Publication Date
TW202030683A true TW202030683A (en) 2020-08-16
TWI712980B TWI712980B (en) 2020-12-11

Family

ID=66944493

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108131514A TWI712980B (en) 2019-01-31 2019-09-02 Claim information extraction method and device, and electronic equipment

Country Status (3)

Country Link
CN (1) CN109903172A (en)
TW (1) TWI712980B (en)
WO (1) WO2020155790A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109903172A (en) * 2019-01-31 2019-06-18 阿里巴巴集团控股有限公司 Claim information extraction method and device, and electronic equipment
CN111401438B (en) * 2020-03-13 2023-08-25 德联易控科技(北京)有限公司 Image sorting method, device and system
CN111681116A (en) * 2020-04-27 2020-09-18 中国平安财产保险股份有限公司 Claims data verification method, device, equipment and storage medium
CN111680693A (en) * 2020-05-28 2020-09-18 泰康保险集团股份有限公司 Method and device for batch processing of claim settlement services
CN112132527A (en) * 2020-08-07 2020-12-25 精英数智科技股份有限公司 Claims investigation method and device based on enterprise monitoring data
CN116342300B (en) * 2023-05-26 2023-08-01 凯泰铭科技(北京)有限公司 Method, device and equipment for analyzing characteristics of insurance claim settlement personnel

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160011916A (en) * 2014-07-23 2016-02-02 삼성전자주식회사 Method and apparatus of identifying user using face recognition
JP6402653B2 (en) * 2015-03-05 2018-10-10 オムロン株式会社 Object recognition device, object recognition method, and program
CN106780048A (en) * 2016-11-28 2017-05-31 中国平安财产保险股份有限公司 Self-service claim settlement method, apparatus, and system for intelligent vehicle insurance
CN107220648B (en) * 2017-04-11 2018-06-22 平安科技(深圳)有限公司 Character recognition method and server for claim settlement documents
CN107610091A (en) * 2017-07-31 2018-01-19 阿里巴巴集团控股有限公司 Vehicle insurance image processing method, device, server and system
CN107292749A (en) * 2017-08-04 2017-10-24 平安科技(深圳)有限公司 Classification method and system for vehicle damage certificate photos, and readable storage medium
CN108986468A (en) * 2018-08-01 2018-12-11 平安科技(深圳)有限公司 Traffic accident processing method, device, computer equipment, and computer storage medium
CN109903172A (en) * 2019-01-31 2019-06-18 阿里巴巴集团控股有限公司 Claim information extraction method and device, and electronic equipment

Also Published As

Publication number Publication date
WO2020155790A1 (en) 2020-08-06
TWI712980B (en) 2020-12-11
CN109903172A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
TWI712980B (en) Claim information extraction method and device, and electronic equipment
WO2019109526A1 (en) Method and device for age recognition of face image, storage medium
WO2019120115A1 (en) Facial recognition method, apparatus, and computer apparatus
US8792722B2 (en) Hand gesture detection
CN111886842B (en) Remote user authentication using threshold-based matching
WO2017124990A1 (en) Method, system, device and readable storage medium for realizing insurance claim fraud prevention based on consistency between multiple images
WO2019033572A1 (en) Method for detecting whether face is blocked, device and storage medium
US20120027263A1 (en) Hand gesture detection
WO2019033571A1 (en) Facial feature point detection method, apparatus and storage medium
WO2020164278A1 (en) Image processing method and device, electronic equipment and readable storage medium
WO2020238353A1 (en) Data processing method and apparatus, storage medium, and electronic apparatus
US10423817B2 (en) Latent fingerprint ridge flow map improvement
WO2020220453A1 (en) Method and device for verifying certificate and certificate holder
WO2019056503A1 (en) Store monitoring evaluation method, device and storage medium
WO2021114612A1 (en) Target re-identification method and apparatus, computer device, and storage medium
WO2021031704A1 (en) Object tracking method and apparatus, computer device, and storage medium
EP2901259A1 (en) Handwritten signature detection, validation, and confirmation
CN111008576A (en) Pedestrian detection and model training and updating method, device and readable storage medium thereof
CN111738199B (en) Image information verification method, device, computing device and medium
CN114093022A (en) Activity detection device, activity detection system, and activity detection method
WO2021051602A1 (en) Lip password-based face recognition method and system, device, and storage medium
CN114663871A (en) Image recognition method, training method, device, system and storage medium
CN105184236A (en) Robot-based face identification system
WO2021049234A1 (en) Image analysis device, control method, and program
CN113553947B (en) Method and device for generating and describing multi-mode pedestrian re-recognition and electronic equipment