CN114170029A - Data processing method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN114170029A
Authority
CN
China
Prior art keywords
data
corrected
factor
factors
settlement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111423691.2A
Other languages
Chinese (zh)
Inventor
陈兴全
叶文斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Meyacom Technology Co ltd
Original Assignee
Shenzhen Meyacom Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Meyacom Technology Co ltd filed Critical Shenzhen Meyacom Technology Co ltd
Priority to CN202111423691.2A priority Critical patent/CN114170029A/en
Publication of CN114170029A publication Critical patent/CN114170029A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08Insurance

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

A data processing method, comprising: acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed, the claim factors comprising at least one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor; determining the similarity between the claim factors and each entity in a pre-established knowledge graph; performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed, the corrected data to be claimed comprising the corrected claim factors; analyzing the corrected data to be claimed according to the corrected claim factors to obtain an analysis result, the analysis result comprising at least one of a claim term and a claim scheme applicable to the corrected data to be claimed; and generating a claim risk report based on the analysis result, the claim risk report comprising at least one of a claim amount, the claim term and the claim scheme.

Description

Data processing method and device, computer equipment and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular, to a data processing method and apparatus, a computer device, and a storage medium.
Background
Currently, the claim systems of insurance companies use auxiliary functions or systems, such as OCR, to improve the efficiency of the claim process. However, because there are so many terms and features related to health, medical treatment and diseases, and because information such as health insurance liability, disease classification and medical knowledge varies in quality from one insurance company to another, prior-art claim assistance systems cannot comprehensively and accurately check the recognized professional terms, so the acquired text data is inaccurate. Most insurance companies still carry out claim adjustment manually, which is inefficient and error-prone and does not facilitate effective oversight by regulators. Moreover, because the settlement calculation is performed manually, it is difficult to review and re-check the claim process accurately and completely once the calculation result has been obtained.
Disclosure of Invention
Therefore, it is necessary to provide a data processing method and apparatus, a computer device and a storage medium that can comprehensively and accurately check claim data and support accurate review and re-checking of the claim process.
In a first aspect, the present invention provides a data processing method, including:
acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor;
determining similarity between the claim settlement factor and each entity in the knowledge graph based on a pre-established knowledge graph;
performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed comprises the corrected claim factors;
analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed;
generating a claim risk report based on the analysis result, the claim risk report including at least one of a claim amount, the claim terms, and the claim plan.
In a second aspect, the present invention provides a data processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring data to be claimed and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor;
the confirmation module is used for determining the similarity between the claim settlement factor and each entity in the knowledge graph based on the knowledge graph established in advance;
the correction module is used for correcting the content of the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed, and the corrected data to be claimed comprises the corrected claim factors;
the analysis module is used for analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed;
and the report generation module is used for generating a claim settlement risk report according to the analysis result, wherein the claim settlement risk report at least comprises one of a claim amount, a claim term and a claim plan.
In a third aspect, the present invention provides a computer apparatus comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor;
determining similarity between the claim settlement factor and each entity in the knowledge graph based on a pre-established knowledge graph;
performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed comprises the corrected claim factors;
analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed;
generating a claim risk report based on the analysis result, the claim risk report including at least one of a claim amount, the claim terms, and the claim plan.
In a fourth aspect, the present invention provides a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor;
determining similarity between the claim settlement factor and each entity in the knowledge graph based on a pre-established knowledge graph;
performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed comprises the corrected claim factors;
analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed;
generating a claim risk report based on the analysis result, the claim risk report including at least one of a claim amount, the claim terms, and the claim plan.
The application relates to a data processing method, comprising: acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed, the claim factors comprising at least one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor; determining the similarity between the claim factors and each entity in a pre-established knowledge graph; performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed, the corrected data to be claimed comprising the corrected claim factors; analyzing the corrected data to be claimed according to the corrected claim factors to obtain an analysis result, the analysis result comprising at least one of a claim term and a claim scheme applicable to the corrected data to be claimed; and generating a claim risk report based on the analysis result, the claim risk report comprising at least one of a claim amount, the claim term and the claim scheme. By establishing the knowledge graph, the method and apparatus can store a large number of professional terms from various industries; after the data to be claimed is obtained, its terms are corrected according to the knowledge graph, which improves the accuracy of the data to be claimed. Once accurate data to be claimed has been obtained, it can be analyzed to obtain the claim term that matches it and the claim amount under that claim term. After the analysis result is obtained, it can be imported into a preset claim risk report.
Claim analysis is thereby visualized, so that a user can completely and accurately view the claim terms, claim schemes, claim amounts and the like obtained by the analysis, which meets the needs of review, re-checking and the like.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from the structures shown in these drawings without creative effort.
FIG. 1 is a flow diagram of a data processing method in one embodiment;
FIG. 2a is a diagram illustrating the inclination angle of characters in a claim image relative to the horizontal direction in one embodiment;
FIG. 2b is a diagram of a claim image after rotation in one embodiment;
FIG. 3 is a flow diagram of a method of data processing in one embodiment;
FIG. 4 is a block diagram of a data processing apparatus according to an embodiment;
FIG. 5 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In one embodiment, the application provides a data processing method applicable to a claim settlement system, which performs term correction on acquired data to be claimed and obtains a data analysis result from the corrected data. It should be noted that the method is executed by a computer. After the user uploads the claim data to the claim settlement system, a computer program automatically identifies and analyzes the claim data according to the claim terms of different companies stored in the computer's storage device. This addresses the low efficiency and error-proneness of manual processing, improves the efficiency and accuracy of claim settlement, and avoids wasting manpower and material resources.
In one embodiment, as shown in fig. 1, the present application provides a data processing method, the method comprising:
Step 102, acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factors include at least one of accident factors, medical factors, physiological factors, psychological factors, and asset factors.
The data to be claimed comprises at least basic information of a user and claim factors. The basic information comprises at least name, age and gender; the claim factors comprise at least one of accident factors, medical factors, physiological factors, psychological factors and asset factors. In practical applications, the claim factors further include an insurance factor, which comprises at least an insurance category and an insurance unit.
The accident factors comprise at least one of an accident type, an accident time and an accident location. The medical factors comprise at least one of hospitalization information, surgery information, medication information and historical medical information, where the medication information specifically comprises the type of medication, the dosage specified in the package instructions and the dosage prescribed by the doctor. The physiological factors comprise at least one of an injury location and an injury grade. The psychological factors comprise at least a mental health condition. The asset factors comprise at least one of a current asset, a long-term investment, a fixed asset, an intangible asset and a deferred asset.
In a specific embodiment, the acquiring data to be claimed includes: acquiring a claim settlement image input by a user; and carrying out optical character recognition on the claim image to obtain a text recognition result, and taking the text recognition result as data to be claimed.
The claim image is a single-frame image, that is, a single picture from a photo set or a single frame from a video. It is understood that the photo set may consist of photos taken by a photographing device or of a PDF output by an image-text editing tool.
The user may input the claim image by uploading it online, or offline by placing it in the identification area of an automatic identification device of the claim system.
Optical character recognition (OCR) refers to the process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper and translates their shapes into computer characters by a character recognition method; that is, the text material is scanned, and the resulting image file is analyzed and processed to obtain the characters and layout information.
After the claim image input by the user is obtained, optical character recognition is performed on it to convert it into text information, yielding a text recognition result. The computer takes the text recognition result as the data to be claimed and processes it.
In a specific embodiment, performing optical character recognition on the claim image to obtain a text recognition result includes: acquiring the gray value of each pixel of the claim image; determining the sharpness value of the claim image from the gray values; if the sharpness value is greater than a preset sharpness threshold, performing optical character recognition on the claim image to obtain the text recognition result; and if the sharpness value is smaller than the preset sharpness threshold, returning the claim image.
In this embodiment, after the gray value of each pixel of the claim image is obtained, the sharpness value of the claim image is determined from these gray values according to a preset sharpness formula. Specifically, the formula is as follows:
$$D(f) = \sum_{y} \sum_{x} \left| f(x, y) - f(x+1, y) \right| \cdot \left| f(x, y) - f(x, y+1) \right|$$
where f(x, y) is the gray value of the pixel at (x, y) and D(f) is the sharpness value. The formula multiplies the two gray-level differences in each pixel's neighborhood and accumulates the products pixel by pixel; it is highly sensitive near the focus, so the calculated sharpness value is more accurate. The sharpness of the claim image is calculated according to this formula. If the sharpness value is greater than the preset sharpness threshold, optical character recognition is performed on the claim image to obtain a text recognition result; if the sharpness value is smaller than the preset sharpness threshold, the claim image is returned and the user is prompted to input a sharper claim image.
In this embodiment, checking the sharpness of the claim image greatly improves the accuracy of optical character recognition and thereby the accuracy of the analysis.
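The sharpness check can be illustrated with a minimal Python sketch. The function names `sharpness` and `should_run_ocr`, the threshold value and the handling of border pixels (the last row and column are skipped) are assumptions made for illustration; the source does not specify them.

```python
def sharpness(gray):
    """Sum of |f(x,y)-f(x+1,y)| * |f(x,y)-f(x,y+1)| over the image.

    `gray` is a list of rows, so gray[y][x] is the gray value f(x, y).
    The last row and column are skipped because they have no right or
    lower neighbour (an assumption about border handling).
    """
    height, width = len(gray), len(gray[0])
    total = 0
    for y in range(height - 1):
        for x in range(width - 1):
            dx = abs(gray[y][x] - gray[y][x + 1])  # horizontal difference
            dy = abs(gray[y][x] - gray[y + 1][x])  # vertical difference
            total += dx * dy
    return total


def should_run_ocr(gray, threshold):
    """True if the image is sharp enough for OCR; otherwise it is returned to the user."""
    return sharpness(gray) > threshold


# A uniform image has no gray-level differences; a checkerboard has many.
flat = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
edgy = [[0, 255, 0], [255, 0, 255], [0, 255, 0]]
hypothetical_threshold = 1000  # illustrative value only
```

With these inputs, `should_run_ocr(edgy, hypothetical_threshold)` accepts the image, while `flat` would be sent back to the user.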
In a specific embodiment, if the sharpness value is greater than the preset sharpness threshold, performing optical character recognition on the claim image to obtain a text recognition result includes: acquiring the inclination angle of the characters in the claim image relative to the horizontal direction; rotating the claim image by an angle equal to the inclination angle so that the characters in the claim image are horizontal, thereby obtaining a rotated claim image; and performing optical character recognition on the rotated claim image to obtain the text recognition result.
In this embodiment, a binary image of the claim image is obtained through the Canny algorithm. The position of any character in the claim image is then determined and the edges of the character are detected, giving the bounding rectangle of the character region and the bottom edge of that rectangle. The inclination angle between this bottom edge and the horizontal direction is determined and taken as the inclination angle of the characters in the claim image relative to the horizontal direction. Once this angle is known, the claim image can be rotated by an equal angle so that the characters are horizontal. For example, as shown in fig. 2a, after the binary image of the claim image is obtained, the position of the character "bao" is determined, and the bounding rectangle of "bao" and its bottom edge are detected. The inclination angle α between the bottom edge and the horizontal direction is taken as the inclination angle of the characters in the claim image; after the claim image is rotated clockwise by the angle α, the rotated claim image is obtained, as shown in fig. 2b.
In this embodiment, in addition to checking the sharpness of the claim image, it is determined whether the characters in the claim image are horizontal, and the image is rotated so that they are. This greatly improves the accuracy of character recognition and further improves the accuracy of data analysis.
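The geometry of the leveling step can be sketched as follows. The sketch covers only the angle computation and the rotation of a point, not edge detection or image resampling; the function names and the choice of a coordinate system with y pointing up are assumptions for illustration.

```python
import math


def inclination_angle(bottom_left, bottom_right):
    """Angle in degrees between the rectangle's bottom edge and the horizontal.

    Points are (x, y) pairs with y pointing up (an assumption; image
    coordinates usually have y pointing down, which flips the sign).
    """
    dx = bottom_right[0] - bottom_left[0]
    dy = bottom_right[1] - bottom_left[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_clockwise(point, angle_deg):
    """Rotate an (x, y) point clockwise about the origin by angle_deg."""
    a = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(a) + y * math.sin(a),
            -x * math.sin(a) + y * math.cos(a))


# Hypothetical bottom edge of a character's bounding rectangle.
left, right = (0.0, 0.0), (10.0, 10.0)
alpha = inclination_angle(left, right)
leveled = rotate_clockwise(right, alpha)  # the edge now lies on the x-axis
```

Rotating every pixel coordinate by the same angle α is what makes the text line horizontal; a real implementation would delegate this to an image library.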
In a specific embodiment, before optical character recognition is performed on the rotated claim image to obtain a text recognition result, the method includes: acquiring a chroma value of the rotated claim image; if the chroma value is greater than a first preset chroma threshold, reducing the chroma value to the first preset chroma threshold; and if the chroma value is lower than a second preset chroma threshold, increasing the chroma value to the second preset chroma threshold.
A color is commonly represented by luminance and chrominance. Chrominance is the property of a color apart from its luminance and reflects the hue and saturation of the color. The chroma value quantifies the chrominance.
The first preset chroma threshold is greater than the second preset chroma threshold.
In practical applications, the chroma values of over-exposed areas of an image are significantly higher, and the chroma values of the shading and watermarks of an image are generally significantly lower. Therefore, in this embodiment, the claim image that has passed the sharpness check and the leveling step undergoes a chroma check and is adjusted according to the result to remove over-exposed areas, shading, watermarks and the like. This completes the de-noising of the claim image and further improves the accuracy of character recognition and data analysis.
In this embodiment, after the chroma value of the rotated claim image is obtained, it is compared with the first and second preset chroma thresholds. If the chroma value is greater than the first preset chroma threshold, which indicates that the claim image has an over-exposed area, the chroma value is reduced to the first preset chroma threshold; if the chroma value is lower than the second preset chroma threshold, which indicates that the claim image has shading or a watermark, the chroma value is increased to the second preset chroma threshold.
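The two-threshold adjustment amounts to clamping the chroma value into a range. The sketch below assumes the adjustment is applied to each chroma value independently; the function name, parameter names and threshold values are hypothetical.

```python
def clamp_chroma(chroma, upper, lower):
    """Clamp a chroma value into [lower, upper].

    `upper` stands for the first preset chroma threshold and `lower` for
    the second; the source requires the first to exceed the second.
    """
    if upper <= lower:
        raise ValueError("first threshold must exceed second threshold")
    if chroma > upper:
        return upper  # over-exposed: reduce to the first threshold
    if chroma < lower:
        return lower  # shading or watermark: raise to the second threshold
    return chroma


# Illustrative chroma values and thresholds (not taken from the source).
values = [250, 128, 5]
adjusted = [clamp_chroma(c, upper=200, lower=30) for c in values]
```

Values already between the two thresholds pass through unchanged, so only over-exposed and faint regions are altered.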
Step 104, determining the similarity between the claim factors and each entity in the knowledge graph based on the pre-established knowledge graph.
The knowledge graph comprises industry corpora such as medical, financial and insurance industry corpora, and comprises at least one of a disease database, a medical insurance catalogue database, a hospital and department database, a medicine database, a clinical diagnosis and treatment pathway database, a medical instrument database and a financial database.
In a specific embodiment, determining the similarity between the claim factors and each entity in the knowledge graph comprises: identifying the number of characters and the keywords in a claim factor, and determining the similarity between the claim factor and each entity in the knowledge graph according to the number of characters and the keywords. For example, if the claim factor is 99 Ganmaoling, the number of characters identified is 5 (counted in the original Chinese text) and the keywords are Ganmaoling and 99; the similarity between the claim factor and each entity in the knowledge graph can then be determined according to 99, Ganmaoling and the character count of 5.
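The source does not give a formula for combining character count and keywords, so the following Python sketch uses a hypothetical scoring rule: half of the score comes from the fraction of keywords found in the entity and half from how close the two character counts are. The function name, the equal weighting and the sample entities are all assumptions.

```python
def similarity(factor, keywords, entity):
    """Score in [0, 1] combining keyword hits and character-count closeness.

    The 50/50 weighting is an illustrative assumption, not the patent's rule.
    """
    hits = sum(1 for kw in keywords if kw in entity)
    keyword_score = hits / len(keywords) if keywords else 0.0
    longest = max(len(factor), len(entity))
    if longest == 0:
        length_score = 0.0
    else:
        length_score = 1.0 - abs(len(factor) - len(entity)) / longest
    return 0.5 * keyword_score + 0.5 * length_score


factor = "99 Ganmaoling"
keywords = ["99", "Ganmaoling"]
# Hypothetical knowledge-graph entities.
entities = ["999 Ganmaoling", "Amoxicillin capsules", "Insulin"]
scores = {e: similarity(factor, keywords, e) for e in entities}
best = max(scores, key=scores.get)
```

Here the entity that contains both keywords and has nearly the same length as the claim factor receives the highest score.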
Step 106, performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed comprises the corrected claim factors.
Based on the similarity between the claim factors and each entity in the knowledge graph, the industry terms in the text recognition result can be corrected; specifically, correct terms from the knowledge graph replace wrong terms in the text recognition result.
After the similarity between the claim factor and each entity in the knowledge graph is obtained, the similarities are ranked and the entity with the highest similarity to the claim factor is selected. That entity replaces the erroneous claim factor in the text recognition result, which completes the correction of the data to be claimed. This improves the accuracy with which professional terms are recognized during data processing and avoids claim scene recognition errors caused by inaccurately recognized terms. For example, if the claim factor is 99 Ganmaoling and the entity with the highest similarity in the knowledge graph is 999 Ganmaoling, 999 Ganmaoling replaces the erroneous 99 Ganmaoling.
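The rank-and-replace step can be sketched in Python. As a stand-in for the similarity measure, this sketch uses `difflib.SequenceMatcher` from the standard library, which is a swapped-in technique and not the method described in the source; the `min_similarity` cutoff, the function names and the sample entities are likewise assumptions.

```python
import difflib


def ratio(a, b):
    """String similarity in [0, 1] from the standard library (stand-in measure)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def correct_factor(factor, entities, min_similarity=0.6):
    """Return the entity most similar to `factor`.

    If no entity reaches `min_similarity` (a hypothetical safeguard not
    mentioned in the source), the factor is returned unchanged.
    """
    if not entities:
        return factor
    best = max(entities, key=lambda e: ratio(factor, e))
    return best if ratio(factor, best) >= min_similarity else factor


def correct_text(text, factor, entities):
    """Replace the recognized factor in the OCR text with its corrected form."""
    return text.replace(factor, correct_factor(factor, entities))


entities = ["999 Ganmaoling", "Amoxicillin capsules", "Insulin"]
corrected = correct_text("Prescribed: 99 Ganmaoling, twice daily",
                         "99 Ganmaoling", entities)
```

The cutoff keeps terms that resemble nothing in the knowledge graph from being overwritten by an unrelated entity.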
Step 108, analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of the claim terms and claim schemes applicable to the corrected data to be claimed.
After the corrected claim factors and the corrected data to be claimed are obtained, the claim scene corresponding to the data to be claimed can be determined from the claim scene data set according to the data to be claimed, and the analysis result of the corrected data to be claimed in that claim scene can then be determined. For example, if the corrected claim factor is a stock, the claim terms related to long-term assets are found in the claim scene data set and used to analyze the corrected data to be claimed; if the corrected claim factor is breast cancer, the claim terms related to breast cancer are found in the claim scene data set and used to analyze the corrected data to be claimed.
In a specific embodiment, determining, from the claim scene data set, the claim scene corresponding to the data to be claimed includes: judging the correlation among the plurality of claim factors; judging the insurance fraud probability of the data to be claimed according to the correlation; if the insurance fraud probability is not less than a preset probability, sending an investigation instruction to a terminal; and if the insurance fraud probability is less than the preset probability, determining the claim scene corresponding to the data to be claimed from the claim scene data set according to the data to be claimed.
Judging the correlation among the plurality of claim factors comprises determining the correlation according to the keywords of the claim factors. Specifically, the correlation between claim factors refers to how closely they are related. It will be appreciated that the correlation between claim factors of the same category is higher than the correlation between claim factors of different categories. For example, the correlation between surgery and medicine is higher than that between medicine and stocks; the correlation between inflammation and amoxicillin capsules is higher than that between inflammation and insulin.
The lower the correlation among the claim factors, the higher the corresponding insurance fraud probability; the higher the correlation, the lower the probability. To judge the insurance fraud probability of the data to be claimed from the correlation among the claim factors, the claim factors are first classified; the correlations of claim factors between different categories are then determined; next, the average value A of these correlations is determined; and finally the insurance fraud probability B of the data to be claimed is determined from A. For example, suppose the corrected claim factors are car insurance, car crash, surgery, chemotherapy and Asteady. The claim factors are classified as insurance (car insurance, car crash) and medical (surgery, chemotherapy, Asteady). The correlations between categories are then determined: the correlations of car insurance with surgery, chemotherapy and Asteady are 0.5, 0 and 0, respectively, and the correlations of car crash with surgery, chemotherapy and Asteady are 0.4, 0 and 0, respectively. The average value is A = (0.5 + 0 + 0 + 0.4 + 0 + 0) / 6 = 0.15, and according to B = 1 - A the insurance fraud probability is B = 1 - 0.15 = 0.85. If the preset probability is 0.5, then since 0.85 is greater than 0.5 the user's insurance fraud probability is too high, and a request to investigate the authenticity of the user's claim materials is sent to the terminal.
In this embodiment, the user's insurance fraud probability can be determined by analyzing the correlation between the claim factors, which not only improves the accuracy of data processing but also avoids property loss caused by insurance fraud and the like.
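The calculation B = 1 - A can be reproduced with a short Python sketch. The function name, the dictionary layout and the rule that pairs absent from the `correlation` table default to 0 are assumptions; the category contents and the preset probability of 0.5 follow the example in the text.

```python
from itertools import product


def fraud_probability(categories, correlation):
    """B = 1 - A, where A is the mean correlation over all cross-category pairs.

    `categories` maps a category name to its claim factors; `correlation`
    maps (factor, factor) pairs to a value in [0, 1]. Pairs missing from
    the table are treated as 0 (an assumption).
    """
    names = list(categories)
    values = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            for a, b in product(categories[names[i]], categories[names[j]]):
                values.append(correlation.get((a, b), 0.0))
    if not values:
        raise ValueError("need claim factors in at least two categories")
    return 1.0 - sum(values) / len(values)


categories = {
    "insurance": ["car insurance", "car crash"],
    "medical": ["surgery", "chemotherapy", "Asteady"],
}
correlation = {("car insurance", "surgery"): 0.5, ("car crash", "surgery"): 0.4}

b = fraud_probability(categories, correlation)
preset_probability = 0.5
needs_investigation = b >= preset_probability  # "not less than" the preset value
```

With six cross-category pairs and a correlation sum of 0.9, A is 0.15 and the claim is flagged for investigation.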
In a specific embodiment, the analyzing the corrected data to be claimed according to the corrected claim factor includes: carrying out dislocation correction on the corrected data to be claimed by utilizing the space and semantic relation among the characters to obtain target data to be claimed; importing the target claim settlement data into a preset data template to obtain structured data to be claimed; and analyzing the structured data to be claimed according to the corrected claim settlement factors to obtain an analysis result.
Structured data is, in short, data held in a database. The data to be claimed obtained after dislocation correction is semi-structured data. The structured data to be claimed is obtained by importing the target data to be claimed into a preset data template; specifically, the target data to be claimed can be stored in a database to complete the data.
After the claim image is recognized as text, problems such as misplaced or incomplete characters may exist. In this embodiment, after the text recognition result is obtained, each misplaced or missing part of the result is located according to the spacing and the semantic relations between characters, and the text is then moved or supplemented to the correct position, thereby correcting the data to be claimed. For example, if the character recognition result is 'south fish pond on the side of willow', the spatial and semantic relations show that 'south' is misplaced; moving it to its correct position gives the corrected result 'fish pond on the south side of willow'. Correcting the recognition result in this way avoids unclear semantics caused by misplaced characters and improves the accuracy of character recognition.
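As a rough sketch of importing text into a preset data template and storing the result in a database, the following Python fragment fills a fixed set of fields and writes them to an in-memory SQLite table. Everything specific here is an assumption made for illustration: the template fields, the "Label: value" line format, the table name `claims`, and the helper names `fill_template` and `store` do not come from the application.

```python
import re
import sqlite3

# Hypothetical preset template: field name -> label expected in the text.
TEMPLATE = {
    "name": "Name",
    "age": "Age",
    "insurance_category": "Insurance category",
}


def fill_template(text, template=TEMPLATE):
    """Extract one value per template field; missing fields become None."""
    record = {}
    for field, label in template.items():
        # '.' does not match a newline, so the value stops at the line end.
        match = re.search(rf"{re.escape(label)}\s*:\s*(.+)", text)
        record[field] = match.group(1).strip() if match else None
    return record


def store(record, conn):
    """Persist the structured record in a table named 'claims'."""
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    conn.execute(f"CREATE TABLE IF NOT EXISTS claims ({columns})")
    conn.execute(
        f"INSERT INTO claims ({columns}) VALUES ({placeholders})",
        tuple(record.values()),
    )


text = "Name: Zhang San\nAge: 42\nInsurance category: accident insurance"
record = fill_template(text)
conn = sqlite3.connect(":memory:")
store(record, conn)
rows = conn.execute(
    "SELECT name, age, insurance_category FROM claims"
).fetchall()
```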
In a specific embodiment, as shown in fig. 3, the analyzing the structured data to be claimed according to the corrected claim factor to obtain an analysis result includes:
Step 1022, searching, based on a deep learning model, for a data application scene corresponding to the structured data to be claimed according to the corrected claim factors.
Wherein, the deep learning model can be a natural language processing model.
In a specific embodiment, searching for a data application scenario corresponding to the structured data to be claimed according to the corrected claim factor includes: acquiring a claim settlement scene data set; the claim settlement scene comprises claim settlement terms applicable to the data to be claimed; and determining a claim settlement scene corresponding to the structured data to be claimed from the claim settlement scene data set according to the structured data to be claimed based on a deep learning model.
The claim scene data set is a collection of claim scene data and contains the different claim terms of different companies. For the different insurance products of different companies, the corresponding claim terms can be looked up in the claim scene data set.
The structured data to be claimed and the claim scene data set are used as the input of a natural language processing model. The model identifies the claim factors of the structured data to be claimed and, according to those claim factors, determines the claim scene corresponding to the data to be claimed from the claim scene data set.
In this embodiment, by identifying the claim factors of the structured data to be claimed, the claim scene that fits the structured data to be claimed can be accurately identified according to those factors. Specifically, the claim factors further include a basic information factor, and the basic information includes at least name, age, gender, insurance category, and insurance unit. In practice, the user's insurance category and insurance unit are first determined from the user's basic information, and the claim factors of the data to be claimed are then identified; a first claim scene data subset matching the insurance unit is searched for in the claim scene data set, and a second claim scene data subset matching the insurance category is searched for in the first subset; finally, the claim scene matching the data to be claimed is determined from the second subset based on the claim factors. For example, if the insurance category in the user's basic information is accident insurance and the insurance unit is insurance company A, the claim scene data subset of company A is determined from the claim scene data set, the accident insurance subset is then found within the subset of company A, and finally the applicable claim scene is determined from the accident insurance subset based on the claim factors.
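The two-stage narrowing described above (insurance unit first, then insurance category, then claim factors) can be sketched as follows. This is a simplified stand-in: the application uses a deep learning (natural language processing) model for the final match, whereas this sketch merely counts overlapping factors. The `ClaimScene` class, the `find_scene` function, and the sample scenes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimScene:
    unit: str            # insurance unit (company)
    category: str        # insurance category
    name: str            # claim scene name
    factors: frozenset   # claim factors typical of this scene


def find_scene(scenes, unit, category, claim_factors):
    """Filter by unit, then by category, then pick the best factor overlap."""
    first_subset = [s for s in scenes if s.unit == unit]
    second_subset = [s for s in first_subset if s.category == category]
    if not second_subset:
        return None
    factors = set(claim_factors)
    # Stand-in for the model-based match: largest number of shared factors.
    return max(second_subset, key=lambda s: len(s.factors & factors))


scenes = [
    ClaimScene("company A", "accident insurance", "accidental disability",
               frozenset({"car crash", "disability"})),
    ClaimScene("company A", "accident insurance", "accidental medical",
               frozenset({"surgery", "hospitalization"})),
    ClaimScene("company A", "health insurance", "critical illness",
               frozenset({"chemotherapy"})),
    ClaimScene("company B", "accident insurance", "accidental disability",
               frozenset({"car crash", "disability"})),
]

scene = find_scene(scenes, "company A", "accident insurance",
                   ["car crash", "disability"])
```

Filtering before matching keeps the final comparison limited to scenes that can actually apply to the user's policy.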
Step 1024, obtaining, based on a logic algorithm model, an analysis result of the structured data to be claimed in the data application scene.
In a specific embodiment, the logic algorithm model is a function calculation model, and obtaining the analysis result of the structured data to be claimed in the claim scene based on the logic algorithm model includes: acquiring a claim data table corresponding to the claim scene, the claim data table containing the claim rules corresponding to the claim terms; and obtaining, according to the claim rules, the analysis result of the structured data to be claimed in the claim scene through the logic algorithm model.
The claim data table stores the correspondence between claim scenes and claim rules, together with the specific content of the claim rules. For example, if the claim scene is "accidental disability", the claim rule stored in the claim data table reads: "if, during the insurance period, the insured unfortunately suffers an accident that results in accidental disability, we pay a lump-sum accidental disability benefit of 50,000 to 500,000". After the claim rule is determined, the specific claim scheme and claim amount are determined through the logic algorithm model according to the disability level or the circumstances of the accident.
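A claim data table lookup followed by a function-style calculation might look like the sketch below. The rule's range is read as 50,000 to 500,000; the ten disability levels, the linear payout schedule (level 1 receiving the maximum), and the names `CLAIM_DATA_TABLE` and `claim_amount` are assumptions made purely for illustration and are not stated in the application.

```python
# Hypothetical claim data table: claim scene -> claim rule parameters.
CLAIM_DATA_TABLE = {
    "accidental disability": {
        "min_payout": 50_000,
        "max_payout": 500_000,
        "levels": 10,  # assumed number of disability levels
    },
}


def claim_amount(scene, disability_level):
    """Map a disability level to an amount within the rule's range.

    Assumption: level 1 is the most severe and receives the maximum;
    amounts decrease linearly down to the minimum at the last level.
    """
    rule = CLAIM_DATA_TABLE[scene]
    levels = rule["levels"]
    if not 1 <= disability_level <= levels:
        raise ValueError("disability level out of range")
    step = (rule["max_payout"] - rule["min_payout"]) / (levels - 1)
    return rule["max_payout"] - (disability_level - 1) * step


highest = claim_amount("accidental disability", 1)
lowest = claim_amount("accidental disability", 10)
```

A real schedule would come from the claim terms themselves; the point of the sketch is only that the rule is looked up by scene and then evaluated as a function of the claim data.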
Step 110, generating a claim risk report according to the analysis result, wherein the claim risk report includes at least one of a claim amount, the claim terms, and the claim scheme.
After the analysis result is obtained, it can be imported into a preset claim risk report. Specifically, the claim terms, claim scheme, and claim amount obtained by the analysis, together with the basic information and claim information provided by the user, are imported. The risk report visualizes the claim analysis so that it can be traced back and reviewed.
The application relates to a data processing method, comprising: acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factors include at least one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor; determining, based on a pre-established knowledge graph, the similarity between the claim factors and each entity in the knowledge graph; performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed includes the corrected claim factors; analyzing the corrected data to be claimed according to the corrected claim factors to obtain an analysis result; the analysis result includes at least one of a claim term and a claim scheme applicable to the corrected data to be claimed; generating a claim risk report based on the analysis result, the claim risk report including at least one of a claim amount, the claim terms, and the claim scheme. By establishing the knowledge graph, the application can store a large number of professional terms from various industries; after the data to be claimed is obtained, its terms are corrected according to the knowledge graph, which improves the accuracy of the data to be claimed. Once accurate data to be claimed is available, it can be analyzed to obtain the claim terms that match it and the claim amount under those terms. After the analysis result is obtained, it can be imported into a preset claim risk report.
The claim analysis is thereby visualized, so that a user can view the claim terms, claim scheme, claim amount and other analysis results completely and accurately, meeting requirements such as traceability and review.
As shown in fig. 4, the present invention provides a data processing apparatus, the apparatus including:
an obtaining module 402, configured to obtain data to be claimed, and determine a claim factor corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor;
a confirmation module 404, configured to determine, based on a pre-established knowledge graph, a similarity between the claim settlement factor and each entity in the knowledge graph;
a correcting module 406, configured to correct the content of the claim factor according to the similarity, so as to obtain a corrected claim factor and corrected data to be claimed, where the corrected data to be claimed includes the corrected claim factor;
the analysis module 408 is configured to analyze the corrected data to be claimed according to the corrected claim settlement factor to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed;
a report generating module 410, configured to generate a claim risk report according to the analysis result, where the claim risk report includes at least one of a claim amount, a claim term, and a claim solution.
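The work of the confirmation module and the correcting module can be illustrated with a small sketch. The similarity measure is not specified in this part of the description, so the sketch substitutes the string similarity ratio of Python's standard `difflib.SequenceMatcher`; the entity list, the threshold of 0.8, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical entities taken from a pre-established knowledge graph.
KNOWLEDGE_GRAPH_ENTITIES = ["chemotherapy", "surgery", "car crash", "fracture"]


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for the graph-based measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def correct_factor(factor, entities=KNOWLEDGE_GRAPH_ENTITIES, threshold=0.8):
    """Replace a claim factor by its most similar entity if similar enough."""
    best = max(entities, key=lambda e: similarity(factor, e))
    # Below the threshold the factor is left unchanged rather than guessed.
    return best if similarity(factor, best) >= threshold else factor


def correct_data(claim_factors):
    """Apply content correction to every claim factor."""
    return [correct_factor(f) for f in claim_factors]


corrected = correct_data(["chemotherepy", "surgry", "xyz"])
```

Leaving low-similarity factors untouched avoids replacing a correctly recognized but unknown term with an unrelated entity.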
FIG. 5 shows the internal structure of a computer device in one embodiment. The computer device may be a data processing apparatus, or a terminal or server connected to a data processing apparatus. As shown in fig. 5, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement a data processing method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform a data processing method. The network interface is used for communicating with an external device. Those skilled in the art will appreciate that the architecture shown in fig. 5 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, the data processing method provided by the present application may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 5. The memory of the computer device can store the program modules that make up the data processing apparatus, for example, the obtaining module 402, the confirmation module 404, the correcting module 406, the analysis module 408, and the report generating module 410.
A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of: acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor; determining similarity between the claim settlement factor and each entity in the knowledge graph based on a pre-established knowledge graph; performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed comprises the corrected claim factors; analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed; generating a claim risk report based on the analysis result, the claim risk report including at least one of a claim amount, the claim terms, and the claim plan.
In one embodiment, the acquiring data to be claimed includes: acquiring a claim settlement image input by a user; and carrying out optical character recognition on the claim image to obtain a text recognition result, and taking the text recognition result as data to be claimed.
In one embodiment, the analyzing the corrected data to be claimed according to the corrected claim factor includes: carrying out dislocation correction on the corrected data to be claimed by utilizing the space and semantic relation among the characters to obtain target data to be claimed; importing the target claim settlement data into a preset data template to obtain structured data to be claimed; and analyzing the structured data to be claimed according to the corrected claim settlement factors to obtain an analysis result.
In one embodiment, the analyzing the structured data to be claimed according to the corrected claim factor to obtain an analysis result includes: based on a deep learning model, searching a data application scene corresponding to the structured data to be claimed according to the corrected claim factors; and obtaining an analysis result of the structured claim data in the data application scene based on a logic algorithm model.
In one embodiment, the performing optical character recognition on the claim image to obtain a text recognition result includes: acquiring the gray value of each pixel point of the claim settlement image; determining the definition value of the claim settlement image according to the gray value of each pixel point; if the definition value is larger than a preset definition threshold value, carrying out optical character recognition on the claim settlement image to obtain a text recognition result; and if the definition value is smaller than a preset definition threshold value, returning the claim settlement image.
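The sharpness gate described above can be sketched in plain Python. How the definition (sharpness) value is derived from the gray values is not given in this passage, so the sketch assumes a simple gradient-energy measure: the mean squared difference between horizontally adjacent pixels. The function names and the return labels are hypothetical.

```python
def sharpness(gray):
    """Mean squared difference between horizontally adjacent gray values.

    `gray` is a list of rows, each row a list of gray values (0-255).
    This measure is an assumption; blurred images give smaller values.
    """
    total = 0
    count = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            total += (right - left) ** 2
            count += 1
    return total / count if count else 0.0


def gate(gray, threshold):
    """Decide whether to run OCR or return the image to the user.

    The text only covers 'larger' and 'smaller'; treating a value equal
    to the threshold as 'return the image' is an assumption.
    """
    return "recognize" if sharpness(gray) > threshold else "return_image"


sharp_image = [[0, 255, 0, 255], [255, 0, 255, 0]]
blurry_image = [[100, 110, 120, 130], [100, 110, 120, 130]]
decision_sharp = gate(sharp_image, threshold=1000)
decision_blurry = gate(blurry_image, threshold=1000)
```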
In one embodiment, the performing optical character recognition on the claim image to obtain a text recognition result includes: acquiring an inclination angle of characters in the claim image relative to the horizontal direction; rotating the claim image by an angle equal to the inclination angle to enable characters in the claim image to be horizontal relative to the horizontal direction, so as to obtain a rotated claim image; and carrying out optical character recognition on the rotated claim image to obtain a text recognition result.
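The geometry of the tilt correction can be sketched with two points on a text baseline: the inclination angle is measured against the horizontal, and rotating by the same angle in the opposite direction levels the text. The helper names and the sample baseline are hypothetical, and a real implementation would rotate the image pixels rather than two points.

```python
import math


def tilt_angle(p0, p1):
    """Inclination (radians) of the line p0 -> p1 relative to horizontal."""
    return math.atan2(p1[1] - p0[1], p1[0] - p0[0])


def rotate_point(p, angle, center=(0.0, 0.0)):
    """Rotate point p about `center` by `angle` radians."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    c, s = math.cos(angle), math.sin(angle)
    return (center[0] + c * dx - s * dy, center[1] + s * dx + c * dy)


# Hypothetical text baseline tilted by 15 degrees.
baseline = [(0.0, 0.0), (10.0, 10.0 * math.tan(math.radians(15)))]
angle = tilt_angle(*baseline)
# Rotate by the same magnitude in the opposite direction to level the text.
leveled = [rotate_point(p, -angle) for p in baseline]
```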
In one embodiment, before performing optical character recognition on the rotated claim image to obtain a text recognition result, the method includes: acquiring a chromatic value of the rotated claim image; if the chromatic value is larger than a first preset chromatic threshold value, reducing the chromatic value to the first preset chromatic threshold value; and if the chromatic value is lower than a second preset chromatic threshold, increasing the chromatic value to the second preset chromatic threshold.
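The two-threshold adjustment above amounts to clamping the chromatic value into a range, as in this short sketch. The function name and the sample threshold values are hypothetical; the sketch also assumes the second (lower) threshold does not exceed the first (upper) one, which the text implies but does not state.

```python
def adjust_chroma(value, first_threshold, second_threshold):
    """Clamp a chromatic value into [second_threshold, first_threshold]."""
    if second_threshold > first_threshold:
        raise ValueError("second threshold must not exceed first threshold")
    if value > first_threshold:
        return first_threshold   # too high: reduce to the first threshold
    if value < second_threshold:
        return second_threshold  # too low: raise to the second threshold
    return value                 # already within range: unchanged


too_high = adjust_chroma(240, first_threshold=200, second_threshold=60)
too_low = adjust_chroma(30, first_threshold=200, second_threshold=60)
in_range = adjust_chroma(120, first_threshold=200, second_threshold=60)
```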
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of: acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor; determining similarity between the claim settlement factor and each entity in the knowledge graph based on a pre-established knowledge graph; performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed comprises the corrected claim factors; analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed; generating a claim risk report based on the analysis result, the claim risk report including at least one of a claim amount, the claim terms, and the claim plan.
In one embodiment, the acquiring data to be claimed includes: acquiring a claim settlement image input by a user; and carrying out optical character recognition on the claim image to obtain a text recognition result, and taking the text recognition result as data to be claimed.
In one embodiment, the analyzing the corrected data to be claimed according to the corrected claim factor includes: carrying out dislocation correction on the corrected data to be claimed by utilizing the space and semantic relation among the characters to obtain target data to be claimed; importing the target claim settlement data into a preset data template to obtain structured data to be claimed; and analyzing the structured data to be claimed according to the corrected claim settlement factors to obtain an analysis result.
In one embodiment, the analyzing the structured data to be claimed according to the corrected claim factor to obtain an analysis result includes: based on a deep learning model, searching a data application scene corresponding to the structured data to be claimed according to the corrected claim factors; and obtaining an analysis result of the structured claim data in the data application scene based on a logic algorithm model.
In one embodiment, the performing optical character recognition on the claim image to obtain a text recognition result includes: acquiring the gray value of each pixel point of the claim settlement image; determining the definition value of the claim settlement image according to the gray value of each pixel point; if the definition value is larger than a preset definition threshold value, carrying out optical character recognition on the claim settlement image to obtain a text recognition result; and if the definition value is smaller than a preset definition threshold value, returning the claim settlement image.
In one embodiment, the performing optical character recognition on the claim image to obtain a text recognition result includes: acquiring an inclination angle of characters in the claim image relative to the horizontal direction; rotating the claim image by an angle equal to the inclination angle to enable characters in the claim image to be horizontal relative to the horizontal direction, so as to obtain a rotated claim image; and carrying out optical character recognition on the rotated claim image to obtain a text recognition result.
In one embodiment, before performing optical character recognition on the rotated claim image to obtain a text recognition result, the method includes: acquiring a chromatic value of the rotated claim image; if the chromatic value is larger than a first preset chromatic threshold value, reducing the chromatic value to the first preset chromatic threshold value; and if the chromatic value is lower than a second preset chromatic threshold, increasing the chromatic value to the second preset chromatic threshold.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples show only some embodiments of the present application; their description is specific and detailed but is not to be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method of data processing, the method comprising:
acquiring data to be claimed, and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor;
determining similarity between the claim settlement factor and each entity in the knowledge graph based on a pre-established knowledge graph;
performing content correction on the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed; the corrected data to be claimed comprises the corrected claim factors;
analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed;
generating a claim risk report based on the analysis result, the claim risk report including at least one of a claim amount, the claim terms, and the claim plan.
2. The method of claim 1, wherein the obtaining data to be claimed comprises:
acquiring a claim settlement image input by a user;
and carrying out optical character recognition on the claim image to obtain a text recognition result, and taking the text recognition result as data to be claimed.
3. The method according to claim 2, wherein the analyzing the corrected data to be claimed according to the corrected claim factors comprises:
carrying out dislocation correction on the corrected data to be claimed by utilizing the space and semantic relation among the characters to obtain target data to be claimed;
importing the target claim settlement data into a preset data template to obtain structured data to be claimed;
and analyzing the structured data to be claimed according to the corrected claim settlement factors to obtain an analysis result.
4. The method according to claim 3, wherein the analyzing the structured data to be claimed according to the corrected claim factor to obtain an analysis result comprises:
based on a deep learning model, searching a data application scene corresponding to the structured data to be claimed according to the corrected claim factors;
and obtaining an analysis result of the structured claim data in the data application scene based on a logic algorithm model.
5. The method of claim 2, wherein the performing optical character recognition on the claim image to obtain a text recognition result comprises:
acquiring the gray value of each pixel point of the claim settlement image;
determining the definition value of the claim settlement image according to the gray value of each pixel point;
if the definition value is larger than a preset definition threshold value, carrying out optical character recognition on the claim settlement image to obtain a text recognition result;
and if the definition value is smaller than a preset definition threshold value, returning the claim settlement image.
6. The method of claim 5, wherein the performing optical character recognition on the claim image to obtain a text recognition result comprises:
acquiring an inclination angle of characters in the claim image relative to the horizontal direction;
rotating the claim image by an angle equal to the inclination angle to enable characters in the claim image to be horizontal relative to the horizontal direction, so as to obtain a rotated claim image;
and carrying out optical character recognition on the rotated claim image to obtain a text recognition result.
7. The method of claim 6, wherein before performing optical character recognition on the rotated claim image to obtain a text recognition result, the method comprises:
acquiring a chromatic value of the rotated claim image;
if the chromatic value is larger than a first preset chromatic threshold value, reducing the chromatic value to the first preset chromatic threshold value;
and if the chromatic value is lower than a second preset chromatic threshold, increasing the chromatic value to the second preset chromatic threshold.
8. A data processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring data to be claimed and determining claim factors corresponding to the data to be claimed; the claim factor at least comprises one of an accident factor, a medical factor, a physiological factor, a psychological factor and an asset factor;
the confirmation module is used for determining the similarity between the claim settlement factor and each entity in the knowledge graph based on the knowledge graph established in advance;
the correction module is used for correcting the content of the claim factors according to the similarity to obtain corrected claim factors and corrected data to be claimed, and the corrected data to be claimed comprises the corrected claim factors;
the analysis module is used for analyzing the corrected data to be claimed according to the corrected claim settlement factors to obtain an analysis result; the analysis result at least comprises one of a claim term and a claim scheme applicable to the corrected data to be claimed;
and the report generation module is used for generating a claim settlement risk report according to the analysis result, wherein the claim settlement risk report at least comprises one of a claim amount, a claim term and a claim plan.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202111423691.2A 2021-11-26 2021-11-26 Data processing method and device, computer equipment and storage medium Pending CN114170029A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111423691.2A CN114170029A (en) 2021-11-26 2021-11-26 Data processing method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111423691.2A CN114170029A (en) 2021-11-26 2021-11-26 Data processing method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114170029A true CN114170029A (en) 2022-03-11

Family

ID=80481341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111423691.2A Pending CN114170029A (en) 2021-11-26 2021-11-26 Data processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114170029A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237126A (en) * 2023-09-18 2023-12-15 广州美保科技有限公司 Insurance platform and insurance data processing method

Similar Documents

Publication Publication Date Title
US20230021040A1 (en) Methods and systems for automated table detection within documents
US7475061B2 (en) Image-based document indexing and retrieval
US10489644B2 (en) System and method for automatic detection and verification of optical character recognition data
US8625886B2 (en) Finding repeated structure for data extraction from document images
CA2922512A1 (en) Method of classifying medical documents
US11869259B2 (en) Text line image splitting with different font sizes
US20210012426A1 (en) Methods and systems for anamoly detection in dental insurance claim submissions
CN111444795A (en) Bill data identification method, electronic device, storage medium and device
CN114913942A (en) Intelligent matching method and device for patient recruitment projects
CN110866457A (en) Electronic insurance policy obtaining method and device, computer equipment and storage medium
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN114170029A (en) Data processing method and device, computer equipment and storage medium
CN114170600A (en) Data processing method and device, computer equipment and storage medium
CN111753723B (en) Fingerprint identification method and device based on density calibration
US11715310B1 (en) Using neural network models to classify image objects
US11335108B2 (en) System and method to recognise characters from an image
CN113807256A (en) Bill data processing method and device, electronic equipment and storage medium
CN113705560A (en) Data extraction method, device and equipment based on image recognition and storage medium
US20230053464A1 (en) Systems, Methods, and Devices for Automatically Converting Explanation of Benefits (EOB) Printable Documents into Electronic Format using Artificial Intelligence Techniques
CN115761745A (en) Bill data identification method and device, electronic equipment and storage medium
CN118093527B (en) Report quality inspection method and device and electronic equipment
CN113784009B (en) Paper text image processing method and device and electronic equipment
CN118377852B (en) Data processing method and system based on multi-mode large language model
CN117831052A (en) Identification method and device for financial form, electronic equipment and storage medium
CN113053495A (en) Evaluation method and device for image information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination