CN117891926A

CN117891926A - Text feature fraud early warning system based on artificial intelligence

Info

Publication number: CN117891926A
Application number: CN202410294859.1A
Authority: CN
Inventors: 张卫平; 李显阔; 邵胜博; 王晶; 丁洋
Original assignee: Global Digital Group Co Ltd
Current assignee: Global Digital Group Co Ltd
Priority date: 2024-03-15
Filing date: 2024-03-15
Publication date: 2024-04-16
Anticipated expiration: 2044-03-15
Also published as: CN117891926B

Abstract

The invention provides a text feature fraud early warning system based on artificial intelligence, which relates to the field of electric digital data processing and comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module, wherein the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information from the cases and analyzing it to obtain text features, the information interception module is used for acquiring chat information received by a terminal, and the fraud early warning module is used for performing fraud identification and early warning on the chat information. Based on a large number of actual cases, the system intelligently analyzes the text information to obtain feature information and performs fraud identification on the chat content on the terminal according to that feature information, so that effective early warning can be given for fraudulent information.

Description

Text feature fraud early warning system based on artificial intelligence

Technical Field

The invention relates to the field of electric digital data processing, in particular to a text feature fraud early warning system based on artificial intelligence.

Background

Currently, many software applications offer social functions, which expand users' social circles but also increase the risk of fraud. Elderly users in particular can easily be led to trust a fraudster through chat and then be defrauded. A system is therefore needed that identifies fraud in the text information of chat content and gives timely early warning, so that the user is alerted and the likelihood of loss is reduced.

The foregoing discussion of the background art is intended only to facilitate an understanding of the present invention. This discussion is not an acknowledgement or admission that any of the material referred to was common general knowledge.

Many fraud early warning systems have been developed. Extensive searching and reference show that existing fraud early warning systems include the system disclosed in publication number CN114641004B. Such systems generally include a data acquisition module, a data analysis module and a mobile terminal feedback module; the data acquisition module acquires text data and/or voice data transmitted through a network or telecommunications in a time period and sends them to the data analysis module; the data analysis module analyzes the text data and/or voice data collected in the time period according to the stripes and judges whether they are fraudulent or used for fraud; if so, the data analysis module compares them with the existing fraud modes, and if the fraud mode does not exist, the data analysis module alerts the mobile terminal feedback module connected to it that the fraud mode does not exist. However, such a system targets the text content itself when analyzing the data, so new fraud content is not easily identified.

Disclosure of Invention

The invention aims to provide a text feature fraud early warning system based on artificial intelligence that addresses the above deficiencies.

The invention adopts the following technical scheme:

a text feature fraud early warning system based on artificial intelligence comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;

the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;

the case acquisition module comprises an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting private information in the case data, the case classification unit is used for classifying cases by fraud type, and the data storage unit is used for storing the case data according to the classification results;

the text analysis module comprises a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained through analysis;

the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;

the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified;

further, the feature analysis unit comprises a semantic analysis processor, a structure analysis processor and a feature generation processor, wherein the semantic analysis processor is used for reading and understanding text information, the structure analysis processor structurally codes each paragraph based on semantic information, and the feature generation processor analyzes the structural code strings to generate feature information;

further, the parsing process of the structure analysis processor for each segment of the semantic code sequence comprises the following steps:

S21, obtaining the type of each semantic code in the sequence to form a basic vector;

where Tp_i represents the type value of the i-th semantic code, and n is the number of semantic codes in the sequence;

S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary unusual vector according to the positions of the common codes and the unusual codes in the sequence;

S23, calculating a structural coding value according to the following formula:

；

S24, assigning the corresponding structural code according to the range in which the structural coding value falls;

further, the process by which the feature generation processor generates the feature information comprises the following steps:

S31, acquiring the structural code string information of all cases of the same type;

S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, and sorting the counts from high to low to obtain a sequence {N(i)};

S33, retaining the statistical items whose sequence values are larger than the effective threshold, and calculating an evaluation value V for the structural code string of each case according to the following formula:

；

where m is the number of retained statistical items, K_i is the weight coefficient of the i-th statistical item, and E(i) indicates whether the i-th statistical item occurs in the structural code string: E(i) = 1 if it occurs and E(i) = 0 if it does not;

S34, calculating the standard deviation of the evaluation values V of all cases;

S35, adjusting the weight coefficients {K_i} and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;

S36, sending the weight coefficients {K_i}, the early warning value and the statistical item information to the feature storage unit as feature information;
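As an illustration only, the counting and evaluation in steps S32 to S34 can be sketched in Python as follows. The function names and the sample code strings are hypothetical. The evaluation is assumed here to be the weighted sum of K_i · E(i) over the retained items, which is one reading of the variable definitions above and is not confirmed by the text. The use of the population standard deviation is likewise an assumption, and the weight adjustment of step S35 is not sketched because no adjustment rule is stated.

```python
from collections import Counter
from statistics import pstdev


def count_adjacent_pairs(code_strings):
    """S32: count every pair of adjacent structural codes, sorted high to low."""
    counts = Counter()
    for codes in code_strings:
        counts.update(zip(codes, codes[1:]))
    return counts.most_common()


def retain_items(sorted_counts, effective_threshold):
    """S33 (first half): keep items whose count exceeds the effective threshold."""
    return [pair for pair, n in sorted_counts if n > effective_threshold]


def evaluate(codes, items, weights):
    """S33 (second half), ASSUMED form: V = sum of K_i * E(i) over retained items."""
    present = set(zip(codes, codes[1:]))
    return sum(k for item, k in zip(items, weights) if item in present)


# Hypothetical structural code strings for three cases of one fraud type.
cases = [["A", "B", "C", "A", "B"], ["A", "B", "D"], ["B", "C", "A", "B"]]

items = retain_items(count_adjacent_pairs(cases), effective_threshold=1)
weights = [1.0] * len(items)          # all weights start at 1
values = [evaluate(c, items, weights) for c in cases]
spread = pstdev(values)               # S34: standard deviation over all cases
warning_value = min(values)           # candidate early warning value
```

With these sample strings the retained items are the pairs (A, B), (B, C) and (C, A); a real implementation would loop over weight adjustments until `spread` falls below the stability threshold.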

further, the intelligent fraud identification unit comprises a code conversion processor, an identification control processor and a fraud calculation processor, wherein the code conversion processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the identification control processor is used for acquiring all types of feature information from the feature storage unit and sending them in sequence to the fraud calculation processor, and the fraud calculation processor is used for performing calculation on the structural code string information based on the feature information and feeding the result back to the identification control processor.

The beneficial effects obtained by the invention are as follows:

the system classifies existing fraud cases and intelligently analyzes their text content to obtain structural feature information; compared with text feature information, structural feature information makes it easier to identify new fraud content; the terminal user is warned by displaying the content of the corresponding cases, which reduces the likelihood that the user suffers a loss through fraud.

For a further understanding of the nature and the technical aspects of the present invention, reference should be made to the following detailed description of the invention and the accompanying drawings, which are provided for purposes of reference only and are not intended to limit the invention.

Drawings

FIG. 1 is a schematic diagram of the overall structural framework of the present invention;

FIG. 2 is a schematic diagram of the case acquisition module of the present invention;

FIG. 3 is a schematic diagram of a text parsing module according to the present invention;

FIG. 4 is a schematic diagram of a feature analysis unit according to the present invention;

FIG. 5 is a schematic diagram showing the construction of the intelligent fraud recognition unit.

Detailed Description

The following embodiments of the present invention are described in terms of specific examples, and those skilled in the art will appreciate the advantages and effects of the present invention from the disclosure herein. The invention is capable of other and different embodiments and its several details are capable of modification and variation in various respects, all without departing from the spirit of the present invention. The drawings of the present invention are merely schematic illustrations, and are not intended to be drawn to actual dimensions. The following embodiments will further illustrate the related art content of the present invention in detail, but the disclosure is not intended to limit the scope of the present invention.

Embodiment one: this embodiment provides a text feature fraud early warning system based on artificial intelligence which, with reference to FIG. 1, comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;

the feature analysis unit comprises a semantic analysis processor, a structure analysis processor and a feature generation processor, wherein the semantic analysis processor is used for reading and understanding text information, the structure analysis processor structurally codes each paragraph based on semantic information, and the feature generation processor analyzes the structural code strings to generate feature information;

the parsing process of the structure analysis processor for each segment of the semantic code sequence comprises the following steps:

S23, calculating a structural coding value according to the following formula:

；

the process of generating feature information by the feature generation processor comprises the following steps:

S31, acquiring the structural code string information of all cases of the same type;

；

S34, calculating the standard deviation of the evaluation values V of all cases;

the intelligent fraud identification unit comprises a code conversion processor, an identification control processor and a fraud calculation processor, wherein the code conversion processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the identification control processor is used for acquiring all types of feature information from the feature storage unit and sending them in sequence to the fraud calculation processor, and the fraud calculation processor is used for performing calculation on the structural code string information based on the feature information and feeding the result back to the identification control processor.

Embodiment two: this embodiment includes the entire content of Embodiment one and provides a text feature fraud early warning system based on artificial intelligence, which comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;

referring to FIG. 2, the case acquisition module includes an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting private information in the case data, the case classification unit is used for classifying cases by fraud type, and the data storage unit is used for storing the case data according to the classification results;

referring to FIG. 3, the text analysis module includes a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting the text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained by the analysis;

the privacy processing unit comprises an exclusive information detection processor and an information replacement processor, wherein the exclusive information detection processor is used for detecting exclusive information in the case information, and the information replacement processor is used for replacing the detected exclusive information with generic replacement words;

the exclusive information includes personal names, place names and the like;
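A minimal sketch of the detect-and-replace behaviour of the privacy processing unit is given below. It assumes the exclusive terms are already known and held in a lookup table; the table contents, the placeholder words and the function name `replace_exclusive_info` are hypothetical, and a practical detector would more likely rely on named-entity recognition than on a fixed list.

```python
import re

# Hypothetical table: detected exclusive term -> generic replacement word.
REPLACEMENTS = {
    "Zhang Wei": "[PERSON]",
    "Beijing": "[PLACE]",
}


def replace_exclusive_info(text, replacements):
    """Replace every listed exclusive term with its generic replacement word."""
    if not replacements:
        return text
    # Try longer terms first so a short term never splits a longer one.
    terms = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    return pattern.sub(lambda match: replacements[match.group(0)], text)


masked = replace_exclusive_info(
    "Zhang Wei asked me to wire money to Beijing.", REPLACEMENTS
)
print(masked)  # [PERSON] asked me to wire money to [PLACE].
```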

the case classification unit comprises a key identification processor and a classification coding processor, wherein the key identification processor is used for identifying key information in case information, and the classification coding processor is used for classifying and coding cases according to the identified key information;

the key information includes, but is not limited to, the fraud approach, the fraud content and the fraud means;

the fraud approach refers to the identity relationship established between the fraudster and the defrauded person, the fraud content refers to the chat content between the fraudster and the defrauded person, and the fraud means refers to the way in which the fraudster obtains the profit;

the text extraction unit comprises an identity locking processor and a text information register, wherein the identity locking processor is used for locking the identification information of the fraudster in each case, and the text information register is used for extracting and storing the text information sent under that identification information;

the text information stored in the text information register is partitioned according to the classification of the cases;

referring to FIG. 4, the feature analysis unit includes a semantic analysis processor, a structure analysis processor and a feature generation processor, wherein the semantic analysis processor is used for reading and understanding text information, the structure analysis processor structurally encodes each paragraph based on the semantic information, and the feature generation processor analyzes the structural code strings to generate feature information;

the semantic analysis processor and the structure analysis processor each process a single case;

the feature generation processor processes cases of the same type together;

the process of the semantic analysis processor for analyzing the text information comprises the following steps:

S1, identifying keywords in the text information;

S2, converting the keywords into semantic codes;

S3, segmenting the semantic codes based on coding integrity, wherein coding integrity means that each segment contains the necessary semantic code types;

the semantic information consists of a plurality of segments of semantic codes arranged in sequence;

in step S2, similar keywords are converted into the same semantic code;
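Steps S1 to S3 can be pictured with the following Python sketch. The keyword table, the code types, the set of required types and the greedy segmentation rule are all assumptions made for illustration; the text only requires that similar keywords share a code and that each segment contains the necessary code types.

```python
# Hypothetical keyword table: similar keywords share one semantic code (S2).
KEYWORD_TO_CODE = {
    "transfer": "PAY", "remit": "PAY",
    "urgent": "URG", "immediately": "URG",
    "account": "ACC",
}
# Hypothetical type of each semantic code and the types a segment must contain.
CODE_TYPE = {"PAY": "action", "URG": "pressure", "ACC": "object"}
REQUIRED_TYPES = {"action", "object"}


def to_semantic_codes(text):
    """S1 + S2: pick out known keywords in order and map them to codes."""
    codes = []
    for word in text.lower().split():
        token = word.strip(".,!?;:")
        if token in KEYWORD_TO_CODE:
            codes.append(KEYWORD_TO_CODE[token])
    return codes


def segment_codes(codes):
    """S3 (assumed greedy rule): close a segment once all required types appear.

    Returns the complete segments and any trailing codes that do not yet
    form a complete segment.
    """
    segments, current, seen = [], [], set()
    for code in codes:
        current.append(code)
        seen.add(CODE_TYPE[code])
        if REQUIRED_TYPES <= seen:
            segments.append(current)
            current, seen = [], set()
    return segments, current


codes = to_semantic_codes(
    "Please transfer to this account immediately, remit to the account"
)
segments, leftover = segment_codes(codes)
```

Here "transfer" and "remit" both become `PAY`, and the five codes split into two complete segments with nothing left over.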

S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary unusual vector according to the positions of the common codes and the unusual codes in the sequence; in the secondary common vector, the element corresponding to a common code is 1 and the element corresponding to an unusual code is 0; in the secondary unusual vector, the element corresponding to a common code is 0 and the element corresponding to an unusual code is 1;

S23, calculating a structural coding value according to the following formula:

；
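The vectors of steps S21 and S22 can be sketched as follows. The numeric type values, the set of common codes and the function name `build_vectors` are hypothetical; how commonness is actually decided is not specified, so the sketch simply takes the set of common codes as an input. It covers only S21 and S22 and does not attempt the structural coding value of S23.

```python
def build_vectors(codes, type_value, common_codes):
    """Build the basic vector and the two secondary vectors for one segment."""
    # S21: basic vector (Tp_1, ..., Tp_n) of type values, one per semantic code.
    basic = [type_value[code] for code in codes]
    # S22: 1 marks a common code in the secondary common vector ...
    common = [1 if code in common_codes else 0 for code in codes]
    # ... and the secondary unusual vector is its element-wise complement.
    unusual = [1 - flag for flag in common]
    return basic, common, unusual


# Hypothetical segment, type values and commonness classification.
segment = ["PAY", "ACC", "URG"]
type_value = {"PAY": 1, "ACC": 2, "URG": 3}
basic, common, unusual = build_vectors(segment, type_value, {"PAY", "ACC"})
print(basic, common, unusual)  # [1, 2, 3] [1, 1, 0] [0, 0, 1]
```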

S31, acquiring the structural code string information of all cases of the same type;

S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, sorting the counts from high to low to obtain a sequence {N(i)}, and at the same time establishing a lookup table of the statistical items, namely a mapping table between the serial number of each statistical item and its pair of adjacent structural codes;

；

the weight coefficients {K_i} are all 1 in the initial state;

S34, calculating the standard deviation of the evaluation values V of all cases;

the information management unit comprises a time control processor and an update detection processor, wherein the time control processor is used for recording how long each piece of chat information has gone without an update and deleting the corresponding chat information when that time exceeds a set value, and the update detection processor is used for sending the backup information to the fraud early warning module as an early warning detection target when an update of the chat information is detected;
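The retention behaviour of the information management unit might look like the sketch below. The class name `ChatBackupStore`, its methods and the explicit `now` parameter are hypothetical conveniences; the text specifies only that stale chat information is deleted after a set time and that updated chat information is forwarded for detection.

```python
import time


class ChatBackupStore:
    """Hypothetical backup store keyed by chat, with idle-time expiry."""

    def __init__(self, max_idle_seconds):
        self.max_idle_seconds = max_idle_seconds
        self._messages = {}     # chat_id -> list of backed-up messages
        self._last_update = {}  # chat_id -> time of the latest update

    def update(self, chat_id, message, now=None):
        """Record a new message and return the backup to send for detection."""
        now = time.time() if now is None else now
        self._messages.setdefault(chat_id, []).append(message)
        self._last_update[chat_id] = now
        return list(self._messages[chat_id])

    def purge(self, now=None):
        """Delete chats whose non-update time exceeds the set value."""
        now = time.time() if now is None else now
        expired = [
            chat_id
            for chat_id, last in self._last_update.items()
            if now - last > self.max_idle_seconds
        ]
        for chat_id in expired:
            del self._messages[chat_id]
            del self._last_update[chat_id]
        return expired


store = ChatBackupStore(max_idle_seconds=60)
store.update("chat-a", "hello", now=0)
target = store.update("chat-b", "send the code", now=50)
removed = store.purge(now=100)  # chat-a idle for 100 s, chat-b for 50 s
```

Passing `now` explicitly keeps the example deterministic; in normal use the wall-clock default applies.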

the anomaly screening unit comprises a keyword register, a keyword comparison processor and a paragraph screening processor, wherein the keyword register is used for storing keyword information, the keyword comparison processor is used for comparing the chat information with the keywords, and the paragraph screening processor is used for selecting the paragraphs whose keyword content exceeds an anomaly threshold and sending them to the intelligent fraud identification unit;
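A simple way to picture the keyword comparison and paragraph screening is shown below. It assumes that "keyword content" means the number of keyword occurrences in a paragraph, which is an interpretation rather than something the text states; the keyword list, the threshold and the function name are hypothetical.

```python
def screen_paragraphs(paragraphs, keywords, anomaly_threshold):
    """Return the paragraphs whose keyword hit count exceeds the threshold.

    Keywords are assumed to be stored in lowercase.
    """
    flagged = []
    for paragraph in paragraphs:
        lowered = paragraph.lower()
        hits = sum(lowered.count(keyword) for keyword in keywords)
        if hits > anomaly_threshold:
            flagged.append(paragraph)
    return flagged


# Hypothetical keyword register contents and chat paragraphs.
keywords = ["transfer", "safe account", "urgent"]
paragraphs = [
    "Transfer the money to this safe account now, it is urgent.",
    "See you at dinner tonight.",
]
flagged = screen_paragraphs(paragraphs, keywords, anomaly_threshold=1)
```

Only the first paragraph is flagged, since it contains three keyword hits against a threshold of one.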

referring to FIG. 5, the intelligent fraud identification unit includes a code conversion processor, an identification control processor and a fraud calculation processor, wherein the code conversion processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the identification control processor is used for acquiring all types of feature information from the feature storage unit and sending them in sequence to the fraud calculation processor, and the fraud calculation processor is used for performing calculation on the structural code string information based on the feature information and feeding the result back to the identification control processor;

the fraud calculation processor calculates an evaluation value of the structural code string; when the evaluation value is larger than the early warning value, the structural code string is judged to be fraudulent, and the identification control processor determines the corresponding fraud type according to the type of the feature information that was sent;
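The interplay between the identification control processor and the fraud calculation processor can be sketched as a loop over the stored feature information of each fraud type. The `FeatureInfo` record, its field names and the sample data are hypothetical, and the evaluation is again assumed to be the weighted sum of K_i · E(i) over the stored statistical items.

```python
from dataclasses import dataclass


@dataclass
class FeatureInfo:
    """Hypothetical stored feature information for one fraud type."""
    fraud_type: str
    items: list           # statistical items: pairs of adjacent structural codes
    weights: list         # weight coefficients K_i, aligned with items
    warning_value: float  # early warning value for this fraud type


def evaluate(codes, feature):
    """ASSUMED evaluation: sum of K_i for each stored item present in codes."""
    present = set(zip(codes, codes[1:]))
    return sum(
        k for item, k in zip(feature.items, feature.weights) if item in present
    )


def identify(codes, features):
    """Try each fraud type in turn; return the first whose warning value is exceeded."""
    for feature in features:
        if evaluate(codes, feature) > feature.warning_value:
            return feature.fraud_type
    return None


features = [
    FeatureInfo("investment", [("A", "B"), ("B", "C")], [1.0, 1.0], 1.0),
    FeatureInfo("impersonation", [("D", "E")], [2.0], 1.0),
]
result = identify(["A", "B", "C"], features)
```

A `None` result means no fraud type matched, in which case no early warning would be raised for that paragraph.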

the early warning feedback unit comprises a case screening processor and a case display processor, wherein the case screening processor is used for screening cases of the same fraud type from the data storage unit, and the case display processor is used for displaying case contents in a pop-up window on the terminal;

the index i used above is an ordinal denoting a sequence number.

The foregoing disclosure is only a preferred embodiment of the present invention and is not intended to limit the scope of the invention, so that all equivalent technical changes made by applying the description of the present invention and the accompanying drawings are included in the scope of the present invention, and in addition, elements in the present invention can be updated as the technology develops.

Claims

1. The text feature fraud early warning system based on artificial intelligence is characterized by comprising a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;

the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified.

2. The artificial intelligence based text feature fraud early warning system of claim 1, wherein the feature analysis unit includes a semantic analysis processor for reading and understanding text information, a structure analysis processor for structurally encoding each paragraph based on the semantic information, and a feature generation processor for analyzing the structural code string to generate feature information.

3. The text feature fraud early warning system based on artificial intelligence as claimed in claim 2, wherein the parsing process of the structure analysis processor for each segment of the semantic code sequence comprises the following steps:

S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary unusual vector according to the positions of the common codes and the unusual codes in the sequence;

S23, calculating a structural coding value according to the following formula:

；

S24, assigning the corresponding structural code according to the range in which the structural coding value falls.

4. The text feature fraud early warning system based on artificial intelligence as recited in claim 3, wherein the process by which said feature generation processor generates feature information comprises the following steps:

S31, acquiring the structural code string information of all cases of the same type;

；

S34, calculating the standard deviation of the evaluation values V of all cases;

S36, sending the weight coefficients {K_i}, the early warning value and the statistical item information to the feature storage unit as feature information.

5. The text feature fraud early warning system based on artificial intelligence as recited in claim 4, wherein said intelligent fraud identification unit includes a code conversion processor for sending paragraph information to said feature analysis unit and obtaining structural code string information, an identification control processor for acquiring all types of feature information from said feature storage unit and sending them in sequence to said fraud calculation processor, and a fraud calculation processor for performing calculation on the structural code string information based on the feature information and feeding the result back to said identification control processor.