CN117891926A - Text feature fraud early warning system based on artificial intelligence - Google Patents
- Publication number
- CN117891926A (application CN202410294859.1A)
- Authority
- CN
- China
- Prior art keywords
- information
- fraud
- unit
- text
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/018—Certifying business or products
- G06Q30/0185—Product, service or business identity fraud
Abstract
The invention provides an artificial-intelligence-based text feature fraud early warning system, which relates to the field of electric digital data processing and comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module. The case acquisition module is used for acquiring real fraud case information; the text analysis module is used for extracting the text information in the cases and analyzing it to obtain text features; the information interception module is used for acquiring the chat information received by a terminal; and the fraud early warning module is used for performing fraud recognition and early warning on the chat information. Based on a large number of actual cases, the system intelligently analyzes the text information to obtain feature information and uses that feature information to identify fraud in the chat content on the terminal, so that effective early warning of fraudulent information can be given.
Description
Technical Field
The invention relates to the field of electric digital data processing, in particular to a text feature fraud early warning system based on artificial intelligence.
Background
At present, many software applications offer social functions. These functions can expand a user's social circle, but the risk of fraud increases accordingly, especially for elderly users, whose trust is easily gained through chat and who are then defrauded. A system is therefore needed that identifies fraud in the text of chat content and gives timely early warning, so that the user is alerted and the possibility of loss is reduced.
The foregoing discussion of the background art is intended only to facilitate an understanding of the present invention. It is not an acknowledgement or admission that any of the material referred to was part of the common general knowledge.
Many fraud early warning systems have been developed. Through extensive search and reference, existing systems such as the one disclosed in publication number CN114641004B were found. These systems generally include a data acquisition module, a data analysis module and a mobile terminal feedback module. The data acquisition module acquires the text data and/or voice data transmitted over a network or by telecommunication within a time period and sends it to the data analysis module; the data analysis module analyzes the text data and/or voice data collected in that time period item by item and judges whether it is fraudulent; if fraud is present, the data analysis module compares it with the existing fraud patterns, and if no such pattern exists, the data analysis module alerts the connected mobile terminal feedback module that the fraud pattern does not exist. However, such a system analyzes the text content itself, so new fraud content is not easily identified.
Disclosure of Invention
The invention aims to provide a text feature fraud early warning system based on artificial intelligence that addresses the above shortcomings.
The invention adopts the following technical scheme:
a text feature fraud early warning system based on artificial intelligence comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
the case collection module comprises an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
the text analysis module comprises a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained through analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified;
further, the feature analysis unit comprises a semantic analysis processor, a structure analysis processor and a feature generation processor, wherein the semantic analysis processor is used for reading and understanding text information, the structure analysis processor structurally codes each paragraph based on semantic information, and the feature generation processor analyzes the structural code strings to generate feature information;
further, the parsing process of the structure parsing processor for each section of semantic coding sequence comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector $(Tp_1, Tp_2, \ldots, Tp_n)$,
where $Tp_i$ represents the type value of the $i$-th semantic code and $n$ is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and uncommon codes based on how common they are, and obtaining a secondary common vector and a secondary uncommon vector according to the positions of the common codes and the uncommon codes in the sequence;
S23, calculating a structural coding value according to a formula (the formula is not reproduced in this text);
S24, assigning the corresponding structural code according to the range in which the structural coding value falls;
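As a small worked illustration of steps S21 and S22, consider a sequence of four semantic codes. The type values, the split into common and uncommon codes, and the symbol names $\mathbf{B}$, $\mathbf{C}$ and $\mathbf{U}$ are all invented for this example; the 0/1 convention follows the definition given in Embodiment two.

```latex
% Basic vector of type values (illustrative numbers only)
\[
\mathbf{B} = (Tp_1, Tp_2, Tp_3, Tp_4) = (3, 1, 4, 1)
\]
% Suppose the 2nd and 4th codes are common and the 1st and 3rd are uncommon
\[
\mathbf{C} = (0, 1, 0, 1), \qquad
\mathbf{U} = (1, 0, 1, 0), \qquad
\mathbf{C} + \mathbf{U} = (1, 1, 1, 1)
\]
```

The two secondary vectors are complementary: every position is marked in exactly one of them.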
further, the process of generating the feature information by the feature generation processor includes the following steps:
S31, acquiring the structural coding strings of all cases of the same type;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, and sorting the counts from high to low to obtain a sequence $\{N(i)\}$;
S33, retaining the statistical items whose count in the sequence is larger than the effective threshold, and calculating an evaluation value $V$ for the structural coding string of each case according to a formula (the formula is not reproduced in this text),
where $m$ is the number of retained statistical items, $k_i$ is the weight coefficient of the $i$-th statistical item, and $E(i)$ indicates whether the $i$-th statistical item is present in the structural coding string: $E(i) = 1$ when it is present and $E(i) = 0$ when it is not;
S34, calculating the standard deviation of the evaluation values $V$ of all cases;
S35, adjusting the weight coefficients $\{k_i\}$ and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients $\{k_i\}$, the early warning value and the statistical item information to the feature storage unit as feature information;
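The formula for the evaluation value is not reproduced in the text. One reading that is consistent with the stated definitions of $m$, $k_i$ and $E(i)$ is a weighted sum over the retained statistical items; this is an assumption, not a quotation of the patent's formula. The standard deviation in S34 is written here in its population form over $M$ cases, where $M$, $V_j$ and $\bar{V}$ are names introduced only for this sketch.

```latex
% Assumed form of the evaluation value (S33)
\[
V = \sum_{i=1}^{m} k_i \, E(i), \qquad E(i) \in \{0, 1\}
\]
% Standard deviation of the evaluation values over M cases (S34)
\[
\sigma = \sqrt{\frac{1}{M} \sum_{j=1}^{M} \left( V_j - \bar{V} \right)^2},
\qquad
\bar{V} = \frac{1}{M} \sum_{j=1}^{M} V_j
\]
```

Under this reading, with $m = 3$, all weights equal to 1 and the first and third items present, $V = 1 + 0 + 1 = 2$.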
further, the intelligent fraud recognition unit comprises a code conversion processor, a recognition control processor and a fraud calculation processor, wherein the code conversion processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the recognition control processor is used for acquiring all types of feature information from the feature storage unit and sequentially sending the feature information to the fraud calculation processor, and the fraud calculation processor is used for carrying out calculation processing on the structural code string information based on the feature information and feeding the structural code string information back to the recognition control processor.
The beneficial effects obtained by the invention are as follows:
The system classifies existing fraud cases and intelligently analyzes their text content to obtain structural feature information. Compared with text feature information, structural feature information identifies new fraud content more easily. The end user is warned by displaying the content of the corresponding case, which reduces the possibility of the user suffering a loss through fraud.
For a further understanding of the nature and the technical aspects of the present invention, reference should be made to the following detailed description of the invention and the accompanying drawings, which are provided for purposes of reference only and are not intended to limit the invention.
Drawings
FIG. 1 is a schematic diagram of the overall structural framework of the present invention;
FIG. 2 is a schematic diagram of a case acquisition module according to the present invention;
FIG. 3 is a schematic diagram of a text parsing module according to the present invention;
FIG. 4 is a schematic diagram of a feature analysis unit according to the present invention;
FIG. 5 is a schematic diagram showing the construction of the intelligent fraud recognition unit.
Detailed Description
The following embodiments of the present invention are described in terms of specific examples, and those skilled in the art will appreciate the advantages and effects of the present invention from the disclosure herein. The invention is capable of other and different embodiments and its several details are capable of modification and variation in various respects, all without departing from the spirit of the present invention. The drawings of the present invention are merely schematic illustrations, and are not intended to be drawn to actual dimensions. The following embodiments will further illustrate the related art content of the present invention in detail, but the disclosure is not intended to limit the scope of the present invention.
Embodiment one: the embodiment provides a text feature fraud early warning system based on artificial intelligence, which comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module in combination with fig. 1;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
the case collection module comprises an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
the text analysis module comprises a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained through analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified;
the feature analysis unit comprises a semantic analysis processor, a structure analysis processor and a feature generation processor, wherein the semantic analysis processor is used for reading and understanding text information, the structure analysis processor structurally codes each paragraph based on semantic information, and the feature generation processor analyzes the structural code strings to generate feature information;
the parsing process of the structure parsing processor for each section of semantic coding sequence comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector $(Tp_1, Tp_2, \ldots, Tp_n)$,
where $Tp_i$ represents the type value of the $i$-th semantic code and $n$ is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and uncommon codes based on how common they are, and obtaining a secondary common vector and a secondary uncommon vector according to the positions of the common codes and the uncommon codes in the sequence;
S23, calculating a structural coding value according to a formula (the formula is not reproduced in this text);
S24, assigning the corresponding structural code according to the range in which the structural coding value falls;
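Step S24 maps a numeric value onto a code by range. A minimal sketch of such a lookup is given below; the boundaries and the code labels are hypothetical, since the text does not list the actual ranges.

```python
from bisect import bisect_right

# Hypothetical range boundaries and code labels (not given in the source).
# Ranges are half-open: value < 10 -> "A", 10 <= value < 20 -> "B", and so on.
BOUNDARIES = [10.0, 20.0, 30.0]
CODES = ["A", "B", "C", "D"]


def structural_code(value: float) -> str:
    """Return the structural code whose range contains `value`."""
    return CODES[bisect_right(BOUNDARIES, value)]
```

A binary search keeps the lookup simple even if the number of ranges grows.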
the process of generating feature information by the feature generation processor comprises the following steps:
S31, acquiring the structural coding strings of all cases of the same type;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, and sorting the counts from high to low to obtain a sequence $\{N(i)\}$;
S33, retaining the statistical items whose count in the sequence is larger than the effective threshold, and calculating an evaluation value $V$ for the structural coding string of each case according to a formula (the formula is not reproduced in this text),
where $m$ is the number of retained statistical items, $k_i$ is the weight coefficient of the $i$-th statistical item, and $E(i)$ indicates whether the $i$-th statistical item is present in the structural coding string: $E(i) = 1$ when it is present and $E(i) = 0$ when it is not;
S34, calculating the standard deviation of the evaluation values $V$ of all cases;
S35, adjusting the weight coefficients $\{k_i\}$ and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients $\{k_i\}$, the early warning value and the statistical item information to the feature storage unit as feature information;
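The counting in step S32 and the filtering at the start of step S33 can be sketched as follows. The sketch assumes that each structural coding string is a Python string with one character per structural code; the function names are illustrative only.

```python
from collections import Counter


def count_adjacent_pairs(code_strings):
    """Count every pair of adjacent structural codes, sorted from high to low."""
    counts = Counter()
    for s in code_strings:
        # zip(s, s[1:]) yields each code together with its right-hand neighbour.
        counts.update(zip(s, s[1:]))
    return counts.most_common()


def retain(ranked, effective_threshold):
    """Keep only the statistical items whose count exceeds the threshold."""
    return [pair for pair, n in ranked if n > effective_threshold]
```

For the two strings "ABAB" and "ABC", the pair (A, B) occurs three times and is the only item retained with an effective threshold of 1.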
the intelligent fraud identification unit comprises a code conversion processor, an identification control processor and a fraud calculation processor, wherein the code conversion processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the identification control processor is used for acquiring all types of feature information from the feature storage unit and sequentially sending the feature information to the fraud calculation processor, and the fraud calculation processor is used for carrying out calculation processing on the structural code string information based on the feature information and feeding the structural code string information back to the identification control processor.
Embodiment two: the embodiment comprises the whole content of the first embodiment, and provides a text feature fraud early warning system based on artificial intelligence, which comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
referring to fig. 2, the case collection module includes an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
referring to fig. 3, the text analysis module includes a text extraction unit, a feature analysis unit and a feature storage unit, where the text extraction unit is used to extract text information in each case, the feature analysis unit is used to analyze the text information to obtain text features, and the feature storage unit is used to store the text feature information obtained by analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified;
the privacy processing unit comprises an exclusive information detection processor and an information replacement processor, wherein the exclusive information detection processor is used for detecting exclusive information in the case information, and the information replacement processor is used for replacing the detected exclusive information with generic substitute words;
the exclusive information comprises a person name, a place name and the like;
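A minimal sketch of the detect-and-replace behaviour is shown below. The lexicons and the placeholder labels are hypothetical, and a simple dictionary lookup stands in for whatever detection method the system actually uses, which the text does not specify.

```python
import re

# Hypothetical lexicons; the source does not say how exclusive information is detected.
PERSON_NAMES = {"Zhang Wei", "Li Na"}
PLACE_NAMES = {"Beijing", "Shanghai"}


def anonymize(text: str) -> str:
    """Replace detected person and place names with generic placeholder words."""
    for label, lexicon in (("[PERSON]", PERSON_NAMES), ("[PLACE]", PLACE_NAMES)):
        # Longer terms first, so a long name is not partially replaced by a shorter one.
        for term in sorted(lexicon, key=len, reverse=True):
            text = re.sub(re.escape(term), label, text)
    return text
```

For example, `anonymize("Zhang Wei met Li Na in Beijing.")` yields `"[PERSON] met [PERSON] in [PLACE]."`.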
the case classification unit comprises a key identification processor and a classification coding processor, wherein the key identification processor is used for identifying key information in case information, and the classification coding processor is used for classifying and coding cases according to the identified key information;
the key information includes, but is not limited to, the fraud approach, the fraud content and the fraud means;
the fraud approach refers to the identity relationship established between the fraudster and the defrauded person, the fraud content refers to the chat content between the fraudster and the defrauded person, and the fraud means refers to the way in which the fraudster obtains profit;
the text extraction unit comprises an identity locking processor and a text information register, wherein the identity locking processor is used for locking the identification information of the fraudster in each case, and the text information register is used for extracting and storing the text information sent under that identification information;
the text information stored in the text information register is partitioned according to the classification of the cases;
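The extraction and partitioned storage described above might look like the following sketch. The `Message` record, the identifiers and the class names are assumptions made for illustration and do not come from the source.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    sender_id: str  # identification information of whoever sent the message
    text: str


def extract_fraudster_text(messages, fraudster_id):
    """Keep only the text sent under the locked fraudster identification."""
    return [m.text for m in messages if m.sender_id == fraudster_id]


class TextRegister:
    """Stores extracted text in one partition per case classification."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def store(self, case_class, texts):
        self._partitions[case_class].extend(texts)

    def partition(self, case_class):
        # .get avoids creating an empty partition on a read.
        return list(self._partitions.get(case_class, []))
```

Keeping one partition per classification lets the later feature generation step read all text of one fraud type in a single lookup.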
referring to fig. 4, the feature analysis unit includes a semantic analysis processor, a structure analysis processor and a feature generation processor, where the semantic analysis processor is used to read and understand text information, the structure analysis processor structurally encodes each paragraph based on semantic information, and the feature generation processor analyzes the structural encoding string to generate feature information;
the semantic analysis processor and the structure analysis processor each operate on a single case at a time;
the feature generation processor analyzes and processes all cases of the same type together;
the process of the semantic analysis processor for analyzing the text information comprises the following steps:
s1, identifying keywords in text information;
s2, converting the keywords into semantic codes;
s3, segmenting semantic codes based on coding integrity, wherein the coding integrity refers to the fact that each segment contains necessary semantic coding types;
the semantic information consists of a plurality of sections of semantic codes which are arranged in sequence;
in step S2, the similar keywords are converted into the same semantic codes;
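Steps S1 to S3 can be sketched as below. The keyword table, the code names, the type labels and the set of required types are all hypothetical. Attaching leftover codes to the last complete segment is a choice made for this sketch, not something the text prescribes.

```python
import re

# Hypothetical tables: similar keywords share one semantic code (step S2).
KEYWORD_TO_CODE = {
    "transfer": "PAY", "remit": "PAY",
    "bank": "ORG", "police": "ORG",
    "urgent": "TIME", "immediately": "TIME",
}
CODE_TYPE = {"PAY": "action", "ORG": "actor", "TIME": "pressure"}
REQUIRED_TYPES = {"actor", "action"}  # assumed definition of a complete segment


def to_semantic_codes(text):
    """S1 and S2: find keywords and convert them to semantic codes."""
    words = re.findall(r"[a-z]+", text.lower())
    return [KEYWORD_TO_CODE[w] for w in words if w in KEYWORD_TO_CODE]


def segment(codes):
    """S3: close a segment once it contains every required code type."""
    segments, current, seen = [], [], set()
    for code in codes:
        current.append(code)
        seen.add(CODE_TYPE[code])
        if REQUIRED_TYPES <= seen:
            segments.append(current)
            current, seen = [], set()
    if current:  # leftover codes that never became complete
        if segments:
            segments[-1].extend(current)
        else:
            segments.append(current)
    return segments
```

The output is a list of code sequences in their original order, which matches the description of semantic information as several sequentially arranged sections of semantic codes.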
the parsing process of the structure parsing processor for each section of semantic coding sequence comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector $(Tp_1, Tp_2, \ldots, Tp_n)$,
where $Tp_i$ represents the type value of the $i$-th semantic code and $n$ is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and uncommon codes based on how common they are, and obtaining a secondary common vector and a secondary uncommon vector according to the positions of the common codes and the uncommon codes in the sequence; in the secondary common vector, the element corresponding to a common code is 1 and the element corresponding to an uncommon code is 0, and in the secondary uncommon vector, the element corresponding to a common code is 0 and the element corresponding to an uncommon code is 1;
S23, calculating a structural coding value according to a formula (the formula is not reproduced in this text);
S24, assigning the corresponding structural code according to the range in which the structural coding value falls;
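The vectors of steps S21 and S22 follow directly from the definitions above and can be built as in this sketch. The structural coding value in step S23 cannot be reproduced, because its formula is missing from the text; `structural_value` below is a stand-in that combines the three vectors in one plausible way, and its weights are pure assumptions.

```python
def parse_structure(codes, type_value, common_codes):
    """S21 and S22: build the basic vector and the two secondary vectors."""
    base = [type_value[c] for c in codes]
    common = [1 if c in common_codes else 0 for c in codes]
    uncommon = [1 - x for x in common]  # complement of the common vector
    return base, common, uncommon


def structural_value(base, common, uncommon, w_common=1.0, w_uncommon=2.0):
    """ASSUMED placeholder for S23; the patent's actual formula is not available.

    Each type value is weighted by whether its code is common or uncommon.
    """
    return sum(b * (w_common * c + w_uncommon * u)
               for b, c, u in zip(base, common, uncommon))
```

With type values ORG = 1, PAY = 2, TIME = 3 and only ORG treated as common, the sequence ORG, PAY, TIME gives the vectors (1, 2, 3), (1, 0, 0) and (0, 1, 1), and the placeholder value is 1 + 4 + 6 = 11.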
the process of generating feature information by the feature generation processor comprises the following steps:
S31, acquiring the structural coding strings of all cases of the same type;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, sorting the counts from high to low to obtain a sequence $\{N(i)\}$, and at the same time establishing a lookup table of the statistical items, namely a mapping table between the serial number of each statistical item and its pair of adjacent structural codes;
S33, retaining the statistical items whose count in the sequence is larger than the effective threshold, and calculating an evaluation value $V$ for the structural coding string of each case according to a formula (the formula is not reproduced in this text),
where $m$ is the number of retained statistical items, $k_i$ is the weight coefficient of the $i$-th statistical item, and $E(i)$ indicates whether the $i$-th statistical item is present in the structural coding string: $E(i) = 1$ when it is present and $E(i) = 0$ when it is not;
all weight coefficients $\{k_i\}$ are 1 in the initial state;
S34, calculating the standard deviation of the evaluation values $V$ of all cases;
S35, adjusting the weight coefficients $\{k_i\}$ and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients $\{k_i\}$, the early warning value and the statistical item information to the feature storage unit as feature information;
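The loop of steps S33 to S35 can be sketched as follows, under two assumptions that the text does not confirm: the evaluation value is taken to be the weighted sum of $k_i E(i)$, and the weights are adjusted by a gradient step that reduces the variance of the evaluation values. The text does not state how the weights are adjusted, so any rule that lowers the standard deviation could be substituted.

```python
from statistics import pstdev


def evaluate(weights, presence):
    """Assumed evaluation value: V = sum of k_i * E(i)."""
    return sum(k * e for k, e in zip(weights, presence))


def fit_weights(cases, stability_threshold, step=0.5, max_rounds=1000):
    """cases[j][i] is E(i) for case j. Returns (weights, early warning value)."""
    n_cases, m = len(cases), len(cases[0])
    weights = [1.0] * m  # all weights are 1 in the initial state
    mean_e = [sum(c[i] for c in cases) / n_cases for i in range(m)]
    for _ in range(max_rounds):
        values = [evaluate(weights, c) for c in cases]  # S33
        if pstdev(values) < stability_threshold:        # S34 and stop test of S35
            break
        mean_v = sum(values) / n_cases
        for i in range(m):
            # Assumed adjustment rule: step against the variance gradient.
            grad = sum((v - mean_v) * (c[i] - mean_e[i])
                       for v, c in zip(values, cases)) / n_cases
            weights[i] = max(0.0, weights[i] - step * grad)
    # If max_rounds is exhausted, the current weights are returned as they are.
    values = [evaluate(weights, c) for c in cases]
    return weights, min(values)
```

Under this rule, an item that is present in every case keeps its initial weight, while items that appear in only some cases are down-weighted until the evaluation values of the cases agree to within the stability threshold.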
the information management unit comprises a time control processor and an update detection processor, wherein the time control processor is used for recording the non-update time of each piece of chat information and deleting the corresponding chat information when the non-update time exceeds a set value, and the update detection processor is used for sending backup information to the fraud early warning module as an early warning detection target when the chat information update is detected;
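The two responsibilities of the information management unit, expiring stale backups and forwarding updated ones, can be sketched as below. The class name, the callback and the explicit `now` parameter are assumptions introduced to keep the sketch self-contained and testable.

```python
import time


class ChatBackupManager:
    """Tracks backed-up chats, forwards updates, and purges stale chats."""

    def __init__(self, max_idle_seconds, on_update):
        self.max_idle = max_idle_seconds
        self.on_update = on_update  # stands in for the fraud early warning module
        self._chats = {}            # chat_id -> (messages, time of last update)

    def record(self, chat_id, message, now=None):
        """Update detection: store the message and forward the backup."""
        now = time.time() if now is None else now
        messages, _ = self._chats.get(chat_id, ([], now))
        messages.append(message)
        self._chats[chat_id] = (messages, now)
        self.on_update(chat_id, list(messages))

    def purge(self, now=None):
        """Time control: delete chats not updated for longer than the set value."""
        now = time.time() if now is None else now
        stale = [cid for cid, (_, last) in self._chats.items()
                 if now - last > self.max_idle]
        for cid in stale:
            del self._chats[cid]
        return stale
```

Passing `now` explicitly makes the expiry logic deterministic; in normal use both methods fall back to the system clock.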
the anomaly screening unit comprises a keyword register, a keyword comparison processor and a paragraph screening processor, wherein the keyword register is used for storing keyword information, the keyword comparison processor is used for comparing chat information with keywords, and the paragraph screening processor is used for selecting paragraph information of which the keyword content exceeds an anomaly threshold value and sending the paragraph information to the intelligent fraud identification unit;
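By way of non-limiting illustration only, the keyword comparison and paragraph screening might be sketched as follows. The source does not define how "keyword content" is measured, so the sketch assumes it is the number of keyword occurrences in a paragraph; the function names and the line-based paragraph split are also assumptions.

```python
def keyword_hits(paragraph, keywords):
    """Count case-insensitive keyword occurrences in one paragraph."""
    text = paragraph.lower()
    return sum(text.count(k.lower()) for k in keywords)


def screen_paragraphs(chat_text, keywords, anomaly_threshold):
    """Return the paragraphs whose keyword count exceeds the anomaly threshold."""
    paragraphs = [p.strip() for p in chat_text.split("\n") if p.strip()]
    return [p for p in paragraphs
            if keyword_hits(p, keywords) > anomaly_threshold]


# Hypothetical keyword register and chat content.
keywords = ["transfer", "safe account", "deposit"]
chat = ("Hello, how are you?\n"
        "Please transfer the deposit to this safe account today.\n"
        "See you tomorrow.")
suspicious = screen_paragraphs(chat, keywords, anomaly_threshold=1)
```

Only the paragraphs returned by `screen_paragraphs` would be passed on to the intelligent fraud identification unit, which keeps the more expensive structural analysis off ordinary conversation.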
referring to FIG. 5, the intelligent fraud recognition unit includes a transcoding processor, an identification control processor and a fraud calculation processor, wherein the transcoding processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the identification control processor is used for acquiring all types of feature information from the feature storage unit and sequentially sending the feature information to the fraud calculation processor, and the fraud calculation processor is used for evaluating the structural code string information based on the feature information and feeding the result back to the identification control processor;
the fraud calculation processor calculates an evaluation value of the structural coding string, when the evaluation value is larger than the early warning value, the structural coding string is judged to be fraud, and the identification control processor judges a corresponding fraud type according to the type of the transmitted characteristic information;
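By way of non-limiting illustration only, the interplay between the identification control processor and the fraud calculation processor might be sketched as follows. The `FeatureInfo` record and the function names are hypothetical, the evaluation value is again assumed to be a weighted sum over the statistical items present in the string, and stopping at the first matching fraud type is one possible reading of the text rather than a stated requirement.

```python
from dataclasses import dataclass


@dataclass
class FeatureInfo:
    """Hypothetical record of the feature information stored per fraud type."""
    fraud_type: str
    items: dict           # serial number i -> pair of adjacent structural codes
    weights: dict         # serial number i -> weight coefficient k_i
    warning_value: float  # early warning value for this fraud type


def evaluate(codes, feature):
    """Fraud calculation: ASSUMED weighted sum over the items present."""
    present = set(zip(codes, codes[1:]))
    return sum(feature.weights[i]
               for i, pair in feature.items.items() if pair in present)


def identify(codes, features):
    """Identification control: try each fraud type in turn and report the
    first one whose evaluation value exceeds its early warning value."""
    for feature in features:
        if evaluate(codes, feature) > feature.warning_value:
            return feature.fraud_type
    return None


# Hypothetical feature information for a single fraud type.
investment = FeatureInfo(
    fraud_type="investment",
    items={1: ("A", "B"), 2: ("B", "C")},
    weights={1: 1.0, 2: 1.0},
    warning_value=1.0,
)
result = identify(["A", "B", "C"], [investment])
```

The comparison is strict, as in the text, so a string whose evaluation value merely equals the early warning value is not judged to be fraud in this sketch.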
the early warning feedback unit comprises a case screening processor and a case display processor, wherein the case screening processor is used for screening cases of the same fraud type from the data storage unit, and the case display processor is used for displaying case contents in a pop-up window on the terminal;
the symbol i used above is an index denoting a sequence number.
The foregoing disclosure is only a preferred embodiment of the present invention and is not intended to limit the scope of the invention, so that all equivalent technical changes made by applying the description of the present invention and the accompanying drawings are included in the scope of the present invention, and in addition, elements in the present invention can be updated as the technology develops.
Claims (5)
1. The text feature fraud early warning system based on artificial intelligence is characterized by comprising a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
the case collection module comprises an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
the text analysis module comprises a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained through analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified.
2. The artificial intelligence based text feature fraud pre-warning system of claim 1, wherein the feature parsing unit includes a semantic parsing processor for reading and understanding the text information, a structural parsing processor for structurally encoding each paragraph based on the semantic information, and a feature generation processor for analyzing the structural coding string to generate feature information.
3. The text feature fraud pre-warning system based on artificial intelligence as claimed in claim 2, wherein the process by which the structural parsing processor parses the semantic code sequence of each paragraph comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector,
wherein $Tp_i$ represents the type value of the i-th semantic code and $n$ is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary unusual vector according to the positions of the common codes and the positions of the unusual codes in the sequence;
S23, calculating a structural coding value according to the following formula:
(formula image not reproduced);
S24, assigning the corresponding structural code according to the range in which the structural coding value falls.
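By way of non-limiting illustration only, steps S21, S22 and S24 might be sketched as follows. The formula of step S23 is not reproduced in this text, so the sketch does not attempt it and takes the structural coding value as an input. Representing the secondary vectors as lists of 1-based positions, and the names `type_of`, `boundaries` and `labels`, are assumptions made purely for illustration.

```python
from bisect import bisect_right


def basic_vector(semantic_codes, type_of):
    """S21: map each semantic code to its type value Tp_i."""
    return [type_of[code] for code in semantic_codes]


def secondary_vectors(semantic_codes, common_codes):
    """S22 (ASSUMED form): 1-based positions of common and unusual codes."""
    common = [i for i, c in enumerate(semantic_codes, start=1)
              if c in common_codes]
    unusual = [i for i, c in enumerate(semantic_codes, start=1)
               if c not in common_codes]
    return common, unusual


def structural_code(value, boundaries, labels):
    """S24: pick the code whose value range contains the coding value.
    `boundaries` must be sorted and `labels` needs one more entry than it."""
    return labels[bisect_right(boundaries, value)]


# Hypothetical semantic codes, type values and value ranges.
sequence = ["greet", "ask", "pay", "ask"]
types = basic_vector(sequence, {"greet": 1, "ask": 2, "pay": 3})
common, unusual = secondary_vectors(sequence, {"greet", "ask"})
code = structural_code(15, boundaries=[10, 20], labels=["A", "B", "C"])
```

With these sample inputs a coding value of 15 falls in the middle range and receives the code "B"; the actual ranges and codes would be those defined for the system.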
4. The text feature fraud pre-warning system based on artificial intelligence as recited in claim 3, wherein the process by which said feature generation processor generates feature information comprises the following steps:
S31, acquiring all structural coding string information of the same case;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of every statistical item, and sorting the counts from high to low to obtain a sequence $\{N(i)\}$;
S33, retaining the statistical items whose sequence value $N(i)$ is greater than the effective threshold, and calculating an evaluation value $V$ for the structural coding string of each case according to the following formula:
(formula image not reproduced);
where $m$ is the number of retained statistical items, $k_i$ is the weight coefficient of the i-th statistical item, and $E(i)$ indicates whether the i-th statistical item occurs in the structural coding string, with $E(i) = 1$ when it occurs and $E(i) = 0$ when it does not;
S34, calculating the standard deviation of the evaluation values $V$ of all cases;
S35, adjusting the weight coefficients $\{k_i\}$ and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients $\{k_i\}$, the early warning value and the statistical item information to the feature storage unit as feature information.
5. The text feature fraud pre-warning system based on artificial intelligence as recited in claim 4, wherein said intelligent fraud recognition unit includes a transcoding processor for transmitting paragraph information to said feature parsing unit and obtaining structural code string information, an identification control processor for acquiring all types of feature information from said feature storage unit and sequentially transmitting to said fraud calculation processor, and a fraud calculation processor for calculating and feeding back structural code string information to said identification control processor based on feature information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410294859.1A CN117891926B (en) | 2024-03-15 | 2024-03-15 | Text feature fraud early warning system based on artificial intelligence |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410294859.1A CN117891926B (en) | 2024-03-15 | 2024-03-15 | Text feature fraud early warning system based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117891926A (en) | 2024-04-16 |
CN117891926B (en) | 2024-05-14 |
Family
ID=90647676
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410294859.1A Active CN117891926B (en) | 2024-03-15 | 2024-03-15 | Text feature fraud early warning system based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117891926B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150288791A1 (en) * | 2014-04-03 | 2015-10-08 | Wavemarket, Inc. | Telephone fraud management system and method |
US10460320B1 (en) * | 2016-08-10 | 2019-10-29 | Electronic Arts Inc. | Fraud detection in heterogeneous information networks |
CN111669757A (en) * | 2020-06-15 | 2020-09-15 | 国家计算机网络与信息安全管理中心 | Terminal fraud call identification method based on conversation text word vector |
CN111666765A (en) * | 2020-06-02 | 2020-09-15 | 国家计算机网络与信息安全管理中心 | Fraud topic analysis method and system based on k-means text clustering |
CN113095858A (en) * | 2021-05-07 | 2021-07-09 | 广州市刑事科学技术研究所 | Method for identifying fraud-related short text |
Also Published As
Publication number | Publication date |
---|---|
CN117891926B (en) | 2024-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111694879B (en) | Multielement time sequence abnormal mode prediction method and data acquisition monitoring device | |
CN106202561B (en) | Digitlization contingency management case base construction method and device based on text big data | |
CN111045847B (en) | Event auditing method, device, terminal equipment and storage medium | |
CN111612041B (en) | Abnormal user identification method and device, storage medium and electronic equipment | |
CN111950937A (en) | Key personnel risk assessment method based on fusion space-time trajectory | |
CN111784528A (en) | Abnormal community detection method and device, computer equipment and storage medium | |
CN112347223B (en) | Document retrieval method, apparatus, and computer-readable storage medium | |
CN111753087B (en) | Public opinion text classification method, apparatus, computer device and storage medium | |
CN115296853B (en) | Network attack detection method based on network time-space characteristics | |
CN111368867B (en) | File classifying method and system and computer readable storage medium | |
CN115048464A (en) | User operation behavior data detection method and device and electronic equipment | |
US20220012538A1 (en) | Compact representation and time series segment retrieval through deep learning | |
Onik et al. | An analytical comparison on filter feature extraction method in data mining using J48 classifier | |
Hacker | k-simplex2vec: a simplicial extension of node2vec | |
CN111177367A (en) | Case classification method, classification model training method and related products | |
CN111797177A (en) | Financial time sequence classification method for abnormal financial account detection and application | |
CN116976318A (en) | Intelligent auditing system for switching operation ticket of power grid based on deep learning and model reasoning | |
CN110659997A (en) | Data cluster identification method and device, computer system and readable storage medium | |
CN117891926B (en) | Text feature fraud early warning system based on artificial intelligence | |
CN115344563B (en) | Data deduplication method and device, storage medium and electronic equipment | |
CN116452353A (en) | Financial data management method and system | |
CN115982646A (en) | Multi-source test data management method and system based on cloud platform | |
CN112559823B (en) | Data standardized data acquisition method | |
CN115510248A (en) | Method for constructing and analyzing person behavior characteristic knowledge graph based on deep learning | |
CN114298712A (en) | Encryption currency abnormal transaction detection method and application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||