CN117891926A - Text feature fraud early warning system based on artificial intelligence - Google Patents

Text feature fraud early warning system based on artificial intelligence

Info

Publication number
CN117891926A
Authority
CN
China
Prior art keywords
information
fraud
unit
text
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410294859.1A
Other languages
Chinese (zh)
Other versions
CN117891926B (en)
Inventor
张卫平
李显阔
邵胜博
王晶
丁洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Global Digital Group Co Ltd
Original Assignee
Global Digital Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Global Digital Group Co Ltd
Priority to CN202410294859.1A
Publication of CN117891926A
Application granted
Publication of CN117891926B
Current legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/018Certifying business or products
    • G06Q30/0185Product, service or business identity fraud

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Quality & Reliability (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Human Computer Interaction (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides an artificial-intelligence-based text feature fraud early warning system in the field of electric digital data processing. The system comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module. The case acquisition module acquires real fraud case information; the text analysis module extracts the text information in each case and analyzes it to obtain text features; the information interception module acquires chat information received by a terminal; and the fraud early warning module performs fraud recognition and early warning on the chat information. Drawing on a large number of actual cases, the system intelligently analyzes the text information to obtain feature information and uses that feature information to perform fraud recognition on the chat content of the terminal, so that fraudulent information can be flagged effectively.

Description

Text feature fraud early warning system based on artificial intelligence
Technical Field
The invention relates to the field of electric digital data processing, in particular to a text feature fraud early warning system based on artificial intelligence.
Background
Currently, many software applications offer social functions. These functions expand users' social circles but also increase the risk of fraud. The elderly are especially vulnerable: a fraudster can easily win their trust through chat and then defraud them. A system is therefore needed that identifies fraud in the text of chat content and issues timely warnings, so that the user is alerted and the likelihood of loss is reduced.
The foregoing discussion of the background art is intended only to facilitate an understanding of the present invention. It is not an acknowledgement or admission that any of the material referred to was part of the common general knowledge.
Many fraud early warning systems have been developed. An extensive search of the literature found existing systems such as the one disclosed in publication number CN114641004B. Such systems generally include a data acquisition module, a data analysis module and a mobile terminal feedback module. The data acquisition module acquires text data and/or voice data transmitted over a network or by telecommunication within a time period and sends it to the data analysis module. The data analysis module analyzes the text data and/or voice data collected in the time period item by item and judges whether it is fraudulent; if the data is fraudulent or is being used for fraud, the data analysis module compares it against the existing fraud modes, and if no matching fraud mode exists, it notifies the mobile terminal feedback module connected to it that the fraud mode does not exist. However, because such a system analyzes the text content itself, new fraud content is not easily identified.
Disclosure of Invention
The invention aims to provide an artificial-intelligence-based text feature fraud early warning system that addresses the above deficiencies.
The invention adopts the following technical scheme:
a text feature fraud early warning system based on artificial intelligence comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
the case collection module comprises an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
the text analysis module comprises a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained through analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified;
further, the feature analysis unit comprises a semantic analysis processor, a structure analysis processor and a feature generation processor, wherein the semantic analysis processor is used for reading and understanding text information, the structure analysis processor structurally codes each paragraph based on semantic information, and the feature generation processor analyzes the structural code strings to generate feature information;
further, the parsing process of the structure parsing processor for each section of semantic coding sequence comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector $(Tp_1, Tp_2, \dots, Tp_n)$,
where $Tp_i$ represents the type value of the $i$-th semantic code and $n$ is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary unusual vector according to the positions of the common codes and of the unusual codes in the sequence;
S23, calculating a structural coding value according to a formula (not reproduced in this text);
S24, assigning a corresponding structural code according to the range in which the structural coding value falls;
further, the process of generating the feature information by the feature generation processor includes the following steps:
S31, acquiring all structural coding string information of cases of the same type;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, and sorting the counts from high to low to obtain a sequence $\{N(i)\}$;
S33, retaining the statistical items whose sequence value is larger than the effective threshold, and calculating an evaluation value $V$ of each case's structural coding string according to a formula (not reproduced in this text),
where $m$ is the number of retained statistical items, $k_i$ is the weight coefficient of the $i$-th statistical item, and $E(i)$ indicates whether the $i$-th statistical item is present in the structural coding string: $E(i)=1$ when it is present and $E(i)=0$ when it is not;
S34, calculating the standard deviation of the evaluation values $V$ of all cases;
S35, adjusting the weight coefficients $\{k_i\}$ and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients $\{k_i\}$, the early warning value and the statistical item information to the feature storage unit as feature information;
further, the intelligent fraud recognition unit comprises a code conversion processor, a recognition control processor and a fraud calculation processor, wherein the code conversion processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the recognition control processor is used for acquiring all types of feature information from the feature storage unit and sequentially sending the feature information to the fraud calculation processor, and the fraud calculation processor is used for carrying out calculation processing on the structural code string information based on the feature information and feeding the structural code string information back to the recognition control processor.
The beneficial effects obtained by the invention are as follows:
the system classifies existing fraud cases and intelligently analyzes their text content to obtain structural feature information. Compared with textual feature information, structural feature information makes new fraud content easier to identify. The end user is warned by displaying the content of the corresponding cases, which reduces the likelihood that the user suffers a loss through fraud.
For a further understanding of the nature and the technical aspects of the present invention, reference should be made to the following detailed description of the invention and the accompanying drawings, which are provided for purposes of reference only and are not intended to limit the invention.
Drawings
FIG. 1 is a schematic diagram of the overall structural framework of the present invention;
fig. 2 is a schematic diagram of a case acquisition module according to the present invention;
FIG. 3 is a schematic diagram of a text parsing module according to the present invention;
FIG. 4 is a schematic diagram of a feature analysis unit according to the present invention;
FIG. 5 is a schematic diagram showing the construction of the intelligent fraud recognition unit.
Detailed Description
The following embodiments of the present invention are described in terms of specific examples, and those skilled in the art will appreciate the advantages and effects of the present invention from the disclosure herein. The invention is capable of other and different embodiments and its several details are capable of modification and variation in various respects, all without departing from the spirit of the present invention. The drawings of the present invention are merely schematic illustrations, and are not intended to be drawn to actual dimensions. The following embodiments will further illustrate the related art content of the present invention in detail, but the disclosure is not intended to limit the scope of the present invention.
Embodiment one: the embodiment provides a text feature fraud early warning system based on artificial intelligence, which comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module in combination with fig. 1;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
the case collection module comprises an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
the text analysis module comprises a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained through analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified;
the feature analysis unit comprises a semantic analysis processor, a structure analysis processor and a feature generation processor, wherein the semantic analysis processor is used for reading and understanding text information, the structure analysis processor structurally codes each paragraph based on semantic information, and the feature generation processor analyzes the structural code strings to generate feature information;
the parsing process of the structure parsing processor for each section of semantic coding sequence comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector $(Tp_1, Tp_2, \dots, Tp_n)$,
where $Tp_i$ represents the type value of the $i$-th semantic code and $n$ is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary unusual vector according to the positions of the common codes and of the unusual codes in the sequence;
S23, calculating a structural coding value according to a formula (not reproduced in this text);
S24, assigning a corresponding structural code according to the range in which the structural coding value falls;
the process of generating feature information by the feature generation processor comprises the following steps:
S31, acquiring all structural coding string information of cases of the same type;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, and sorting the counts from high to low to obtain a sequence $\{N(i)\}$;
S33, retaining the statistical items whose sequence value is larger than the effective threshold, and calculating an evaluation value $V$ of each case's structural coding string according to a formula (not reproduced in this text),
where $m$ is the number of retained statistical items, $k_i$ is the weight coefficient of the $i$-th statistical item, and $E(i)$ indicates whether the $i$-th statistical item is present in the structural coding string: $E(i)=1$ when it is present and $E(i)=0$ when it is not;
S34, calculating the standard deviation of the evaluation values $V$ of all cases;
S35, adjusting the weight coefficients $\{k_i\}$ and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients $\{k_i\}$, the early warning value and the statistical item information to the feature storage unit as feature information;
the intelligent fraud identification unit comprises a code conversion processor, an identification control processor and a fraud calculation processor, wherein the code conversion processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the identification control processor is used for acquiring all types of feature information from the feature storage unit and sequentially sending the feature information to the fraud calculation processor, and the fraud calculation processor is used for carrying out calculation processing on the structural code string information based on the feature information and feeding the structural code string information back to the identification control processor.
Embodiment two: the embodiment comprises the whole content of the first embodiment, and provides a text feature fraud early warning system based on artificial intelligence, which comprises a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
referring to fig. 2, the case collection module includes an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
referring to fig. 3, the text analysis module includes a text extraction unit, a feature analysis unit and a feature storage unit, where the text extraction unit is used to extract text information in each case, the feature analysis unit is used to analyze the text information to obtain text features, and the feature storage unit is used to store the text feature information obtained by analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified;
the privacy processing unit comprises an exclusive information detection processor and an information replacement processor, wherein the exclusive information detection processor is used for detecting exclusive information (information that identifies a specific person or place) in the case information, and the information replacement processor is used for replacing the detected exclusive information with generic substitute words;
the exclusive information includes personal names, place names and the like;
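The replacement step can be illustrated with a minimal Python sketch. The term table, the placeholder strings and the function name `replace_exclusive_info` are all hypothetical; the text does not say how exclusive information is detected, and a practical system would more likely rely on a named-entity recognizer than on a fixed lookup table.

```python
import re

# Hypothetical table of exclusive (identifying) terms and their generic
# substitutes. This fixed lookup is an assumption made for illustration.
EXCLUSIVE_TERMS = {
    "Zhang Wei": "[PERSON]",
    "Beijing": "[PLACE]",
}


def replace_exclusive_info(text, terms=EXCLUSIVE_TERMS):
    """Replace every listed exclusive term with its generic substitute."""
    if not terms:
        return text
    # Try longer terms first so that overlapping terms match greedily.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    return pattern.sub(lambda m: terms[m.group(0)], text)


redacted = replace_exclusive_info("Zhang Wei asked me to wire money to Beijing.")
```

Keeping the substitutes generic preserves the sentence structure of the case text, which is what the later structural analysis operates on.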
the case classification unit comprises a key identification processor and a classification coding processor, wherein the key identification processor is used for identifying key information in case information, and the classification coding processor is used for classifying and coding cases according to the identified key information;
the key information includes, but is not limited to, the fraud manner, the fraud content and the fraud means;
the fraud manner refers to the identity relationship established between the fraudster and the victim, the fraud content refers to the chat content between the fraudster and the victim, and the fraud means refers to the way in which the fraudster obtains a profit;
the text extraction unit comprises an identity locking processor and a text information register, wherein the identity locking processor is used for determining the identification information of the fraudster in each case, and the text information register is used for extracting and storing the text information sent under that identification information;
the text information in the text information register is stored in partitions according to the classification of the cases;
referring to fig. 4, the feature analysis unit includes a semantic analysis processor, a structure analysis processor and a feature generation processor, where the semantic analysis processor is used to read and understand text information, the structure analysis processor structurally encodes each paragraph based on semantic information, and the feature generation processor analyzes the structural encoding string to generate feature information;
the semantic analysis processor and the structure analysis processor each process a single case at a time;
the feature generation processor processes cases of the same type together;
the process of the semantic analysis processor for analyzing the text information comprises the following steps:
S1, identifying keywords in the text information;
S2, converting the keywords into semantic codes;
S3, segmenting the semantic codes based on coding integrity, where coding integrity means that each segment contains the necessary semantic code types;
the semantic information consists of a plurality of segments of semantic codes arranged in sequence;
in step S2, similar keywords are converted into the same semantic code;
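Steps S1 to S3 can be sketched in Python as follows. The keyword table, the code identifiers (`C01` and so on), the type names and the set of required types are all assumptions made for illustration; the text does not specify how keywords are recognized or which types make a segment complete.

```python
# Hypothetical keyword table: keyword -> (semantic code, code type).
# Similar keywords ("transfer", "remit") share one semantic code, as in S2.
KEYWORD_CODES = {
    "transfer": ("C01", "action"),
    "remit": ("C01", "action"),
    "account": ("C02", "object"),
    "card": ("C02", "object"),
    "urgent": ("C03", "modifier"),
}

# Assumed meaning of "coding integrity": a segment is complete once it
# contains at least one code of each required type.
REQUIRED_TYPES = {"action", "object"}


def to_semantic_codes(text):
    """S1 and S2: find known keywords in order and map them to codes."""
    codes = []
    for raw in text.lower().split():
        word = raw.strip(".,:;!?")
        if word in KEYWORD_CODES:
            codes.append(KEYWORD_CODES[word])
    return codes


def segment(codes):
    """S3: close a segment as soon as it holds every required type."""
    segments, current, seen = [], [], set()
    for code, kind in codes:
        current.append(code)
        seen.add(kind)
        if REQUIRED_TYPES <= seen:
            segments.append(current)
            current, seen = [], set()
    if current:  # a trailing incomplete segment is kept as it is
        segments.append(current)
    return segments


segments = segment(
    to_semantic_codes("Urgent: please transfer to this account. Remit via card!")
)
```

In this example "transfer" and "remit" collapse to the same code, so two differently worded requests yield segments with the same action and object codes.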
the parsing process of the structure parsing processor for each section of semantic coding sequence comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector $(Tp_1, Tp_2, \dots, Tp_n)$,
where $Tp_i$ represents the type value of the $i$-th semantic code and $n$ is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary unusual vector according to the positions of the common codes and of the unusual codes in the sequence; in the secondary common vector, the element corresponding to a common code is 1 and the element corresponding to an unusual code is 0, and in the secondary unusual vector, the element corresponding to a common code is 0 and the element corresponding to an unusual code is 1;
S23, calculating a structural coding value according to a formula (not reproduced in this text);
S24, assigning a corresponding structural code according to the range in which the structural coding value falls;
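The vector construction in S21 and S22 and the range lookup in S24 can be sketched as below. The formula of S23 is not available in this text, so `structural_value` is only a stand-in chosen for illustration, not the patent's formula. The type values, the set of common codes, the range bounds and the code labels are likewise hypothetical.

```python
# Hypothetical type values and the set of codes treated as "common".
TYPE_VALUE = {"C01": 1, "C02": 2, "C03": 3}
COMMON_CODES = {"C01", "C02"}


def build_vectors(segment):
    """S21 and S22: basic vector plus the two secondary 0/1 vectors."""
    base = [TYPE_VALUE[c] for c in segment]
    common = [1 if c in COMMON_CODES else 0 for c in segment]
    unusual = [1 - x for x in common]  # complement of the common vector
    return base, common, unusual


def structural_value(base, common, unusual, unusual_weight=2.0):
    """S23 stand-in (assumption): weight unusual positions more heavily."""
    return sum(
        t * (c + unusual_weight * u) for t, c, u in zip(base, common, unusual)
    )


def structural_code(value, bounds=(3.0, 6.0, 9.0), labels="ABCD"):
    """S24: map the value to a code by range; the bounds are assumed."""
    for bound, label in zip(bounds, labels):
        if value < bound:
            return label
    return labels[len(bounds)]


base, common, unusual = build_vectors(["C03", "C01", "C02"])
value = structural_value(base, common, unusual)
code = structural_code(value)
```

Because each segment reduces to a single structural code, a whole case becomes a string of such codes, which is the input to the feature generation processor.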
the process of generating feature information by the feature generation processor comprises the following steps:
S31, acquiring all structural coding string information of cases of the same type;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each item, sorting the counts from high to low to obtain a sequence $\{N(i)\}$, and at the same time establishing a lookup table of the statistical items, that is, a mapping table between the serial number of each statistical item and its pair of adjacent structural codes;
S33, retaining the statistical items whose sequence value is larger than the effective threshold, and calculating an evaluation value $V$ of each case's structural coding string according to a formula (not reproduced in this text),
where $m$ is the number of retained statistical items, $k_i$ is the weight coefficient of the $i$-th statistical item, and $E(i)$ indicates whether the $i$-th statistical item is present in the structural coding string: $E(i)=1$ when it is present and $E(i)=0$ when it is not;
the weight coefficients $\{k_i\}$ are all 1 in the initial state;
S34, calculating the standard deviation of the evaluation values $V$ of all cases;
S35, adjusting the weight coefficients $\{k_i\}$ and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients $\{k_i\}$, the early warning value and the statistical item information to the feature storage unit as feature information;
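A single pass through S31 to S34 can be sketched in Python as follows. The evaluation formula is not available in this text; the sketch assumes V is the sum over the m retained items of k_i * E(i), which is the simplest form consistent with the stated definitions of m, k_i and E(i). Single-character structural codes, the sample strings and the threshold are also assumptions. The weight adjustment of S35 is left out because the text does not say how the weights are adjusted.

```python
from collections import Counter
from statistics import pstdev


def adjacent_pairs(code_string):
    """All pairs of adjacent structural codes (one character per code)."""
    return [code_string[i:i + 2] for i in range(len(code_string) - 1)]


def retained_items(case_strings, effective_threshold):
    """S32 and S33: count pairs over all cases and keep the frequent ones."""
    counts = Counter(p for s in case_strings for p in adjacent_pairs(s))
    # most_common() returns items sorted by count, highest first.
    return [pair for pair, n in counts.most_common() if n > effective_threshold]


def evaluation_value(code_string, items, weights):
    """Assumed form of V: sum of k_i over the items present in the string."""
    present = set(adjacent_pairs(code_string))
    return sum(k for item, k in zip(items, weights) if item in present)


cases = ["ABCA", "ABCB", "DABD"]        # hypothetical same-type case strings
items = retained_items(cases, effective_threshold=1)
weights = [1.0] * len(items)            # all weights start at 1
values = [evaluation_value(s, items, weights) for s in cases]
spread = pstdev(values)                 # S34: standard deviation of V
warning_value = min(values)             # early warning value for these weights
```

With these inputs the third case lacks the pair `BC`, so its evaluation value is lower and the spread is non-zero; under S35 the weights would then be adjusted and the calculation repeated until the spread falls below the stability threshold.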
the information management unit comprises a time control processor and an update detection processor, wherein the time control processor is used for recording the non-update time of each piece of chat information and deleting the corresponding chat information when the non-update time exceeds a set value, and the update detection processor is used for sending backup information to the fraud early warning module as an early warning detection target when the chat information update is detected;
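The time control and update detection behaviour can be sketched as a small in-memory store. The class name `BackupStore`, its methods and the injectable clock are hypothetical; the text does not describe how backups are stored or how the non-update time is measured.

```python
import time


class BackupStore:
    """Holds backed-up chats and drops those idle longer than max_idle."""

    def __init__(self, max_idle, clock=time.monotonic):
        self.max_idle = max_idle  # allowed non-update time, in seconds
        self.clock = clock        # injectable so the sketch can be tested
        self.chats = {}           # chat_id -> (messages, last_update_time)

    def update(self, chat_id, message):
        """Record a new message and return the backup to be checked."""
        messages, _ = self.chats.get(chat_id, ([], None))
        messages = messages + [message]
        self.chats[chat_id] = (messages, self.clock())
        return messages  # would be sent to the fraud early warning module

    def purge(self):
        """Delete chats whose non-update time exceeds the set value."""
        now = self.clock()
        stale = [
            cid for cid, (_, last) in self.chats.items()
            if now - last > self.max_idle
        ]
        for cid in stale:
            del self.chats[cid]
        return stale


now = [0.0]  # fake clock so the example does not depend on real time
store = BackupStore(max_idle=60, clock=lambda: now[0])
store.update("chat-1", "hello")
now[0] = 30.0
store.update("chat-2", "hi")
now[0] = 80.0
removed = store.purge()
```

Here `chat-1` has been idle for 80 seconds and is removed, while `chat-2` has been idle for 50 seconds and is kept.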
the anomaly screening unit comprises a keyword register, a keyword comparison processor and a paragraph screening processor, wherein the keyword register is used for storing keyword information, the keyword comparison processor is used for comparing chat information with keywords, and the paragraph screening processor is used for selecting paragraph information of which the keyword content exceeds an anomaly threshold value and sending the paragraph information to the intelligent fraud identification unit;
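A minimal Python sketch of keyword-based paragraph screening follows. It reads "keyword content" as the proportion of words in a paragraph that are keywords, which is an interpretation rather than something the text states. The keyword set, the threshold of 0.2 and the function names are hypothetical.

```python
# Hypothetical contents of the keyword register.
KEYWORDS = {"transfer", "password", "verification", "prize"}


def keyword_ratio(paragraph, keywords=KEYWORDS):
    """Fraction of the words in a paragraph that are registered keywords."""
    words = [w.strip(".,:;!?").lower() for w in paragraph.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in keywords for w in words) / len(words)


def screen_paragraphs(paragraphs, threshold=0.2):
    """Keep the paragraphs whose keyword ratio exceeds the threshold."""
    return [p for p in paragraphs if keyword_ratio(p) > threshold]


chat = [
    "See you at dinner tonight.",
    "Send the verification code and password to claim your prize!",
]
flagged = screen_paragraphs(chat)
```

Only flagged paragraphs go on to the more expensive structural analysis, so this step acts as a cheap pre-filter.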
referring to fig. 5, the intelligent fraud recognition unit includes a transcoding processor, an identification control processor and a fraud calculation processor, wherein the transcoding processor is used for sending paragraph information to the feature analysis unit and obtaining structural code string information, the identification control processor is used for acquiring all types of feature information from the feature storage unit and sequentially sending the feature information to the fraud calculation processor, and the fraud calculation processor is used for calculating the structural code string information based on the feature information and feeding the structural code string information back to the identification control processor;
the fraud calculation processor calculates an evaluation value of the structural coding string, when the evaluation value is larger than the early warning value, the structural coding string is judged to be fraud, and the identification control processor judges a corresponding fraud type according to the type of the transmitted characteristic information;
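The decision rule can be sketched as follows. Several points are assumptions rather than statements of the source: the evaluation value is taken to be a weighted sum over the statistical items present in the string, structural codes are single characters, and the search stops at the first fraud type whose early warning value is exceeded. The `FeatureInfo` type and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeatureInfo:
    fraud_type: str
    items: list           # retained pairs of adjacent structural codes
    weights: list         # one weight coefficient per item
    warning_value: float  # early warning value for this fraud type


def evaluate(code_string, feature):
    """Assumed evaluation value: sum of weights of the items that occur."""
    present = set(zip(code_string, code_string[1:]))
    return sum(k for item, k in zip(feature.items, feature.weights)
               if item in present)


def identify_fraud(code_string, features):
    """Try each fraud type in turn; return the first one whose early
    warning value is exceeded, or None if no type matches."""
    for feature in features:
        if evaluate(code_string, feature) > feature.warning_value:
            return feature.fraud_type
    return None


features = [
    FeatureInfo("investment", [("A", "B"), ("B", "C")], [1.0, 1.0], 1.5),
    FeatureInfo("impersonation", [("C", "D")], [2.0], 1.0),
]
result = identify_fraud("XCDY", features)
```

The returned type is what the early warning feedback unit would use to pick matching cases from the data storage unit.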
the early warning feedback unit comprises a case screening processor and a case display processor, wherein the case screening processor is used for screening cases of the same fraud type from the data storage unit, and the case display processor is used for displaying case contents in a pop-up window on the terminal;
the index i appearing above is an ordinal used to denote a position in a sequence.
The foregoing disclosure is only a preferred embodiment of the present invention and is not intended to limit the scope of the invention, so that all equivalent technical changes made by applying the description of the present invention and the accompanying drawings are included in the scope of the present invention, and in addition, elements in the present invention can be updated as the technology develops.

Claims (5)

1. The text feature fraud early warning system based on artificial intelligence is characterized by comprising a case acquisition module, a text analysis module, an information interception module and a fraud early warning module;
the case acquisition module is used for acquiring real fraud case information, the text analysis module is used for extracting text information in the case and analyzing the text information to obtain text characteristics, the information interception module is used for acquiring chat information received by the terminal, and the fraud early warning module is used for carrying out fraud recognition and early warning on the chat information;
the case collection module comprises an information input unit, a privacy processing unit, a case classification unit and a data storage unit, wherein the information input unit is used for inputting case data, the privacy processing unit is used for protecting privacy information in the case data, the case classification unit is used for classifying cases based on fraud types, and the data storage unit is used for storing the case data according to classification results;
the text analysis module comprises a text extraction unit, a feature analysis unit and a feature storage unit, wherein the text extraction unit is used for extracting text information in each case, the feature analysis unit is used for analyzing the text information to obtain text features, and the feature storage unit is used for storing the text feature information obtained through analysis;
the information interception module comprises an application docking unit, an information backup unit and an information management unit, wherein the application docking unit is used for docking with communication software in the terminal, the information backup unit is used for backing up chat information received in the communication software, and the information management unit is used for managing the persistence and transfer of the backup information;
the fraud early warning module comprises an anomaly screening unit, an intelligent fraud identification unit and an early warning feedback unit, wherein the anomaly screening unit screens out anomaly paragraphs from the chat information based on keywords, the intelligent fraud identification unit is used for carrying out feature model processing on the anomaly paragraphs, and the early warning feedback unit screens out matched case information for early warning when fraud is identified.
2. The artificial intelligence based text feature fraud pre-warning system of claim 1, wherein the feature parsing unit includes a semantic parsing processor for reading and understanding the text information, a structural parsing processor for structurally encoding each paragraph based on the semantic information, and a feature generation processor for analyzing the structural encoding string to generate feature information.
3. The text feature fraud pre-warning system based on artificial intelligence as claimed in claim 2, wherein the parsing process of each section of the semantic code sequence by the structure parsing processor comprises the following steps:
S21, obtaining the type of each semantic code in the sequence to form a basic vector, wherein Tp_i represents the type value of the i-th semantic code and n is the number of semantic codes in the sequence;
S22, classifying the semantic codes into common codes and unusual codes based on commonness, and obtaining a secondary common vector and a secondary extraordinary vector according to the positions of the common codes and the positions of the unusual codes in the sequence;
S23, calculating a structural coding value according to the following formula:
S24, assigning the corresponding structural code according to the range in which the structural coding value falls.
4. The text feature fraud pre-warning system based on artificial intelligence as recited in claim 3, wherein said feature generation processor generates feature information comprising the steps of:
S31, acquiring all structural coding string information of the same case;
S32, taking each pair of adjacent structural codes as a statistical item, counting the occurrences of each statistical item, and sorting the counts from high to low to obtain a number sequence {N(i)};
S33, retaining the statistical items whose number sequence value is larger than the effective threshold, and calculating an evaluation value V of the structural coding string of each case according to the following formula:
where m is the number of retained statistical items, k_i is the weight coefficient of the i-th statistical item, and E(i) indicates whether the i-th statistical item exists in the structural coding string, with E(i) = 1 when it exists and E(i) = 0 when it does not;
S34, calculating the standard deviation of the evaluation values V of all cases;
S35, adjusting the weight coefficients {k_i} and repeating steps S33 to S34 until the standard deviation of the evaluation values is smaller than the stability threshold, and taking the minimum evaluation value at that point as the early warning value;
S36, sending the weight coefficients {k_i}, the early warning value and the statistical item information to the feature storage unit as feature information.
5. The text feature fraud pre-warning system based on artificial intelligence as recited in claim 4, wherein said intelligent fraud recognition unit includes a transcoding processor for transmitting paragraph information to said feature parsing unit and obtaining structural code string information, an identification control processor for acquiring all types of feature information from said feature storage unit and sequentially transmitting it to said fraud calculation processor, and a fraud calculation processor for evaluating the structural code string information based on the feature information and feeding the result back to said identification control processor.
CN202410294859.1A 2024-03-15 2024-03-15 Text feature fraud early warning system based on artificial intelligence Active CN117891926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410294859.1A CN117891926B (en) 2024-03-15 2024-03-15 Text feature fraud early warning system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410294859.1A CN117891926B (en) 2024-03-15 2024-03-15 Text feature fraud early warning system based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN117891926A true CN117891926A (en) 2024-04-16
CN117891926B CN117891926B (en) 2024-05-14

Family

ID=90647676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410294859.1A Active CN117891926B (en) 2024-03-15 2024-03-15 Text feature fraud early warning system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN117891926B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150288791A1 (en) * 2014-04-03 2015-10-08 Wavemarket, Inc. Telephone fraud management system and method
US10460320B1 (en) * 2016-08-10 2019-10-29 Electronic Arts Inc. Fraud detection in heterogeneous information networks
CN111669757A (en) * 2020-06-15 2020-09-15 国家计算机网络与信息安全管理中心 Terminal fraud call identification method based on conversation text word vector
CN111666765A (en) * 2020-06-02 2020-09-15 国家计算机网络与信息安全管理中心 Fraud topic analysis method and system based on k-means text clustering
CN113095858A (en) * 2021-05-07 2021-07-09 广州市刑事科学技术研究所 Method for identifying fraud-related short text

Also Published As

Publication number Publication date
CN117891926B (en) 2024-05-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant