WO2021088499A1 - Method and system for identifying false invoice issuance based on dynamic network representation - Google Patents

Publication number
WO2021088499A1
WO2021088499A1 PCT/CN2020/113450 CN2020113450W WO2021088499A1 WO 2021088499 A1 WO2021088499 A1 WO 2021088499A1 CN 2020113450 W CN2020113450 W CN 2020113450W WO 2021088499 A1 WO2021088499 A1 WO 2021088499A1
Authority
WO
WIPO (PCT)
Prior art keywords
enterprise
network
day
representation
characterization
Prior art date
Application number
PCT/CN2020/113450
Other languages
English (en)
Chinese (zh)
Inventor
郑庆华
董博
阮建飞
范弘铖
Original Assignee
西安交通大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 西安交通大学
Publication of WO2021088499A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Definitions

  • The invention belongs to the technical field of tax control, and particularly relates to a method and system for identifying false invoice issuance based on dynamic network representation.
  • False invoice issuance refers to enterprises issuing, by various means, invoices that are inconsistent with their actual business in order to evade tax.
  • Network representation technology provides a solution to this problem.
  • A method that identifies false invoice issuance based on network representation can organize isolated report information into an enterprise transaction network, so that all enterprises are examined systematically; it can also use the links between enterprises to obtain more enterprise information for identifying enterprises that issue false invoices.
  • The following patents provide reference methods that use network representation technology to identify false invoices automatically by computer:
  • Literature 1: A detection method for false VAT invoices based on parallel loop detection (201710147850.8);
  • Literature 2: A method for identifying suspicious taxpayers based on the taxpayer interest-related network (201410328391.X);
  • Literature 1 organizes invoice information into a static network with enterprises as nodes and improves loop detection in that network.
  • The improvement distributes the computing tasks across multiple computers in a cluster through distributed parallel computing to improve efficiency, and the improved loop detection method is then used to detect false VAT invoices.
  • Literature 2 identifies suspicious taxpayers based on the topological characteristics of the taxpayer interest-related network (TPIN): it analyzes these topological characteristics to obtain each taxpayer's representation in the network, and then applies a C4.5 classifier to identify suspicious taxpayers automatically.
  • Literature 1 can only detect false invoice issuance in which funds return to the source account after passing through multiple accounts, whereas false invoice issuance takes many forms and is not limited to loops. The types it can identify are too narrow, and the model generalizes poorly;
  • Literature 2 relies only on the topological structure of taxpayers and their interest relationships. It ignores the attribute information of enterprises and treats all enterprises as homogeneous, so it cannot analyze them in terms of enterprise scale, market share, and so on;
  • Literature 1 and Literature 2 are both limited to static networks. They cannot analyze changes in enterprise transactions dynamically in combination with historical information and cannot track those changes accurately, which leaves loopholes that some enterprises can exploit.
  • The purpose of the present invention is to provide a method and system for identifying false invoice issuance based on dynamic network representation.
  • The invention adopts dynamic network representation, analyzes the enterprise transaction network dynamically in combination with historical information, and accurately tracks the dynamic changes of enterprise transactions. It can identify different kinds of false invoice issuance based on the relations between enterprises. It also draws on distributed optimization algorithms to decompose the objective function into independent sub-functions that are executed in parallel, which improves the efficiency of identifying false invoices.
  • A method for identifying false invoice issuance based on dynamic network representation: first, the enterprise transaction information is organized into a static network with enterprises as nodes and transaction records as edges; second, a representation of the enterprise transaction network is established with each day as a time node, and a 30-day time-series window is established, in which 30 days of static network representations are merged each time, and the static network representations of all time nodes are gradually merged by moving the window to obtain the final dynamic network representation; third, drawing on distributed optimization algorithms, the objective function of the representation is decomposed into independent sub-functions that are optimized in parallel to improve the learning efficiency of the model; finally, a binary classifier is constructed based on LightGBM to identify enterprises suspected of issuing false invoices.
  • the method specifically includes the following implementation steps:
  • Step 1 Basic feature extraction
  • The data is preprocessed, and then the basic information of the enterprise is extracted. The basic information falls roughly into three types: text data is converted into vectors by the word2vec algorithm, categorical data is encoded with One-Hot, and numerical data is standardized;
  • Step 2 Feature extraction based on dynamic network representation
  • After the basic features of the enterprise are extracted, the enterprise transaction information is organized into a static network for each day (each day being a time node), with enterprises as nodes, the basic information of the enterprise as node attributes, transaction records as edges, and transaction information as edge attributes. A time-series window of 30 days is then established; 30 days of static network representations are merged within the window each time, and the static network representations at all times are gradually merged by moving the window, optimizing the objective function of the network representation to obtain the optimal dynamic representation of the enterprise transaction network;
  • Step 3 Optimization based on a distributed algorithm
  • Step 4 Build a classifier to identify false invoices
  • The implementation of step 1 is as follows:
  • Step 101 data preprocessing
  • Step 102 processing text data
  • the processing of text information in the enterprise basic information table includes:
  • Step 103 processing categorical data
  • One-Hot encoding is used for the discrete categorical data in the enterprise basic information table: a group of status bits whose length equals the number of attribute values is established, and each bit marks one specific state;
  • Step 104 processing numerical data
  • The implementation of step 2 is as follows:
  • Step 201 Establish a static corporate transaction network
  • A representation model of the enterprise transaction network is established for each day, so that enterprises with similar topological structures or higher transaction weights are closer in the representation space.
  • The objective optimization function is:
  • H_i and H_j are the representations of enterprises i and j;
  • w_ij is the weight of the transaction between enterprises i and j; minimizing the objective draws enterprises with a larger w_ij closer together.
  • Step 202 Dynamically integrate historical information
  • The weighting parameter defines how much the structural characteristics of the model contribute relative to the original matrix. The larger the parameter, the more the model attends to the time-series network representation; the smaller it is, the more it attends to the representation of the nodes.
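The formulas for steps 201 and 202 did not survive in this text. As a reading aid only, one objective that is consistent with the surrounding description (connected enterprises with large weights are pulled together, and a weighted term ties each day's representation to the previous day's) could be written as below. The squared Euclidean distances, the symbol \lambda for the weighting parameter, and the day index t are assumptions, not the patent's stated formula.

```latex
\min_{H^{(t)}} \;
\sum_{i,j} w_{ij}^{(t)} \,\bigl\lVert H_i^{(t)} - H_j^{(t)} \bigr\rVert_2^2
\;+\; \lambda \sum_{i} \bigl\lVert H_i^{(t)} - H_i^{(t-1)} \bigr\rVert_2^2
```

Under this assumed form, a larger \lambda keeps each day's representation closer to its history, which matches the statement that a larger parameter emphasizes the time-series representation.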
  • The implementation of step 3 is as follows:
  • Step 301 Decompose the objective function
  • Step 302 execute multiple sub-functions in parallel
  • Step 303 comprehensively sort the parallel results
  • The implementation of step 4 is as follows:
  • Step 401 Combine the basic features obtained in step 1 and the dynamic network features obtained in step 3 as the learning data of the classifier;
  • Step 402 Construct a binary classification model based on LightGBM, and set the main parameters of the classifier as follows: the number of leaves is 13, the learning rate is 0.1, and the number of iterations is 100;
  • Step 403 Take the representation results obtained from the sample set of enterprises marked as issuing false invoices and from the sample set of normal enterprises as basic features, and randomly divide them into a training set and a test set at a ratio of 3:1; then randomly divide the training set again to set aside a validation set. Use the training set to train the classification model of step 402 and use the validation set to tune the training. If over-fitting occurs, perform pruning; select the optimal model and verify the accuracy of the algorithm on the test set;
  • Step 404 Input the representation results of the unlabeled enterprise samples into the LightGBM-based prediction model for enterprises suspected of false invoice issuance, and, based on the output of the prediction model, determine whether the target enterprise has issued false invoices.
  • the present invention has the following beneficial effects:
  • the present invention is a method for identifying enterprises suspected of issuing false invoices based on the idea of dynamic network representation learning, and has the following advantages:
  • the calculation function is decomposed into independent sub-functions for parallel execution, which reduces the time complexity of computing network representation and improves the efficiency of identifying false invoices.
  • Figure 1 is the overall framework flow chart
  • Figure 2 is a schematic diagram of the basic feature extraction process
  • Figure 3 is a schematic diagram of a feature extraction process based on dynamic network representation
  • Figure 4 is a schematic diagram of the optimization process of the network characterization algorithm
  • Figure 5 is a schematic diagram of the process of constructing a classifier to identify false invoices
  • Fig. 6 is a schematic diagram of a system for identifying false invoice issuance based on dynamic network representation according to an embodiment of the present invention.
  • the method for identifying false invoices based on dynamic network representation includes the following steps:
  • The basic information of the enterprise falls roughly into three types: text data is converted into vectors by the word2vec algorithm, categorical data is encoded with One-Hot, and numerical data is standardized.
  • the basic feature extraction implementation process specifically includes the following steps:
  • Step 1 Extract the "Taxpayer Electronic File Number" as the unique identifier of the enterprise's features, and delete all other attributes that cannot describe the enterprise's own distribution pattern;
  • Step 2 When an attribute contains a large number of missing values and only a few valid values, the feature is deleted directly; for example, the attributes "taxpayer tax agency code", "financial report type" and "accounting form" have values for fewer than 10% of the enterprises. When an attribute has only a small number of missing values, for example the "employees" and "registered capital" attributes are missing for individual enterprises, the missing values are filled by same-class mean imputation.
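The missing-value rule in Step 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `clean_features`, the dict-per-enterprise layout, and the use of the overall column mean (rather than a mean computed within a class of similar enterprises) are all assumptions.

```python
from statistics import mean


def clean_features(rows, min_fill_ratio=0.10):
    """Drop sparse attributes, then mean-impute the remaining numeric gaps.

    rows: list of dicts mapping attribute name -> value (None when missing).
    min_fill_ratio: attributes valued in fewer than this share of rows are dropped.
    """
    n = len(rows)
    attributes = sorted({key for row in rows for key in row})
    kept = []
    for attr in attributes:
        present = [row.get(attr) for row in rows if row.get(attr) is not None]
        if len(present) / n >= min_fill_ratio:
            kept.append((attr, present))

    cleaned = [dict() for _ in rows]
    for attr, present in kept:
        numeric = all(isinstance(v, (int, float)) for v in present)
        # Assumption: the overall column mean stands in for the imputed value.
        fill = mean(present) if numeric else None
        for src, dst in zip(rows, cleaned):
            value = src.get(attr)
            dst[attr] = fill if value is None else value
    return cleaned


# Hypothetical data: "accounting form" is valued for 1 of 20 enterprises (5%),
# "employees" is missing for a single enterprise.
rows = [{"employees": 10 + i, "accounting form": None} for i in range(20)]
rows[3]["employees"] = None
rows[0]["accounting form"] = "A"
cleaned = clean_features(rows)
```

Here the sparse attribute is removed from every row, while the single missing "employees" value is replaced by the mean of the other nineteen.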
  • Step 1 Use the Jieba word segmentation tool to segment the text, construct a suitable stop-word list, and remove the stop words from the text.
  • For example, the content of the "business scope" field of an enterprise in this embodiment is "production, sales: ceramics and products; goods import and export, technology import and export".
  • After segmentation and stop-word removal, the result is "production, sales, ceramics and products, goods import and export, technology import and export";
  • Step 2 Use a dictionary tree (trie) to count the results of step 1, and select the words with larger weights as keywords;
  • Step 3 Convert the N keywords extracted in step 2 into vectors based on word2vec.
  • One-Hot encoding is used for the discrete categorical data "enterprise type" and "enterprise status" in the enterprise basic information table.
  • The number of possible values of the attribute is the length of the status bits; exactly one bit is set to 1 and the others are set to 0 to indicate a specific state.
  • For example, the "enterprise type" field has four possible values: "sole proprietorship", "partnership", "limited liability company" and "joint-stock limited company". Therefore, the length of the status bits of "enterprise type" is 4, where 1000 means "sole proprietorship", 0100 means "partnership", 0010 means "limited liability company", and 0001 means "joint-stock limited company".
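The status-bit encoding described above can be sketched in a few lines. The helper name `one_hot` and the category strings are illustrative labels chosen for this example; only the bit patterns follow the text.

```python
def one_hot(value, categories):
    """Return a status-bit string with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return "".join("1" if c == value else "0" for c in categories)


# Illustrative labels for the four enterprise types; the order fixes the bit positions.
ENTERPRISE_TYPES = [
    "sole proprietorship",
    "partnership",
    "limited liability company",
    "joint-stock limited company",
]

codes = {c: one_hot(c, ENTERPRISE_TYPES) for c in ENTERPRISE_TYPES}
```

The length of every code equals the number of categories, so adding a category lengthens all codes by one bit.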
  • Step 1 Obtain the mean value of the "registered capital" attribute;
  • n is the number of enterprise basic information samples;
  • x_j is the value of the "registered capital" attribute for the j-th sample;
  • Step 2 Obtain the variance of each attribute;
  • Let σ² be the variance of the "registered capital" attribute; its specific calculation form is:
  • The mean and variance are the basic indicators of a numerical attribute, and numerical attributes can be standardized using the mean and variance;
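The mean and variance formulas are not reproduced in this text. A minimal sketch of the standardization they feed into is given below, assuming the usual z-score (value minus mean, divided by standard deviation) and the population variance with divisor n; whether the patent divides by n or n - 1 is not stated here.

```python
from math import sqrt


def standardize(values):
    """Z-score a numeric attribute: (x - mean) / standard deviation."""
    n = len(values)
    mu = sum(values) / n
    # Assumption: population variance (divide by n).
    variance = sum((x - mu) ** 2 for x in values) / n
    sigma = sqrt(variance)
    if sigma == 0:
        # A constant attribute carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical "registered capital" values for five enterprises.
registered_capital = [100.0, 200.0, 300.0, 400.0, 500.0]
z = standardize(registered_capital)
```

After standardization the attribute has mean 0 and unit variance, which keeps large-magnitude attributes such as registered capital from dominating the feature vector.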
  • Step 1 Establish a static corporate transaction network
  • The representation h of each enterprise on a given day can be obtained, such that enterprises with similar transaction structures or larger transaction weights are closer in the representation space; the representation of the entire enterprise transaction network on that day is then obtained.
  • Step 2 Dynamically integrate historical information
  • The length of the time-series window is 30 days. Within the window, 30 days of static network representations are merged each time, and the window is then moved to gradually merge all static network representations so as to minimize the objective function.
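How a 30-day window moves across the daily snapshots can be illustrated with a small generator. The one-day step and the 1-indexed inclusive day ranges are assumptions; the text only fixes the window length.

```python
def sliding_windows(num_days, window=30, step=1):
    """Yield (start_day, end_day) pairs, 1-indexed and inclusive.

    Yields nothing when fewer than `window` days are available.
    """
    for start in range(1, num_days - window + 2, step):
        yield (start, start + window - 1)


# With 32 daily snapshots there are three window positions.
windows = list(sliding_windows(32))
```

Each pair identifies the daily static representations that would be merged together at that window position.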
  • the specific steps of the distributed algorithm optimization implementation process include:
  • the gradient descent algorithm is used to solve equation (4).
  • Updating stops when the representations before and after an update are approximately equal; the representation at that point is the representation of the enterprise transaction network on that day. Therefore, for a dynamic transaction network distributed over days 1 to T, the representation of the network can be obtained by calculating the days in order.
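Equation (4) is not reproduced in this text, so the sketch below runs gradient descent on an assumed stand-in objective: a weighted squared distance between the representations of trading enterprises plus a term, weighted by a hypothetical parameter `lam`, that ties each representation to the previous day's. The function names, learning rate and tolerance are illustrative; the cap of 100 iterations mirrors the iteration limit mentioned in this embodiment.

```python
def loss(H, H_prev, edges, lam):
    """Assumed objective: structural term plus lam times the temporal term."""
    structural = sum(
        w * sum((a - b) ** 2 for a, b in zip(H[i], H[j])) for i, j, w in edges
    )
    temporal = sum(
        sum((a - b) ** 2 for a, b in zip(h, hp)) for h, hp in zip(H, H_prev)
    )
    return structural + lam * temporal


def fit_day(H_prev, edges, lam=1.0, lr=0.05, tol=1e-6, max_iter=100):
    """Gradient descent for one day, started from the previous day's representation."""
    H = [row[:] for row in H_prev]
    n, dim = len(H), len(H[0])
    for _ in range(max_iter):
        # Gradient of the temporal term.
        grad = [[2 * lam * (H[i][k] - H_prev[i][k]) for k in range(dim)] for i in range(n)]
        # Gradient of the structural term; each undirected edge is listed once.
        for i, j, w in edges:
            for k in range(dim):
                d = 2 * w * (H[i][k] - H[j][k])
                grad[i][k] += d
                grad[j][k] -= d
        new_H = [[H[i][k] - lr * grad[i][k] for k in range(dim)] for i in range(n)]
        change = max(abs(new_H[i][k] - H[i][k]) for i in range(n) for k in range(dim))
        H = new_H
        if change < tol:  # successive representations are approximately equal
            break
    return H


# Three hypothetical enterprises; enterprises 0 and 1 trade with weight 2.0.
H_prev = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
edges = [(0, 1, 2.0)]
H_today = fit_day(H_prev, edges)
```

In this toy case the two trading enterprises move toward each other while the enterprise with no transactions keeps its previous representation. Repeating `fit_day` for days 1 to T, each time passing in the previous result, gives the day-by-day computation described above.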
  • The basic feature vector of the enterprise obtained in S101 is appended directly after the dynamic network feature vector obtained in S103, and the combined vector serves as the learning data of the classifier.
  • the main parameters for setting the classifier are: the number of leaves is 13, the learning rate is 0.1, and the number of iterations is 100;
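The stated settings could be expressed for the LightGBM Python package roughly as follows. Mapping "number of leaves" to `num_leaves` and "number of iterations" to `num_boost_round` is an assumption about how the parameters were set, and `train_classifier` is a hypothetical helper; all other LightGBM parameters are left at their defaults.

```python
# Parameters named in the text, mapped onto LightGBM's parameter names (assumed mapping).
LGBM_PARAMS = {
    "objective": "binary",   # binary classifier: false-invoice suspect or not
    "num_leaves": 13,
    "learning_rate": 0.1,
}
NUM_BOOST_ROUND = 100        # "number of iterations"


def train_classifier(features, labels):
    """Train a booster on a feature matrix and 0/1 labels (hypothetical helper)."""
    import lightgbm as lgb  # imported lazily so the settings can be inspected without it

    train_set = lgb.Dataset(features, label=labels)
    return lgb.train(LGBM_PARAMS, train_set, num_boost_round=NUM_BOOST_ROUND)
```

The booster returned by `lgb.train` exposes `predict`, which for a binary objective returns a probability per sample.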
  • Step 1 Take the characterization results obtained by the sample set of enterprises marked as false invoices and the sample set of normal enterprises as basic features, and randomly divide them into two groups as the training set and the test set at a ratio of 3:1.
  • Step 2 Randomly select 10% of the data in the training set as the validation set.
  • Step 3 Use the training set to train the classification model built by S502, use the validation set to adjust the training, and perform pruning when over-fitting occurs;
  • Step 4 Iterative calculation. Since the number of iterations is set to 100, if the convergence condition is not reached for 100 iterations, the iteration is forced to stop, and the result of the last iteration is the calculated representation.
  • Step 5 Select the optimal model to verify the accuracy of the algorithm in the test set.
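The data split in Steps 1 and 2 can be sketched with the standard library alone. The function name `split_samples` and the fixed seed are illustrative; the 3:1 train/test ratio and the 10% validation share come from the text.

```python
import random


def split_samples(samples, seed=0):
    """Shuffle, hold out one quarter as the test set, then 10% of the rest for validation."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)

    n_test = len(shuffled) // 4            # 3:1 ratio of training to test data
    test = shuffled[:n_test]
    train_full = shuffled[n_test:]

    n_val = len(train_full) // 10          # 10% of the training set for validation
    validation = train_full[:n_val]
    train = train_full[n_val:]
    return train, validation, test


# Hypothetical sample identifiers.
samples = list(range(1000))
train, validation, test = split_samples(samples)
```

With 1000 labeled samples this yields 675 training, 75 validation and 250 test samples, and no sample appears in more than one set.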
  • The accuracy verified in this embodiment is 0.957, the precision is 0.921, and the recall is 0.87, indicating that the model performs very well on the test set and can meet the requirements for identifying false invoice issuance in actual tax scenarios.
  • By comparison, the baseline method achieves an accuracy of 0.876, a precision of 0.856, and a recall of 0.794.
  • Relative to it, the method of the present invention improves accuracy by 9.25%, precision by 7.6%, and recall by 9.57%.
  • The running time of the distributed algorithm on the data sample in this embodiment is 684.57 s, which is 28.56% less than the 958.19 s running time of the non-distributed algorithm.
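The quoted percentages are consistent with relative changes rather than differences in percentage points. A quick check, assuming 0.876, 0.856 and 0.794 are the comparison method's figures and 958.19 s is the comparison running time:

```python
def relative_gain(new, old):
    """Percentage increase of `new` over `old`."""
    return (new - old) / old * 100.0


def relative_reduction(new, old):
    """Percentage decrease of `new` relative to `old`."""
    return (old - new) / old * 100.0


accuracy_gain = relative_gain(0.957, 0.876)         # quoted as 9.25%
precision_gain = relative_gain(0.921, 0.856)        # quoted as 7.6%
recall_gain = relative_gain(0.87, 0.794)            # quoted as 9.57%
runtime_cut = relative_reduction(684.57, 958.19)    # quoted as 28.56%
```

Each computed value agrees with the quoted figure to within rounding.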
  • Input the representation results of the unlabeled enterprise samples into the trained prediction model for enterprises suspected of false invoice issuance. Based on the output of the prediction model, determine whether the target enterprise has issued false invoices. In this embodiment, the predicted values are sorted from high to low, and the top ten percent are taken as enterprises suspected of issuing false invoices.
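Selecting the top ten percent of predicted values can be sketched as below. The function name `top_fraction`, the enterprise identifiers and the scores are hypothetical; rounding the cut-off down, with a minimum of one enterprise, is an assumption.

```python
def top_fraction(scores, fraction=0.10):
    """Return enterprise ids whose predicted value is in the top `fraction`, highest first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    k = max(1, int(len(ranked) * fraction))  # assumption: round down, keep at least one
    return [name for name, _ in ranked[:k]]


# Hypothetical predicted values for twenty enterprises.
scores = {f"E{i:02d}": i / 20 for i in range(20)}
suspects = top_fraction(scores)
```

With twenty enterprises the two highest-scoring ones are flagged.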
  • A system for identifying false invoice issuance based on dynamic network representation includes:
  • The enterprise attribute feature extraction module, which extracts the basic information of the enterprise after preprocessing the data. The basic information falls roughly into three types: text data is converted into vectors by the word2vec algorithm, categorical data is encoded with One-Hot, and numerical data is standardized;
  • The dynamic network representation building module, which processes the attribute features of the enterprise to obtain the static transaction network representation of the enterprise with each day as a time node, then establishes a 30-day time-series window, merges the static network representations within the window through a regularization term, and slides the window along the time series to gradually merge all static network representations into the dynamic network representation;
  • The dynamic network representation parallel optimization module, which decomposes the objective of the enterprise dynamic network representation into independent sub-objectives and optimizes the sub-objectives in parallel, improving the efficiency of dynamic network representation so that the final representation is obtained more efficiently;
  • The false invoice issuance identification module, which takes the obtained enterprise dynamic network representation as the features of false invoice issuance behavior, inputs it into the LightGBM-based binary classifier, and trains the identification model with the labeled enterprise sample set. The representation results of the enterprise sample set to be predicted are then input into the trained model, which outputs the enterprises suspected of issuing false invoices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Evolutionary Biology (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Technology Law (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method and system for identifying false invoice issuance based on dynamic network representation. The method comprises: first, organizing enterprise transaction information into a static network, taking an enterprise as a node and a transaction record as an edge; second, establishing a representation of the enterprise transaction network taking each day as a time node, establishing a time-series window with a length of 30 days, merging 30 days of static network representations within the window each time, and gradually merging the static network representations of all time nodes by moving the time-series window so as to obtain a final dynamic network representation result; third, by means of a distributed optimization algorithm, decomposing the objective function of the representation into independent sub-functions and optimizing the sub-functions in parallel to improve the learning efficiency of the model; and finally, constructing a binary classifier based on LightGBM to identify enterprises suspected of issuing false invoices. In the present invention, enterprises suspected of issuing false invoices are identified on the basis of dynamic network representation, which improves the efficiency and accuracy of false invoice identification.
PCT/CN2020/113450 2019-11-04 2020-09-04 Method and system for identifying false invoice issuance based on dynamic network representation WO2021088499A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911066791.7A CN110852856B (zh) 2019-11-04 2019-11-04 一种基于动态网络表征的发票虚开识别方法
CN201911066791.7 2019-11-04

Publications (1)

Publication Number Publication Date
WO2021088499A1 (fr) 2021-05-14

Family

ID=69598895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/113450 WO2021088499A1 (fr) 2019-11-04 2020-09-04 Method and system for identifying false invoice issuance based on dynamic network representation

Country Status (2)

Country Link
CN (1) CN110852856B (fr)
WO (1) WO2021088499A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326377A (zh) * 2021-06-02 2021-08-31 上海生腾数据科技有限公司 一种基于企业关联关系的人名消歧方法及系统
CN113642735A (zh) * 2021-07-28 2021-11-12 浪潮软件科技有限公司 虚开纳税人识别的持续学习方法
CN114219287A (zh) * 2021-12-15 2022-03-22 中国软件与技术服务股份有限公司 一种基于图神经网络的纳税人风险评测方法
CN114693309A (zh) * 2021-06-22 2022-07-01 山东浪潮爱购云链信息科技有限公司 一种虚假企业识别方法、设备及介质
CN115334005A (zh) * 2022-03-31 2022-11-11 北京邮电大学 基于剪枝卷积神经网络和机器学习的加密流量识别方法
CN117876140A (zh) * 2024-03-13 2024-04-12 杭州工猫科技有限公司 税务信息处理方法、系统与存储介质

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110852856B (zh) * 2019-11-04 2022-10-25 西安交通大学 一种基于动态网络表征的发票虚开识别方法
CN111382843B (zh) * 2020-03-06 2023-10-20 浙江网商银行股份有限公司 企业上下游关系识别模型建立、关系挖掘的方法及装置
CN111966889B (zh) * 2020-05-20 2023-04-28 清华大学深圳国际研究生院 一种图嵌入向量的生成方法以及推荐网络模型的生成方法
CN111724241B (zh) * 2020-06-05 2024-03-29 西安交通大学 基于动态边特征的图注意力网络的企业发票虚开检测方法
CN112215616B (zh) * 2020-11-30 2021-04-30 四川新网银行股份有限公司 一种基于网络的自动识别资金异常交易的方法和系统
CN114445210A (zh) * 2021-10-14 2022-05-06 中国工商银行股份有限公司 异常交易行为的检测方法及其检测装置、电子设备
CN114297319A (zh) * 2021-12-23 2022-04-08 税友信息技术有限公司 一种数据识别方法及相关装置
CN115346668A (zh) * 2022-07-29 2022-11-15 京东城市(北京)数字科技有限公司 健康风险等级评估模型的训练方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140108461A1 (en) * 2009-01-07 2014-04-17 Oracle International Corporation Generic Ontology Based Semantic Business Policy Engine
CN104103011A (zh) * 2014-07-10 2014-10-15 西安交通大学 一种基于纳税人利益关联网络的可疑纳税人识别方法
CN106920162A (zh) * 2017-03-14 2017-07-04 西京学院 一种基于并行环路检测的虚开增值税专用发票检测方法
CN109583978A (zh) * 2018-11-30 2019-04-05 税友软件集团股份有限公司 一种识别虚开发票企业的方法、装置及设备
CN110852856A (zh) * 2019-11-04 2020-02-28 西安交通大学 一种基于动态网络表征的发票虚开识别方法

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2679209C2 (ru) * 2014-12-15 2019-02-06 Общество с ограниченной ответственностью "Аби Продакшн" Обработка электронных документов для распознавания инвойсов
CN106780001A (zh) * 2016-12-26 2017-05-31 税友软件集团股份有限公司 一种发票虚开企业监控识别方法及系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU HONGCHAO; HE HUAN; ZHENG QINGHUA; DONG BO: "TaxVis: a Visual System for Detecting Tax Evasion Group", THE WORLD WIDE WEB CONFERENCE, ACM, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, USA, 13 May 2019 (2019-05-13) - 17 May 2019 (2019-05-17), pages 3610 - 3614, XP058471442, ISBN: 978-1-4503-6674-8, DOI: 10.1145/3308558.3314144 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326377A (zh) * 2021-06-02 2021-08-31 上海生腾数据科技有限公司 一种基于企业关联关系的人名消歧方法及系统
CN113326377B (zh) * 2021-06-02 2023-10-13 上海生腾数据科技有限公司 一种基于企业关联关系的人名消歧方法及系统
CN114693309A (zh) * 2021-06-22 2022-07-01 山东浪潮爱购云链信息科技有限公司 一种虚假企业识别方法、设备及介质
CN113642735A (zh) * 2021-07-28 2021-11-12 浪潮软件科技有限公司 虚开纳税人识别的持续学习方法
CN113642735B (zh) * 2021-07-28 2023-07-18 浪潮软件科技有限公司 虚开纳税人识别的持续学习方法
CN114219287A (zh) * 2021-12-15 2022-03-22 中国软件与技术服务股份有限公司 一种基于图神经网络的纳税人风险评测方法
CN115334005A (zh) * 2022-03-31 2022-11-11 北京邮电大学 基于剪枝卷积神经网络和机器学习的加密流量识别方法
CN115334005B (zh) * 2022-03-31 2024-03-22 北京邮电大学 基于剪枝卷积神经网络和机器学习的加密流量识别方法
CN117876140A (zh) * 2024-03-13 2024-04-12 杭州工猫科技有限公司 税务信息处理方法、系统与存储介质

Also Published As

Publication number Publication date
CN110852856B (zh) 2022-10-25
CN110852856A (zh) 2020-02-28

Similar Documents

Publication Publication Date Title
WO2021088499A1 (fr) Method and system for identifying false invoice issuance based on dynamic network representation
Zhao et al. Distributed feature selection for efficient economic big data analysis
CN110415111A (zh) 基于用户数据与专家特征合并逻辑回归信贷审批的方法
Mehta et al. Stock price prediction using machine learning and sentiment analysis
CN108734567A (zh) 一种基于大数据人工智能风控的资产管理系统及其评估方法
CN111783829A (zh) 一种基于多标签学习的财务异常检测方法及装置
CN104850868A (zh) 一种基于k-means和神经网络聚类的客户细分方法
CN110689437A (zh) 一种基于随机森林的通信施工项目财务风险预测方法
CN111754317A (zh) 一种金融投资数据测评方法及系统
CN111724241A (zh) 基于动态边特征增强的图注意力网络的企业发票虚开检测方法
CN111626331B (zh) 一种自动化行业分类装置及其工作方法
CN113590807A (zh) 一种基于大数据挖掘的科技企业信用评价方法
CN111625578A (zh) 适用于文化科技融合领域时间序列数据的特征提取方法
CN116542800A (zh) 基于云端ai技术的智能化财务报表分析系统
Ding et al. A novel hybrid method for oil price forecasting with ensemble thought
CN112329862A (zh) 基于决策树的反洗钱方法及系统
Zhang A model combining LightGBM and neural network for high-frequency realized volatility forecasting
Zhai et al. Big data analysis of accounting forecasting based on machine learning
Mao et al. Information system construction and research on preference of model by multi-class decision tree regression
Guo et al. Statistical decision research of long-term deposit subscription in banks based on decision tree
CN114187081A (zh) 估值表处理方法、装置、电子设备及计算机可读存储介质
CN111967937A (zh) 一种基于时间序列分析的电商推荐系统及实现方法
Shen et al. Stock trends prediction by hypergraph modeling
CN111191688A (zh) 一种用户分期期数管理方法、装置和电子设备
CN113962568B (zh) 基于支持向量机的模型标签标注方法、设备及介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 20884592; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 20884592; Country of ref document: EP; Kind code of ref document: A1