CN110019404B - System and method for determining tax-recommending classification code of commodity - Google Patents
System and method for determining a recommended tax classification code for a commodity
- Publication number: CN110019404B
- Application number: CN201711450703.4A
- Authority: CN (China)
- Prior art keywords: commodity, utilization rate, value, invoice data, classification code
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/12—Accounting
- G06Q40/123—Tax preparation or submission
Abstract
The invention provides a system for determining a recommended tax classification code for a commodity, comprising: an invoice data acquisition unit, which acquires taxpayer information and value-added tax invoice data; an invoice data cleaning unit, which preprocesses the value-added tax invoice data acquired by the invoice data acquisition unit and removes redundant data of no value from the invoice data; an invoice data analysis unit, which calculates, for each commodity in the invoice data, the usage rate of each classification code under which the commodity has been invoiced; an invoice model establishing unit, which corrects the usage rate of each commodity's classification codes according to the weight that the taxpayer contributes to that usage rate, and normalizes the corrected usage rates to establish a mathematical model; and a testing unit, which imports invoice data with known commodity classification codes into the established invoice model for verification, obtains the optimal weight values, and determines the recommended classification code for each commodity.
Description
Technical Field
The present invention relates to the field of tax control, and more particularly to a system and method for determining a recommended tax classification code for a commodity.
Background
According to the Announcement on Tax Collection and Administration Matters Concerning the Comprehensive Rollout of the Pilot Program for Replacing Business Tax with Value-Added Tax (State Administration of Taxation Announcement No. 23 of 2016), the State Administration of Taxation began piloting, in June 2016, functions in invoicing software for tax classification codes and for assigning those codes. Under this regulation, a taxpayer must assign a code to every commodity on an invoice before a value-added tax invoice can be issued normally. Coding commodities greatly increases the taxpayer's invoicing workload and reduces invoicing efficiency. As a result, the rollout has met considerable resistance over nearly a year, and taxpayers in many places have even refused to upgrade.
Under the previous workflow for issuing value-added tax invoices, the invoice issuer could issue an invoice directly in the invoicing interface, whether or not the commodities on it had been coded. Since the tax classification coding function was added, the issuer must first assign codes to the commodities on the commodity classification code settings interface, and only then can the invoice be issued. Assigning a code means choosing from among thousands of commodity classification codes. This increases the issuer's workload, degrades the taxpayer's user experience, and leads to inaccurate code assignment, which greatly reduces the accuracy of tax classification code data.
Disclosure of Invention
To solve the technical problems described in the background, namely the heavy workload and low accuracy of determining a commodity's classification code when a taxpayer issues an invoice, the invention provides a system for determining a recommended tax classification code for a commodity, comprising:
an invoice data acquisition unit for acquiring taxpayer information and value-added tax invoice data; in practice, because the State Administration of Taxation began piloting the tax classification coding and code assignment functions in invoicing software in June 2016, all data acquired by the invention is value-added tax invoice data issued after June 2016;
an invoice data cleaning unit for preprocessing the value-added tax invoice data acquired by the invoice data acquisition unit and removing redundant data of no value from the invoice data, which improves the efficiency of the subsequent invoice data analysis;
an invoice data analysis unit for calculating, for each commodity in the invoice data, the usage rate of each classification code under which the commodity has been invoiced;
an invoice model establishing unit for correcting the usage rate of each commodity's classification codes according to the weight that the taxpayer contributes to that usage rate, and normalizing the corrected usage rates to establish a mathematical model, wherein the weight is set to α when both the taxpayer's industry and the taxpayer's business scope match the commodity, to β when only one of the two matches the commodity, and to γ when neither matches the commodity;
and a testing unit for importing invoice data with known commodity classification codes into the established invoice model, testing it under different settings of α, β and γ, solving for the optimal values of the weights α, β and γ that express the influence of the taxpayer's industry and business scope on the usage rate, and calculating the usage rate of each tax classification code of each commodity from those optimal weights to determine the recommended classification code for each commodity.
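The weighting rule of the invoice model establishing unit can be sketched as a small lookup. This is only an illustration: the `TaxpayerMatch` type, its field names, and the `usage_weight` function are hypothetical, and the default weights are the preferred values 1, 0.5 and 0.2 that the text gives for α, β and γ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxpayerMatch:
    """Hypothetical record of how well a taxpayer fits a commodity."""
    industry_matches: bool  # the taxpayer's industry fits the commodity
    scope_matches: bool     # the taxpayer's business scope fits the commodity


def usage_weight(match, alpha=1.0, beta=0.5, gamma=0.2):
    """Return alpha if both criteria match, beta if exactly one, gamma if none."""
    hits = int(match.industry_matches) + int(match.scope_matches)
    if hits == 2:
        return alpha
    if hits == 1:
        return beta
    return gamma
```

How a taxpayer's industry or business scope is judged to match a commodity is not specified in the text, so the two booleans are taken as given here.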
Further, the data acquired by the invoice data acquisition unit comprises taxpayer information and value-added tax invoice data from Golden Tax Phase III, the invoicing software, and the invoice platform.
Further, the invoice data cleaning unit preprocesses the data by importing the invoice data acquired by the invoice data acquisition unit into a Hadoop data platform and removing redundant data from the invoice data with a Spark program.
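The text does not define what counts as redundant data, so the following is only a plain-Python stand-in for the cleaning step, not Spark code. It assumes that redundant rows are those with no commodity name or classification code, plus exact repeats of the same invoice line. The field names `invoice_id`, `commodity` and `code` are hypothetical.

```python
def clean_invoice_rows(rows):
    """Drop rows lacking a commodity name or code, and repeated invoice lines.

    Assumption: 'redundant' means unusable (blank fields) or duplicated rows.
    """
    seen = set()
    cleaned = []
    for row in rows:
        name = (row.get("commodity") or "").strip()
        code = (row.get("code") or "").strip()
        if not name or not code:
            continue  # unusable for computing usage rates
        key = (row.get("invoice_id"), name, code)
        if key in seen:
            continue  # same invoice line seen before
        seen.add(key)
        cleaned.append({**row, "commodity": name, "code": code})
    return cleaned
```

In a real deployment the same filters would run as a distributed job over the data platform; the per-row logic is what this sketch is meant to show.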
Further, the invoice data analysis unit calculates the usage rate of each classification code of each commodity by the formula:

$$P_i = \frac{A_i}{B}$$

where $P_i$ is the usage rate of the $i$-th classification code of the commodity, $A_i$ is the total number of times all taxpayers have invoiced the commodity under its $i$-th classification code, $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes, $1 \le i \le n$, and $n$ is a natural number.
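Reading the symbol definitions as the ratio of A_i to B, the usage rate calculation for one commodity can be sketched as follows. The function name and the dictionary layout (classification code mapped to invoice count) are assumptions made for the example.

```python
def usage_rates(code_counts):
    """Map each classification code to A_i / B for a single commodity.

    code_counts: {code: number of invoices issued under that code} (the A_i).
    B is taken as the sum of all A_i.
    """
    total = sum(code_counts.values())  # B
    if total == 0:
        return {}
    return {code: count / total for code, count in code_counts.items()}
```

Because B is the sum of the A_i, the resulting rates for one commodity add up to 1.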
Further, the invoice model establishing unit corrects the usage rate of each commodity's classification codes, according to the weight that the taxpayer contributes to that usage rate, by the formula:

$$P_i' = \frac{\alpha X_i + \beta Y_i + \gamma Z_i}{B}$$

where $P_i'$ is the corrected usage rate of the $i$-th classification code of the commodity, $X_i$ is the total number of times taxpayers with weight $\alpha$ have invoiced the commodity under its $i$-th classification code, $Y_i$ is the corresponding total for taxpayers with weight $\beta$, $Z_i$ is the corresponding total for taxpayers with weight $\gamma$, and $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes.
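Assuming the correction is the weighted sum of the three invoice counts divided by B, as the symbol definitions suggest, it can be sketched as a single function. The name `corrected_usage` and its argument order are illustrative; the defaults are the preferred weights given in the text.

```python
def corrected_usage(x, y, z, b, alpha=1.0, beta=0.5, gamma=0.2):
    """Corrected usage rate (alpha*X_i + beta*Y_i + gamma*Z_i) / B for one code.

    x, y, z: invoice counts from taxpayers weighted alpha, beta and gamma.
    b: total invoice count for the commodity across all of its codes.
    """
    if b <= 0:
        raise ValueError("b must be positive")
    return (alpha * x + beta * y + gamma * z) / b
```

With all three weights equal to 1 the function reduces to the uncorrected usage rate, since X_i + Y_i + Z_i = A_i.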
Further, the invoice model establishing unit normalizes the corrected usage rates of the classification codes to establish the mathematical model by the formula:

$$P_i'' = \frac{P_i'}{\sum_{j=1}^{n} P_j'}$$

where $P_i''$ is the normalized usage rate of the $i$-th classification code of the commodity, $\sum_{j=1}^{n} P_j'$ is the sum of the corrected usage rates of all classification codes of the commodity, $1 \le i \le n$, and $n$ is a natural number.
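The normalization step divides each corrected usage rate by the sum of the corrected rates for the same commodity, so that the results add up to 1 again. A minimal sketch, with a hypothetical function name and input layout:

```python
def normalize(corrected):
    """Divide each corrected usage rate by the sum over all codes of a commodity.

    corrected: {code: corrected usage rate P_i'}.
    """
    total = sum(corrected.values())
    if total == 0:
        # Every weighted count was zero; nothing to rank.
        return {code: 0.0 for code in corrected}
    return {code: value / total for code, value in corrected.items()}
```

Normalization does not change the ranking of the codes; it only rescales the values so they are comparable across commodities.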
Further, the system also comprises a commodity tax classification code recommending unit, which sorts the normalized usage rates of the different classification codes of each commodity in descending order and returns the 1 to 3 tax classification codes with the highest usage rates to the invoice issuer's client as the recommended tax classification codes.
In this application, a tax classification code recommendation interface is designed for the invoicing software. When the invoice issuer enters a commodity name at the client, the client sends a tax classification code recommendation request to the system, and the commodity tax classification code recommending unit returns the tax classification code determined from the data model's results to the client. The back end thus automatically matches and recommends a tax classification code as the issuer fills in the commodity name on the invoice, which makes invoicing smoother and the tax classification codes more accurate.
Further, in the invoice model establishing unit, α is 1, β is 0.5, and γ is 0.2.
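The recommending unit's ranking can be sketched as a sort followed by a cut at 1 to 3 entries. The function name is hypothetical, and breaking ties by code string is an assumption added only to make the output deterministic.

```python
def recommend(normalized, k=3):
    """Return the k codes with the highest normalized usage rate (1 <= k <= 3)."""
    if not 1 <= k <= 3:
        raise ValueError("k must be between 1 and 3")
    # Sort by descending rate; ties are broken by code (an assumption).
    ranked = sorted(normalized.items(), key=lambda item: (-item[1], item[0]))
    return [code for code, _ in ranked[:k]]
```

A client would typically show the first returned code as the default choice and the others as alternatives.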
According to another aspect of the present invention, there is provided a method for determining a recommended tax classification code for a commodity, the method comprising:
collecting taxpayer information and value-added tax invoice data;
preprocessing the collected value-added tax invoice data and removing redundant data of no value from the invoice data;
for each commodity in the invoice data after the redundant data has been removed, calculating the usage rate of each classification code under which the commodity has been invoiced;
correcting the usage rate of each commodity's classification codes according to the weight that the taxpayer contributes to that usage rate, and normalizing the corrected usage rates to establish a mathematical model, wherein the weight is set to α when both the taxpayer's industry and the taxpayer's business scope match the commodity, to β when only one of the two matches the commodity, and to γ when neither matches the commodity;
and importing invoice data with known commodity classification codes into the established invoice model, testing it under different settings of α, β and γ, solving for the optimal values of the weights α, β and γ that express the influence of the taxpayer's industry and business scope on the usage rate, and calculating the usage rate of each tax classification code of each commodity from those optimal weights to determine the recommended classification code for each commodity.
Further, the collected value-added tax invoice data is preprocessed by importing the invoice data acquired by the invoice data acquisition unit into a Hadoop data platform and removing redundant data from the invoice data with a Spark program.
Further, the usage rate of each classification code of each commodity is calculated by the formula:

$$P_i = \frac{A_i}{B}$$

where $P_i$ is the usage rate of the $i$-th classification code of the commodity, $A_i$ is the total number of times all taxpayers have invoiced the commodity under its $i$-th classification code, $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes, $1 \le i \le n$, and $n$ is a natural number.
Further, the usage rate of each commodity's classification codes is corrected, according to the weight that the taxpayer contributes to that usage rate, by the formula:

$$P_i' = \frac{\alpha X_i + \beta Y_i + \gamma Z_i}{B}$$

where $P_i'$ is the corrected usage rate of the $i$-th classification code of the commodity, $X_i$ is the total number of times taxpayers with weight $\alpha$ have invoiced the commodity under its $i$-th classification code, $Y_i$ is the corresponding total for taxpayers with weight $\beta$, $Z_i$ is the corresponding total for taxpayers with weight $\gamma$, and $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes.
Further, the corrected usage rates of the classification codes are normalized to establish the mathematical model by the formula:

$$P_i'' = \frac{P_i'}{\sum_{j=1}^{n} P_j'}$$

where $P_i''$ is the normalized usage rate of the $i$-th classification code of the commodity, $\sum_{j=1}^{n} P_j'$ is the sum of the corrected usage rates of all classification codes of the commodity, $1 \le i \le n$, and $n$ is a natural number.
Further, the method also comprises sorting the normalized usage rates of the different classification codes of each commodity in descending order and returning the 1 to 3 tax classification codes with the highest usage rates to the invoice issuer's client as the recommended tax classification codes.
Further, α is 1, β is 0.5, and γ is 0.2.
In conclusion, the invention provides a model for determining the recommended tax classification code for a commodity. The model is improved continuously by importing known invoice data for learning, which determines the optimal values of the three weights applied to the usage rates of commodity tax classification codes. The commodity tax classification code recommending unit then recommends tax classification codes automatically, which improves both the accuracy and the efficiency of filling in tax classification codes.
Drawings
A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:
FIG. 1 is a block diagram of a system for determining a recommended tax classification code for a good in accordance with an embodiment of the present invention;
FIG. 2 is a flowchart of a method for determining a recommended tax classification code for a good according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure is thorough and complete and fully conveys the scope of the invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
FIG. 1 is a block diagram of a system for determining a recommended tax classification code for a good according to an embodiment of the present invention. As shown in FIG. 1, the system 100 for determining a recommended tax classification code of a commodity according to the present invention comprises:
an invoice data acquisition unit 101 for acquiring taxpayer information and value-added tax invoice data;
an invoice data cleaning unit 102 for preprocessing the value-added tax invoice data acquired by the invoice data acquisition unit and removing redundant data of no value from the invoice data, which improves the efficiency of the subsequent invoice data analysis;
an invoice data analysis unit 103 for calculating, for each commodity in the invoice data, the usage rate of each classification code under which the commodity has been invoiced;
an invoice model establishing unit 104 for correcting the usage rate of each commodity's classification codes according to the weight that the taxpayer contributes to that usage rate, and normalizing the corrected usage rates to establish a mathematical model, wherein the weight is set to α when both the taxpayer's industry and the taxpayer's business scope match the commodity, to β when only one of the two matches the commodity, and to γ when neither matches the commodity;
and a testing unit 105 for importing invoice data with known commodity classification codes into the established invoice model, testing it under different settings of α, β and γ, solving for the optimal values of the weights α, β and γ that express the influence of the taxpayer's industry and business scope on the usage rate, and calculating the usage rate of each tax classification code of each commodity from those optimal weights to determine the recommended classification code for each commodity.
The testing unit can refine the values of α, β and γ further as more known invoice data is added for testing, which improves the accuracy of the recommended classification code determined for each commodity.
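The text does not say how the testing unit searches for the optimal weights. One plausible reading is a grid search that scores each candidate triple by how often the top-ranked code equals the known code. The sketch below takes that approach as an assumption. The function names, the data layout, and the restriction to α ≥ β ≥ γ are also assumptions.

```python
from itertools import product


def predict(counts, alpha, beta, gamma):
    """Pick the code with the highest weighted count for one commodity.

    counts: {code: (X_i, Y_i, Z_i)}. Dividing by B and normalizing would not
    change the ranking, so both steps are skipped here.
    """
    def score(code):
        x, y, z = counts[code]
        return alpha * x + beta * y + gamma * z
    return max(sorted(counts), key=score)  # sorted() makes ties deterministic


def grid_search(labelled, grid):
    """Return the (alpha, beta, gamma) triple with the best top-1 accuracy.

    labelled: list of (counts, known_code) pairs with known classification codes.
    """
    best, best_accuracy = None, -1.0
    for alpha, beta, gamma in product(grid, repeat=3):
        if not alpha >= beta >= gamma:
            continue  # assumption: a better match never weighs less
        hits = sum(predict(c, alpha, beta, gamma) == truth for c, truth in labelled)
        accuracy = hits / len(labelled)
        if accuracy > best_accuracy:
            best, best_accuracy = (alpha, beta, gamma), accuracy
    return best, best_accuracy
```

Adding more labelled invoice data simply extends the `labelled` list, and rerunning the search can then shift the selected weights.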
Preferably, the data acquired by the invoice data acquisition unit 101 comprises taxpayer information and value-added tax invoice data from Golden Tax Phase III, the invoicing software, and the invoice platform.
Preferably, the invoice data cleaning unit 102 preprocesses the data by importing the invoice data acquired by the invoice data acquisition unit 101 into a Hadoop data platform and removing redundant data from the invoice data with a Spark program.
Preferably, the invoice data analysis unit 103 calculates the usage rate of each classification code of each commodity by the formula:

$$P_i = \frac{A_i}{B}$$

where $P_i$ is the usage rate of the $i$-th classification code of the commodity, $A_i$ is the total number of times all taxpayers have invoiced the commodity under its $i$-th classification code, $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes, $1 \le i \le n$, and $n$ is a natural number.
Preferably, the invoice model establishing unit 104 corrects the usage rate of each commodity's classification codes, according to the weight that the taxpayer contributes to that usage rate, by the formula:

$$P_i' = \frac{\alpha X_i + \beta Y_i + \gamma Z_i}{B}$$

where $P_i'$ is the corrected usage rate of the $i$-th classification code of the commodity, $X_i$ is the total number of times taxpayers with weight $\alpha$ have invoiced the commodity under its $i$-th classification code, $Y_i$ is the corresponding total for taxpayers with weight $\beta$, $Z_i$ is the corresponding total for taxpayers with weight $\gamma$, and $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes.
Preferably, the invoice model establishing unit 104 normalizes the corrected usage rates of the classification codes to establish the mathematical model by the formula:

$$P_i'' = \frac{P_i'}{\sum_{j=1}^{n} P_j'}$$

where $P_i''$ is the normalized usage rate of the $i$-th classification code of the commodity, $\sum_{j=1}^{n} P_j'$ is the sum of the corrected usage rates of all classification codes of the commodity, $1 \le i \le n$, and $n$ is a natural number.
Preferably, the system further comprises a commodity tax classification code recommending unit 106, which sorts the normalized usage rates of the different classification codes of each commodity in descending order and returns the 1 to 3 tax classification codes with the highest usage rates to the invoice issuer's client as the recommended tax classification codes.
Preferably, in the invoice model establishing unit 104, α is 1, β is 0.5, and γ is 0.2.
FIG. 2 is a flowchart of a method for determining a recommended tax classification code for a good according to an embodiment of the present invention. As shown in FIG. 2, the method 200 for determining a recommended tax classification code for an item of merchandise according to the present invention begins at step 201.
In step 201, taxpayer information and value-added tax invoice data are collected;
in step 202, the collected value-added tax invoice data is preprocessed and redundant data of no value is removed from the invoice data;
in step 203, for each commodity in the invoice data after the redundant data has been removed, the usage rate of each classification code under which the commodity has been invoiced is calculated;
in step 204, the usage rate of each commodity's classification codes is corrected according to the weight that the taxpayer contributes to that usage rate, and the corrected usage rates are normalized to establish a mathematical model, wherein the weight is set to α when both the taxpayer's industry and the taxpayer's business scope match the commodity, to β when only one of the two matches the commodity, and to γ when neither matches the commodity;
in step 205, invoice data with known commodity classification codes is imported into the established invoice model, the model is tested under different settings of α, β and γ, the optimal values of the weights α, β and γ that express the influence of the taxpayer's industry and business scope on the usage rate are solved for, and the usage rate of each tax classification code of each commodity is calculated from those optimal weights to determine the recommended classification code for each commodity.
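Steps 203 to 205 can be strung together in one pass over cleaned invoice records. The sketch below is an illustration under stated assumptions: each record is a hypothetical tuple `(commodity, code, industry_ok, scope_ok)`, the weights are already fixed, and the corrected rate is taken as the weighted count divided by B. The function name and the placeholder codes in the usage note are invented.

```python
from collections import defaultdict


def recommend_codes(records, alpha=1.0, beta=0.5, gamma=0.2, k=3):
    """Return {commodity: up to k codes ranked by normalized corrected usage}."""
    weight_for = {2: alpha, 1: beta, 0: gamma}
    weighted = defaultdict(lambda: defaultdict(float))  # commodity -> code -> sum
    totals = defaultdict(int)                            # commodity -> B
    for commodity, code, industry_ok, scope_ok in records:
        weighted[commodity][code] += weight_for[int(industry_ok) + int(scope_ok)]
        totals[commodity] += 1

    result = {}
    for commodity, per_code in weighted.items():
        b = totals[commodity]
        corrected = {code: w / b for code, w in per_code.items()}   # P_i'
        norm = sum(corrected.values())
        if norm > 0:
            corrected = {c: v / norm for c, v in corrected.items()}  # P_i''
        ranked = sorted(corrected, key=lambda c: (-corrected[c], c))
        result[commodity] = ranked[:k]
    return result
```

For example, two invoices under `CODE_A` from well-matched taxpayers can outrank three invoices under `CODE_B` from taxpayers that match the commodity on neither criterion.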
Preferably, the collected value-added tax invoice data is preprocessed by importing the invoice data acquired by the invoice data acquisition unit into a Hadoop data platform and removing redundant data from the invoice data with a Spark program.
Preferably, the usage rate of each classification code of each commodity is calculated by the formula:

$$P_i = \frac{A_i}{B}$$

where $P_i$ is the usage rate of the $i$-th classification code of the commodity, $A_i$ is the total number of times all taxpayers have invoiced the commodity under its $i$-th classification code, $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes, $1 \le i \le n$, and $n$ is a natural number.
Preferably, the usage rate of each commodity's classification codes is corrected, according to the weight that the taxpayer contributes to that usage rate, by the formula:

$$P_i' = \frac{\alpha X_i + \beta Y_i + \gamma Z_i}{B}$$

where $P_i'$ is the corrected usage rate of the $i$-th classification code of the commodity, $X_i$ is the total number of times taxpayers with weight $\alpha$ have invoiced the commodity under its $i$-th classification code, $Y_i$ is the corresponding total for taxpayers with weight $\beta$, $Z_i$ is the corresponding total for taxpayers with weight $\gamma$, and $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes.
Preferably, the corrected usage rates of the classification codes are normalized to establish the mathematical model by the formula:

$$P_i'' = \frac{P_i'}{\sum_{j=1}^{n} P_j'}$$

where $P_i''$ is the normalized usage rate of the $i$-th classification code of the commodity, $\sum_{j=1}^{n} P_j'$ is the sum of the corrected usage rates of all classification codes of the commodity, $1 \le i \le n$, and $n$ is a natural number.
Preferably, the method further comprises step 206, in which the normalized usage rates of the different classification codes of each commodity are sorted in descending order and the 1 to 3 tax classification codes with the highest usage rates are returned to the invoice issuer's client as the recommended tax classification codes.
Preferably, the value of α is 1, the value of β is 0.5 and the value of γ is 0.2.
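As a worked illustration with invented counts (not data from the source), consider one commodity with two classification codes and B = 10 invoices. Code 1 has counts (X, Y, Z) = (4, 2, 0) and code 2 has (0, 1, 3). The calculation assumes the corrected usage rate is the weighted count divided by B, and uses the preferred weights α = 1, β = 0.5, γ = 0.2.

```latex
\begin{aligned}
P_1 &= \tfrac{6}{10} = 0.6, & P_2 &= \tfrac{4}{10} = 0.4 \\
P_1' &= \frac{1\cdot 4 + 0.5\cdot 2 + 0.2\cdot 0}{10} = 0.5, &
P_2' &= \frac{1\cdot 0 + 0.5\cdot 1 + 0.2\cdot 3}{10} = 0.11 \\
P_1'' &= \frac{0.5}{0.5 + 0.11} \approx 0.820, &
P_2'' &= \frac{0.11}{0.5 + 0.11} \approx 0.180
\end{aligned}
```

The weighting widens the gap between the two codes from 0.6 versus 0.4 to roughly 0.82 versus 0.18, because most invoices under code 2 come from taxpayers whose industry and business scope do not match the commodity.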
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [means, component, etc.]" are to be interpreted openly as referring to at least one instance of said means, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
Claims (9)
1. A system for determining a recommended tax classification code for a commodity, the system comprising:
an invoice data acquisition unit for acquiring taxpayer information and value-added tax invoice data;
an invoice data cleaning unit for preprocessing the value-added tax invoice data acquired by the invoice data acquisition unit and removing redundant data of no value from the invoice data;
an invoice data analysis unit for calculating, for each commodity in the invoice data, the usage rate of each classification code under which the commodity has been invoiced, by the formula:

$$P_i = \frac{A_i}{B}$$

where $P_i$ is the usage rate of the $i$-th classification code of the commodity, $A_i$ is the total number of times all taxpayers have invoiced the commodity under its $i$-th classification code, $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes, $1 \le i \le n$, and $n$ is a natural number;
an invoice model establishing unit for correcting the usage rate of each commodity's classification codes according to the weight that the taxpayer contributes to that usage rate, and normalizing the corrected usage rates to establish a mathematical model, wherein the weight is set to α when both the taxpayer's industry and the taxpayer's business scope match the commodity, to β when only one of the two matches the commodity, and to γ when neither matches the commodity, the formula for correcting the usage rate of each commodity's classification codes and the formula of the mathematical model being respectively:

$$P_i' = \frac{\alpha X_i + \beta Y_i + \gamma Z_i}{B}$$

$$P_i'' = \frac{P_i'}{\sum_{j=1}^{n} P_j'}$$

where $P_i'$ is the corrected usage rate of the $i$-th classification code of the commodity, $X_i$ is the total number of times taxpayers with weight $\alpha$ have invoiced the commodity under its $i$-th classification code, $Y_i$ is the corresponding total for taxpayers with weight $\beta$, $Z_i$ is the corresponding total for taxpayers with weight $\gamma$, $B$ is the total number of times all taxpayers have invoiced the commodity under all of its classification codes, $P_i''$ is the normalized usage rate of the $i$-th classification code of the commodity, $\sum_{j=1}^{n} P_j'$ is the sum of the corrected usage rates of all classification codes of the commodity, $1 \le i \le n$, and $n$ is a natural number;
and a testing unit for importing invoice data with known commodity classification codes into the established invoice model, testing it under different settings of α, β and γ, solving for the optimal values of the weights α, β and γ that express the influence of the taxpayer's industry and business scope on the usage rate, and calculating the usage rate of each tax classification code of each commodity from those optimal weights to determine the recommended classification code for each commodity.
2. The system according to claim 1, wherein the data collected by the invoice data collection unit comprises tax payer information and value added tax invoice data of a gold tax three-phase, invoicing software and invoice platform.
3. The system according to claim 1, wherein the invoice data cleaning unit preprocesses the invoice data collected by the invoice data collection unit to be imported into a Hadoop data platform, and redundant data in the invoice data is cleaned by using a Spark program.
4. The system of claim 1, further comprising a commodity tax classification code recommending unit, configured to perform normalized usage ranking on different classification codes of each commodity, and feed back a tax classification code corresponding to a maximum value as a recommended tax classification code to the drawer client.
5. The system of claim 1, wherein, in the invoice model establishing unit, α has a value of 1, β has a value of 0.5, and γ has a value of 0.2.
6. A method of determining a recommended tax classification code for a commodity, the method comprising:
collecting taxpayer information and value-added tax invoice data;
preprocessing collected value-added tax invoice data, and cleaning redundant data without utilization value in the invoice data;
for each commodity in the invoice data after the redundant data is eliminated, calculating the utilization rate of each classification code under which the commodity has been invoiced, wherein the calculation formula is as follows:
$$P_i = \frac{A_i}{B}$$
wherein $P_i$ is the utilization rate of the i-th classification code of each commodity; $A_i$ is the total number of invoices issued by all taxpayers under the i-th classification code of the commodity; $B$ is the total number of invoices issued by all taxpayers under all classification codes of the commodity; $1 \le i \le n$, and n is a natural number;
correcting the utilization rate of the classification codes of each commodity according to the weight value that the taxpayer carries for the utilization rate of the classification codes of each commodity, and normalizing the corrected utilization rate of the classification codes to establish a mathematical model, wherein the weight value of the utilization rate is set to α when both the industry and the operating range of the taxpayer conform to the commodity, to β when only one of the industry and the operating range of the taxpayer conforms to the commodity, and to γ when neither the industry nor the operating range of the taxpayer conforms to the commodity, and the formula for correcting the utilization rate of the classification codes of each commodity and the formula of the mathematical model are respectively:
$$P_i' = \frac{\alpha X_i + \beta Y_i + \gamma Z_i}{B}$$
$$P_i'' = \frac{P_i'}{\sum_{j=1}^{n} P_j'}$$
wherein $P_i'$ is the corrected utilization rate of the i-th classification code of each commodity; $X_i$ is the total number of invoices issued under the i-th classification code of the commodity by taxpayers whose weight value is α; $Y_i$ is the total number of invoices issued under the i-th classification code of the commodity by taxpayers whose weight value is β; $Z_i$ is the total number of invoices issued under the i-th classification code of the commodity by taxpayers whose weight value is γ; $B$ is the total number of invoices issued by all taxpayers under all classification codes of the commodity; $P_i''$ is the normalized utilization rate of the i-th classification code of the commodity; $\sum_{j=1}^{n} P_j'$ is the sum of the corrected utilization rates of all classification codes of the commodity; $1 \le i \le n$, and n is a natural number;
and importing invoice data with known commodity classification codes into the established invoice model, testing the model with different settings of α, β and γ, determining the optimal values of the weight values α, β and γ that the taxpayer's industry and operating range apply to the utilization rate in the invoice model, and calculating the utilization rate of each tax classification code of each commodity based on the determined optimal weight values, so as to determine the recommended classification code of each commodity.
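One possible reading of the testing step is an exhaustive search over candidate weight triples, scored by how often the top-ranked code matches the known code. The following Python sketch assumes that strategy. The grid search, the constraint α ≥ β ≥ γ, the accuracy metric and all names are assumptions of this illustration; the claim only states that different values of α, β and γ are set and tested.

```python
from itertools import product


def top_code(counts, alpha, beta, gamma):
    """Return the code with the highest weighted count for one commodity.

    counts : {code: (X_i, Y_i, Z_i)} (hypothetical layout).
    Dividing by B and normalizing rescales all codes by the same factor,
    so neither step changes which code ranks first and both are skipped here.
    """
    scores = {code: alpha * x + beta * y + gamma * z for code, (x, y, z) in counts.items()}
    # Iterating over sorted codes makes tie-breaking deterministic.
    return max(sorted(scores), key=scores.get)


def search_weights(labelled, grid):
    """Try every (alpha, beta, gamma) from the grid and keep the most accurate.

    labelled : list of (counts, known_code) pairs built from invoices whose
               correct classification code is already known.
    Returns (accuracy, (alpha, beta, gamma)).
    """
    if not labelled:
        raise ValueError("labelled data must not be empty")
    best = None
    for alpha, beta, gamma in product(grid, repeat=3):
        if not (alpha >= beta >= gamma):  # assumed ordering of the weights
            continue
        hits = sum(top_code(c, alpha, beta, gamma) == known for c, known in labelled)
        accuracy = hits / len(labelled)
        if best is None or accuracy > best[0]:
            best = (accuracy, (alpha, beta, gamma))
    return best
```

In practice the labelled set would be split so that the chosen weights are checked on invoices that were not used to select them; the sketch omits that for brevity.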
7. The method as claimed in claim 6, wherein preprocessing the collected value-added tax invoice data comprises importing the collected invoice data into a Hadoop data platform, and cleaning redundant data from the invoice data by using a Spark program.
8. The method of claim 6, wherein the normalized utilization rates of the different tax classification codes of each commodity are ranked, and the tax classification code corresponding to the maximum value is the recommended tax classification code for the commodity.
9. The method of claim 6, wherein α has a value of 1, β has a value of 0.5, and γ has a value of 0.2.
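As a worked illustration of these weight values, consider a hypothetical commodity with two candidate codes, where code 1 has counts $(X_1, Y_1, Z_1) = (6, 2, 2)$ and code 2 has counts $(X_2, Y_2, Z_2) = (1, 4, 5)$, so that $B = 20$. The counts are invented for this example, and the corrected rate is assumed to be the weighted count divided by $B$, as the symbol definitions suggest.

```latex
\begin{aligned}
P_1 &= \tfrac{10}{20} = 0.5, \qquad P_2 = \tfrac{10}{20} = 0.5 \\
P_1' &= \frac{1 \cdot 6 + 0.5 \cdot 2 + 0.2 \cdot 2}{20} = \frac{7.4}{20} = 0.37 \\
P_2' &= \frac{1 \cdot 1 + 0.5 \cdot 4 + 0.2 \cdot 5}{20} = \frac{4}{20} = 0.20 \\
P_1'' &= \frac{0.37}{0.37 + 0.20} \approx 0.649, \qquad
P_2'' = \frac{0.20}{0.37 + 0.20} \approx 0.351
\end{aligned}
```

The uncorrected rates tie at 0.5, but after weighting, code 1 ranks first because more of its invoices come from taxpayers whose industry and operating range both conform to the commodity.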
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711450703.4A CN110019404B (en) | 2017-12-27 | 2017-12-27 | System and method for determining tax-recommending classification code of commodity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711450703.4A CN110019404B (en) | 2017-12-27 | 2017-12-27 | System and method for determining tax-recommending classification code of commodity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110019404A (en) | 2019-07-16 |
CN110019404B (en) | 2022-01-07 |
Family
ID=67187046
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711450703.4A Active CN110019404B (en) | 2017-12-27 | 2017-12-27 | System and method for determining tax-recommending classification code of commodity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019404B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377801A (en) * | 2019-07-24 | 2019-10-25 | 浙江诺诺网络科技有限公司 | Product name correction method, device and computer-readable storage medium |
CN110597995B (en) * | 2019-09-20 | 2022-03-11 | 税友软件集团股份有限公司 | Commodity name classification method, commodity name classification device, commodity name classification equipment and readable storage medium |
CN113052616A (en) * | 2021-03-15 | 2021-06-29 | 北京金和网络股份有限公司 | Cold chain product tracing method, device and system |
CN115809887B (en) * | 2022-12-09 | 2023-10-10 | 蔷薇大树科技有限公司 | Method and device for determining main business scope of enterprise based on invoice data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102004979A (en) * | 2009-09-03 | 2011-04-06 | 叶克 | System and method for providing commodity matching and promoting services |
CN104102833A (en) * | 2014-07-10 | 2014-10-15 | 西安交通大学 | Intensive interval discovery based tax index normalization and fusion calculation method |
CN105117426A (en) * | 2015-07-31 | 2015-12-02 | 重庆龙工场跨境电子商务投资有限公司 | Intelligent search system for HSCODE |
CN105631742A (en) * | 2015-12-24 | 2016-06-01 | 安徽融信金模信息技术有限公司 | Small and medium enterprise credit evaluation method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW200411457A (en) * | 2002-12-20 | 2004-07-01 | Hon Hai Prec Ind Co Ltd | Notes receivable management system and method |
Non-Patent Citations (1)
Title |
---|
《开票不难!商品及税收分类编码选择技巧》;吴海燕;《https://www.dongao.com/c/2017-12-13/829765.shtml》;20171213;第1-7页 * |
Also Published As
Publication number | Publication date |
---|---|
CN110019404A (en) | 2019-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110019404B (en) | System and method for determining tax-recommending classification code of commodity | |
CN104572449A (en) | Automatic test method based on case library | |
CN104036420A (en) | Method for batch checking, downloading and utilizing invoices based on national network invoice platform | |
CN109766384A (en) | The method and apparatus of automatic conversion data metering unit in a kind of visualization system | |
CN106251178A (en) | Data digging method and device | |
CN105975486A (en) | Information recommendation method and apparatus | |
CN110019798B (en) | Method and system for measuring commodity type difference of sale and sale items | |
CN114398560B (en) | Marketing interface setting method, device, equipment and medium based on WEB platform | |
CN114372731B (en) | Post target making method, device, equipment and storage medium based on big data | |
CN112307098A (en) | Cost consultation management method, system, electronic equipment and computer readable storage medium | |
CN110009796B (en) | Invoice category identification method and device, electronic equipment and readable storage medium | |
CN110032513B (en) | Data verification method and device and electronic equipment | |
CN112861500A (en) | Engineering pricing table generation method and device based on engineering quantity list | |
CN112052310A (en) | Information acquisition method, device, equipment and storage medium based on big data | |
CN114676931B (en) | Electric quantity prediction system based on data center technology | |
CN111460293B (en) | Information pushing method and device and computer readable storage medium | |
CN114781855A (en) | DEA model-based logistics transmission efficiency analysis method, device, equipment and medium | |
CN113487256A (en) | Purchase, sale and storage management method, device and equipment and storage medium | |
CN113592479A (en) | Charging method and device based on multi-stage increasing rate | |
CN113988800A (en) | Method and device for checking abnormal electric quantity user, computer equipment and storage medium | |
CN113743894A (en) | Method and system for establishing rechecking rule model for rechecking electric bill | |
CN111179046A (en) | Method and system for realizing automatic payment and posting of sales cost based on invoice data | |
CN109584029B (en) | Method, device, medium and electronic equipment for auditing electronic invoices | |
CN115578007A (en) | Method and system for integrating calculation of points and task in tax industry | |
CN113673891A (en) | Planning method and device for iterative delivery mode |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||