CN116956353B - Multi-channel data acquisition method and device based on digital economy - Google Patents

Multi-channel data acquisition method and device based on digital economy

Info

Publication number
CN116956353B
Authority
CN
China
Prior art keywords
information
data
product
value
digital
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311203721.8A
Other languages
Chinese (zh)
Other versions
CN116956353A (en
Inventor
丁新云
杨作铭
刘卫华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eden Information Service Ltd
Original Assignee
Eden Information Service Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eden Information Service Ltd filed Critical Eden Information Service Ltd
Priority to CN202311203721.8A priority Critical patent/CN116956353B/en
Publication of CN116956353A publication Critical patent/CN116956353A/en
Application granted granted Critical
Publication of CN116956353B publication Critical patent/CN116956353B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/27Regression, e.g. linear or logistic regression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • G06Q30/0203Market surveys; Market polls
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of digital economy, and particularly discloses a multi-channel data acquisition method and device based on digital economy. Public information on a network is collected by a web crawler; the collected digital economic data information is screened; the classified digital economic data information is then subjected to privacy processing, during which it is desensitized; the desensitized information is checked for legality; and the legal data information is taken as the digital economic data information to be extracted. The acquired data thereby avoids user privacy, which in turn avoids the problem of infringing information acquisition.

Description

Multi-channel data acquisition method and device based on digital economy
Technical Field
The invention relates to the technical field of digital economy, in particular to a multichannel data acquisition method and device based on digital economy.
Background
The digital economy refers to economic activity in which goods and services are produced, distributed and consumed digitally on the basis of digital technology. It covers digitized business models, digitized production modes, digitized markets and digitized transaction processes. Its development depends on advanced information and communication technologies, including computers, the internet, mobile communications, big data, artificial intelligence and the internet of things. The rapid development and wide application of these technologies have changed traditional modes of economic organization and operating rules, and have promoted the digitization and intelligentization of economic activities.
In digital economic data acquisition, a web crawler is generally used to classify and extract information disclosed on a network. However, the web crawler cannot avoid user privacy while acquiring data, which gives rise to the problem of infringing information acquisition.
Disclosure of Invention
The invention aims to provide a multi-channel data acquisition method based on digital economy, which comprises the following steps:
acquiring digital economic data information captured by a web crawler, and screening the digital economic data information to obtain screening information, wherein the screening information comprises product information, market research information and market value information; inputting the digital economic data information into a classification model, and extracting product region range information through a support vector machine;
carrying out logistic regression processing on the product region range information, and extracting to obtain product subdivision information;
inputting the product subdivision information into a clustering model, and carrying out grouping extraction on the product subdivision information through K-means clustering to obtain grouping product information;
coding the grouping product information to obtain a product coding value;
judging whether the product coding value matches a preset characteristic value;
if they match, passing the product coding value to a mapping function, and generating product information through designated logic;
if they do not match, defining the product coding value as the data coding value corresponding to first data information to be determined;
carrying out information extraction on the captured digital economic data information according to the screening information to obtain private data information;
acquiring a data link corresponding to the private data information, and performing desensitization processing on the digital economic data information corresponding to the data link to obtain desensitized digital economic data information;
judging whether the desensitized digital economic data information is legal data or not;
if the desensitized digital economic data information is legal data, the desensitized digital economic data is used as digital economic data information;
if the desensitized digital economic data information is illegal data, deleting the desensitized digital economic data.
Preferably, after the step of defining the product coding value as the data coding value corresponding to the first data information to be determined, the method includes:
acquiring a preset data coding value corresponding to the digital economic data information;
inputting the data coding numerical value and the preset data coding numerical value into a market similarity model, and outputting a similarity value, wherein the function of the market similarity model is as follows:
wherein a_i is the data coding value and b_i is the preset data coding value;
judging whether the calculated similarity value is equal to a preset value;
and if the similarity value is equal to the preset value, marking the information corresponding to the data coding value as market research information.
Preferably, before the step of screening the digital economic data information, the method further comprises:
acquiring the disclosure time corresponding to the digital economic data information, and acquiring data structure information according to the disclosure time;
inputting the data structure information into a traceability model to obtain node data information;
verifying the node data information and the original log information to obtain original data storage information;
extracting the original data storage information to obtain an original data chain;
judging whether the digital economic data information is matched with the original data storage information or not;
and if so, screening the digital economic data information.
Preferably, the step of screening the digital economic data information to obtain screening information includes:
acquiring product unit price corresponding to the product information to be acquired;
acquiring current market sales corresponding to market research information to be acquired;
acquiring historical market sales corresponding to market research information to be acquired;
calculating a market growth estimated value according to the product unit price, the current market sales and the historical market sales, wherein the calculation formula is as follows:
$A = \frac{b - c}{c} \times e$
wherein A is the market growth estimated value, b is the historical market sales, c is the current market sales, and e is the product unit price.
Preferably, the step of extracting the information of the captured digital economic data information according to the screening information to obtain private data information includes:
acquiring personal identity information corresponding to the digital economic data information, and extracting the personal identity information to obtain an identity number;
obtaining a disturbance base number;
calculating a disturbance value corresponding to the desensitized digital economic data information according to the identity number and the disturbance base, wherein a calculation formula is as follows:
wherein Q is the disturbance value, n is the identity number, j is the disturbance base, and the remaining coefficient is a preset tolerable deviation coefficient;
and encrypting the desensitized digital economic data information according to the disturbance value to obtain private data information.
The application also provides a multi-channel data acquisition device based on digital economy, comprising:
the first acquisition module is used for acquiring digital economic data information captured by a web crawler, screening the digital economic data information to obtain screening information, wherein the screening information comprises product information, market research information and market value information, inputting the digital economic data information into a classification model, and extracting product region range information through a support vector machine;
the first extraction module is used for carrying out logistic regression processing on the product region range information and extracting to obtain product subdivision information;
the second extraction module is used for inputting the product subdivision information into a clustering model, and carrying out grouping extraction on the product subdivision information through K-means clustering to obtain grouping product information;
the first coding module is used for coding the grouping product information to obtain a product coding value;
the first judging module is used for judging whether the product coding value is matched with a preset characteristic value or not;
if they match, passing the product coding value to a mapping function, and generating product information through designated logic;
if they do not match, defining the product coding value as the data coding value corresponding to first data information to be determined;
the third extraction module is used for extracting the information of the captured digital economic data information according to the screening information to obtain private data information;
the second acquisition module is used for acquiring a data link corresponding to the private data information and performing desensitization processing on the digital economic data information corresponding to the data link to obtain desensitized digital economic data information;
the second judging module is used for judging whether the desensitized digital economic data information is legal data or not;
if the desensitized digital economic data information is legal data, the desensitized digital economic data is used as digital economic data information;
if the desensitized digital economic data information is illegal data, deleting the desensitized digital economic data.
Preferably, the first judging module includes:
the first acquisition unit is used for acquiring preset data coding values corresponding to the digital economic data information;
the first calculation unit is used for inputting the data coding numerical value and the preset data coding numerical value into a market similarity model and outputting a similarity value, wherein the function of the market similarity model is as follows:
wherein a_i is the data coding value and b_i is the preset data coding value;
a first judging unit for judging whether the calculated similarity value is equal to a preset value;
and if the similarity value is equal to the preset value, marking the information corresponding to the data coding value as market research information.
The present application also provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the above method when executing the computer program.
The present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method.
The beneficial effects of this application are as follows: public information on the network is collected by a web crawler; the collected digital economic data information is screened; the classified digital economic data information is subjected to privacy processing, during which it is desensitized; the desensitized information is checked for legality; and the legal data information is taken as the digital economic data information to be extracted. The acquired data thereby avoids user privacy, which in turn avoids the problem of infringing information acquisition.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of an apparatus structure according to an embodiment of the present application.
Fig. 3 is a schematic diagram of an internal structure of a computer device according to an embodiment of the present application.
The implementation, functional characteristics and advantages of the present application are further described below with reference to the embodiments and the attached drawings.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As shown in Figs. 1-3, the present application provides a multi-channel data acquisition method based on digital economy, comprising:
S1, acquiring digital economic data information captured by a web crawler, and screening the digital economic data information to obtain screening information, wherein the screening information comprises product information, market research information and market value information;
S2, carrying out information extraction on the captured digital economic data information according to the screening information to obtain private data information;
S3, acquiring a data link corresponding to the private data information, and performing desensitization processing on the digital economic data information corresponding to the data link to obtain desensitized digital economic data information;
S4, judging whether the desensitized digital economic data information is legal data;
S5, if the desensitized digital economic data information is legal data, taking the desensitized digital economic data as the digital economic data information;
S6, if the desensitized digital economic data information is illegal data, deleting the desensitized digital economic data.
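As described in the above steps S1-S6, in digital economic data acquisition a web crawler is generally used to classify and extract information disclosed on a network, but the crawler cannot avoid user privacy while acquiring data, which causes the problem of infringing information acquisition. In steps S1-S6, the captured information is therefore screened, the private data information is extracted and desensitized, and only desensitized information judged to be legal is retained as the digital economic data information, while illegal data is deleted.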
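The flow of steps S1-S6 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Record` type, the identity-number and phone-number patterns, the masking rule and the legality rule are all assumptions chosen only to make the control flow concrete.

```python
import re
from dataclasses import dataclass

@dataclass
class Record:
    """Hypothetical crawled record; the field names are assumptions."""
    url: str   # data link the record was crawled from
    text: str  # crawled content

# Assumed patterns: an 18-character identity number and an 11-digit phone number.
ID_PATTERN = re.compile(r"\b\d{17}[\dXx]\b")
PHONE_PATTERN = re.compile(r"\b1\d{10}\b")

def contains_private_data(record):
    """S2: flag records whose text carries an identity-number-like token."""
    return bool(ID_PATTERN.search(record.text))

def desensitize(record):
    """S3: mask all but the first 3 and last 2 characters of each match."""
    def mask(match):
        token = match.group(0)
        return token[:3] + "*" * (len(token) - 5) + token[-2:]
    return Record(record.url, ID_PATTERN.sub(mask, record.text))

def is_legal(record):
    """S4: assumed rule: no unmasked identity or phone number may remain."""
    return not (ID_PATTERN.search(record.text) or PHONE_PATTERN.search(record.text))

def acquire(records):
    """S1-S6: desensitize private records, keep legal ones, drop the rest."""
    kept = []
    for record in records:
        if contains_private_data(record):
            record = desensitize(record)
        if is_legal(record):
            kept.append(record)  # S5: retained as digital economic data
        # S6: illegal records are not kept, i.e. they are deleted
    return kept
```

In this sketch a record that still exposes a phone number after desensitization fails the legality check and is dropped, which mirrors the delete branch of S6.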
In one embodiment, the step of screening the digital economic data information to obtain screening information includes:
S7, screening the digital economic data information to obtain screening information, which comprises the following steps:
S8, inputting the digital economic data information into a classification model, and extracting product region range information through a support vector machine;
S9, carrying out logistic regression processing on the product region range information, and extracting to obtain product subdivision information;
S10, inputting the product subdivision information into a clustering model, and carrying out grouping extraction on the product subdivision information through K-means clustering to obtain grouping product information;
S11, coding the grouping product information to obtain a product coding value;
S12, judging whether the product coding value matches a preset characteristic value;
S13, if they match, passing the product coding value to a mapping function, and generating product information through designated logic;
S14, if they do not match, defining the product coding value as the data coding value corresponding to first data information to be determined.
As described in the above steps S7-S14, when the digital economic data information is screened, the digital economic data information acquired by the web crawler is input into a classification model. The classification model may be a logistic regression, a decision tree, a random forest, a support vector machine or a clustering model; here a support vector machine, a commonly used classification model, is selected for screening. The product region range information is extracted through the support vector machine, wherein the product information comprises a product name, a product description, a product specification, a product price, a product model, a product picture and the like. The support vector machine constrains the range of the digital economic data information, and the data within the constrained range is then subdivided by logistic regression to obtain the product subdivision information.
The product subdivision information is then input into a clustering model and divided into K clusters by K-means clustering, each cluster being represented by a centroid. The clustering algorithm iteratively optimizes the centroid positions so as to minimize the distance between the samples in a cluster and its centroid. For example, when grouping the product subdivision information, each product is regarded as the centroid of a cluster and the information corresponding to the product is regarded as the samples; since a product lies close to its own information, information on the same product is gathered together, and the gathered products are further grouped to obtain the grouping product information.
The grouping product information is then coded, that is, assigned a corresponding number, to obtain the product coding value, and it is judged whether the product coding value matches a preset characteristic value, wherein the preset characteristic value may be a data table compiled during historical acquisition. If they match, the product coding value is passed to a mapping function, which converts it into the corresponding product information; the product information is generated through designated logic, which may be digital conversion, standardization, normalization and the like. In this way, the required product information is obtained by screening the digital economic data acquired by the web crawler.
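The K-means grouping and the code matching of steps S10-S14 can be sketched as follows. This is a simplified illustration under stated assumptions: the support vector machine and logistic regression stages are omitted, the feature vectors are hypothetical, and the coding rule (rounding the first centroid coordinate) and the preset table are stand-ins, since the text does not specify them.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct samples
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach every sample to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(pt, centroids[c]))
                  for pt in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels

def screen_products(points, k, preset_table):
    """S10-S14: group, code each group, and match against a preset table."""
    centroids, _ = kmeans(points, k)
    results = []
    for centroid in centroids:
        code = round(centroid[0])  # S11: assumed coding rule
        if code in preset_table:   # S12: match against preset values
            results.append(("product", preset_table[code]))  # S13: mapping
        else:
            results.append(("pending", code))  # S14: data to be determined
    return results
```

Here `preset_table` plays the role of the data table compiled during historical acquisition, and the dictionary lookup stands in for the mapping function; unmatched codes are returned as pending so that they can be passed on to the similarity check.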
Preferably, after the step of defining the product coding value as the data coding value corresponding to the first data information to be determined, the method includes:
S15, obtaining a preset data coding value corresponding to the digital economic data information;
S16, inputting the data coding value and the preset data coding value into a market similarity model, and outputting a similarity value, wherein the function of the market similarity model is as follows:
wherein a_i is the data coding value and b_i is the preset data coding value;
S17, judging whether the calculated similarity value is equal to a preset value;
S18, if the similarity value is equal to the preset value, marking the information corresponding to the data coding value as market research information.
After the product information has been obtained by screening, a similarity comparison is performed on the data coding values corresponding to the remaining first data information to be determined. First, the preset data coding value corresponding to the digital economic data information is acquired, wherein the preset data coding value may be a code corresponding to market research information set during historical acquisition, covering economic indicators, demographic data, market size, consumer behaviour, competition conditions and the like. The data coding value and the preset data coding value are input into the market similarity model, so that newly acquired digital economic data information is evaluated by a market similarity model trained on historical data, and it is determined whether the resulting similarity value is equal to 1. If so, the information corresponding to the data coding value is identical to the market research information in the historical record, and it is marked as market research information. The similarity model thus makes it possible to quickly judge whether the acquired digital economic data information is the required market research information.
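The function of the market similarity model is not shown in the text above, so the following minimal sketch assumes cosine similarity between the data coding values a_i and the preset data coding values b_i purely for illustration. The function names and the tolerance are likewise assumptions.

```python
import math

def cosine_similarity(a, b):
    """Assumed similarity measure: cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero code as dissimilar to everything
    return dot / (norm_a * norm_b)

def is_market_research(code, preset_code, preset_value=1.0, tol=1e-9):
    """S17-S18: compare the similarity value with the preset value."""
    similarity = cosine_similarity(code, preset_code)
    # Floating-point results are compared with a tolerance, not with ==.
    return math.isclose(similarity, preset_value, abs_tol=tol)
```

Under this assumption the similarity is 1 for identical codes, but also for codes that differ only by a positive scale factor, so a stricter equality test may be needed if exact identity with the historical record is required.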
In one embodiment, before the step of screening the digital economic data information, the method further comprises:
S19, acquiring the disclosure time corresponding to the digital economic data information, and acquiring data structure information according to the disclosure time;
S20, inputting the data structure information into a traceability model to obtain node data information;
S21, verifying the node data information against the original log information to obtain original data storage information;
S22, extracting the original data storage information to obtain an original data chain;
S23, judging whether the digital economic data information matches the original data storage information;
S24, if they match, screening the digital economic data information.
As described in the above steps S19-S24, before the digital economic data information is screened, the disclosure time corresponding to it is acquired, so that the timeliness of the acquired digital economic data information can be ensured. Digital economic data information that is still timely is then subjected to source tracing verification, which prevents the use of tampered information that would have no credible value. The source tracing processing thus ensures the authenticity of the digital economic data information.
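One way to picture the timeliness and source tracing checks of steps S19-S24 is the sketch below. It is a heavily simplified assumption: the traceability model is reduced to a SHA-256 fingerprint lookup, `original_log` is a hypothetical mapping from fingerprint to recorded disclosure time, and the one-year freshness window is arbitrary.

```python
import hashlib
from datetime import datetime, timedelta

def fingerprint(content):
    """SHA-256 digest of the content, used here as its trace identifier."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def passes_traceability(content, disclosed_at, original_log, now,
                        max_age=timedelta(days=365)):
    """Return True if the content is timely and matches the original log.

    original_log: hypothetical dict mapping fingerprint -> disclosure time.
    """
    # S19: timeliness check based on the disclosure time.
    if now - disclosed_at > max_age:
        return False
    # S21-S23: the crawled content must match what the original log recorded.
    return original_log.get(fingerprint(content)) == disclosed_at
```

Only content that passes both checks would go on to the screening step (S24); tampered content produces a different fingerprint and is rejected.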
In one embodiment, the step of screening the digital economic data information to obtain screening information includes:
S25, acquiring the product unit price corresponding to the product information to be acquired;
S26, acquiring the current market sales corresponding to the market research information to be acquired;
S27, acquiring the historical market sales corresponding to the market research information to be acquired;
S28, calculating a market growth estimated value according to the product unit price, the current market sales and the historical market sales, wherein the calculation formula is as follows:
$A = \frac{b - c}{c} \times e$
wherein A is the market growth estimated value, b is the historical market sales, c is the current market sales, and e is the product unit price.
After the digital economic data information has been screened, the current market sales corresponding to the market research information to be acquired is subtracted from the historical market sales corresponding to that information, the resulting difference is divided by the current market sales, and the quotient is multiplied by the product unit price corresponding to the product information to be acquired. The product is the market growth estimated value, by which the digital economy can be accurately predicted.
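The arithmetic described above can be sketched as follows. The sketch follows the wording literally (historical sales minus current sales, divided by current sales, multiplied by unit price); the function and parameter names are assumptions.

```python
def market_growth_estimate(unit_price, current_sales, historical_sales):
    """Compute A = (b - c) / c * e as described in the text.

    b = historical_sales, c = current_sales, e = unit_price.
    """
    if current_sales == 0:
        raise ValueError("current market sales must be non-zero")
    difference = historical_sales - current_sales  # b - c
    quotient = difference / current_sales          # (b - c) / c
    return quotient * unit_price                   # multiplied by e
```

For example, with a unit price of 10, current sales of 200 and historical sales of 150, the function returns -2.5. Under this literal reading the value is negative whenever current sales exceed historical sales; if the opposite sign convention is intended, the two operands of the subtraction would be swapped.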
In one embodiment, the step of extracting the information of the captured digital economic data information according to the screening information to obtain the private data information includes:
S29, acquiring personal identity information corresponding to the digital economic data information, and extracting the personal identity information to obtain an identity number;
S30, obtaining a disturbance base;
S31, calculating a disturbance value corresponding to the desensitized digital economic data information according to the identity number and the disturbance base, wherein the calculation formula is as follows:
wherein Q is the disturbance value, n is the identity number, j is the disturbance base, and the remaining coefficient is a preset tolerable deviation coefficient;
S32, encrypting the desensitized digital economic data information according to the disturbance value to obtain private data information.
As described in the above steps S29-S32, when the private data information is acquired, the acquired private data information needs to be encrypted, so that lawless persons can be prevented from acquiring the private information of the user and further causing disclosure of the private information of the user, and in the process of preventing disclosure of the private information of the user, the corresponding identity number and the disturbance base number are generated by the user to calculate the disturbance value corresponding to the desensitized digital economic data information, and then the disturbance value is used for encrypting the desensitized digital economic data information and further obtaining the private data information, so that the trouble of the user caused by disclosure of the private data information can be avoided.
The application also provides a multi-channel data acquisition device based on digital economy, comprising:
the first acquisition module 1 is used for acquiring digital economic data information captured by a web crawler, screening the digital economic data information to obtain screening information, wherein the screening information comprises product information, market research information and market value information, inputting the digital economic data information into a classification model, and extracting the digital economic data information through a support vector machine to obtain product region range information;
the first extraction module 2 is used for carrying out logistic regression processing on the product region range information, and extracting to obtain product subdivision information;
the second extraction module 3 is used for inputting the product subdivision information into a clustering model, and carrying out grouping extraction on the product subdivision information through K-means clustering to obtain grouping product information;
the first coding module 4 is used for coding the grouping product information to obtain a product coding value;
the first judging module 5 is used for judging whether the product coding value is matched with a preset characteristic value or not;
if it matches, passing the product coding value to a mapping function and generating product information through designated logic;
if it does not match, defining the product coding value as a data coding value corresponding to first data information to be determined;
the third extraction module 6 is used for extracting the information of the captured digital economic data information according to the screening information to obtain private data information;
the second obtaining module 7 is configured to obtain a data link corresponding to the private data information, and perform desensitization processing on the digital economic data information corresponding to the data link, so as to obtain desensitized digital economic data information;
a second judging module 8, configured to judge whether the desensitized digital economic data information is legal data;
if the desensitized digital economic data information is legal data, the desensitized digital economic data is used as digital economic data information;
if the desensitized digital economic data information is illegal data, deleting the desensitized digital economic data.
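The branching performed by the two judging modules can be sketched as follows. This is a simplified illustration only: the function names, the use of a set for the preset characteristic values, the `"***"` masking rule, and the legality predicate are all hypothetical, since the application does not specify how matching, desensitization, or the legality check are carried out.

```python
from typing import Callable, Dict, List, Optional, Set


def route_product_code(code: int,
                       preset_values: Set[int],
                       mapping: Callable[[int], str],
                       pending: List[int]) -> Optional[str]:
    """First judging module: map a matched code, otherwise park it as pending."""
    if code in preset_values:
        # Matched: pass the product coding value to the mapping function.
        return mapping(code)
    # Not matched: treat it as the coding value of data still to be determined.
    pending.append(code)
    return None


def desensitize(record: Dict[str, str], private_fields: Set[str]) -> Dict[str, str]:
    """Second acquisition module (simplified): mask fields flagged as private."""
    return {k: ("***" if k in private_fields else v) for k, v in record.items()}


def keep_if_legal(record: Dict[str, str],
                  is_legal: Callable[[Dict[str, str]], bool]) -> Optional[Dict[str, str]]:
    """Second judging module: keep legal data, discard illegal data."""
    return record if is_legal(record) else None


# Hypothetical inputs, for illustration only.
pending_codes: List[int] = []
info = route_product_code(101, {101, 102}, lambda c: f"product-{c}", pending_codes)
route_product_code(999, {101, 102}, lambda c: f"product-{c}", pending_codes)

masked = desensitize({"name": "Alice", "item": "laptop"}, {"name"})
kept = keep_if_legal(masked, lambda r: "item" in r)
```

Codes parked in `pending_codes` correspond to the coding values that are later compared against the preset data coding values.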
In one embodiment, the first judging module includes:
the first acquisition unit is used for acquiring a preset data coding value corresponding to the digital economic data information;
the first calculation unit is used for inputting the data coding value and the preset data coding value into a market similarity model and outputting a similarity value, wherein the function of the market similarity model is as follows:
wherein the function outputs the similarity value, a_i is the data coding value, and b_i is the preset data coding value;
the first judging unit is used for judging whether the calculated similarity value is equal to a preset value;
and if the similarity value is equal to the preset value, calibrating the information corresponding to the first data coding value as market research information.
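The comparison performed by the first calculation unit and the first judging unit can be sketched as follows. The function of the market similarity model is not reproduced in this text, so the sketch substitutes cosine similarity purely as a stand-in; the function names, the preset value of 1.0, and the tolerance are likewise assumptions. A tolerance is used because an exact equality test on floating-point values is unreliable.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in similarity: cosine of the angle between coding vectors a and b."""
    if len(a) != len(b):
        raise ValueError("coding vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("coding vectors must be non-zero")
    return dot / (norm_a * norm_b)


def is_market_research(a: Sequence[float],
                       b: Sequence[float],
                       preset: float = 1.0,
                       tol: float = 1e-9) -> bool:
    """Judge whether the similarity value equals the preset value, within tol."""
    return math.isclose(cosine_similarity(a, b), preset, abs_tol=tol)


# Hypothetical coding values, for illustration only.
print(is_market_research([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(is_market_research([1.0, 0.0], [0.0, 1.0]))
```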
As shown in fig. 3, the present application further provides a computer device, which may be a server and whose internal structure may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used for storing all data required by the multi-channel data acquisition method based on digital economy. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the multi-channel data acquisition method based on digital economy.
Those skilled in the art will appreciate that the architecture shown in fig. 3 is merely a block diagram of a portion of the architecture in connection with the present application and is not intended to limit the computer device to which the present application is applied.
An embodiment of the present application further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the above-described multi-channel data acquisition methods based on digital economy.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, apparatus, article, or method that comprises the element.
The foregoing description covers only the preferred embodiments of the present invention and is not intended to limit its scope. All equivalent structures or equivalent processes made using the description and accompanying drawings of the present invention, or applied directly or indirectly in other related technical fields, are likewise included in the scope of protection of the present invention.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
For convenience of description, the above device is described with its functions divided into various units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware. Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (9)

1. A multi-channel data acquisition method based on digital economy, comprising:
acquiring digital economic data information captured by a web crawler, screening the digital economic data information to obtain screening information, wherein the screening information comprises product information, market research information and market value information, inputting the digital economic data information into a classification model, and extracting the digital economic data information through a support vector machine to obtain product region range information;
carrying out logistic regression processing on the product region range information, and extracting to obtain product subdivision information;
inputting the product subdivision information into a clustering model, and carrying out grouping extraction on the product subdivision information through K-means clustering to obtain grouping product information;
coding the grouping product information to obtain a product coding value;
judging whether the product coding value is matched with a preset characteristic value or not;
if it matches, passing the product coding value to a mapping function and generating product information through designated logic;
if it does not match, defining the product coding value as a data coding value corresponding to first data information to be determined;
carrying out information extraction on the captured digital economic data information according to the screening information to obtain private data information;
acquiring a data link corresponding to the private data information, and performing desensitization processing on the digital economic data information corresponding to the data link to obtain desensitized digital economic data information;
judging whether the desensitized digital economic data information is legal data or not;
if the desensitized digital economic data information is legal data, the desensitized digital economic data is used as digital economic data information;
if the desensitized digital economic data information is illegal data, deleting the desensitized digital economic data.
2. The multi-channel data acquisition method based on digital economy according to claim 1, wherein after the step of defining the product coding value as a data coding value corresponding to first data information to be determined, the method comprises:
acquiring a preset data coding value corresponding to the digital economic data information;
inputting the data coding value and the preset data coding value into a market similarity model, and outputting a similarity value, wherein the function of the market similarity model is as follows:
wherein the function outputs the similarity value, a_i is the data coding value, and b_i is the preset data coding value;
judging whether the calculated similarity value is equal to a preset value or not;
and if the similarity value is equal to the preset value, calibrating the information corresponding to the first data coding value as market research information.
3. The multi-channel data acquisition method based on digital economy according to claim 1, wherein before the step of screening the digital economic data information, the method comprises:
acquiring the disclosure time corresponding to the digital economic data information, and acquiring data structure information according to the disclosure time;
inputting the data structure information into a traceability model to obtain node data information;
verifying the node data information and the original log information to obtain original data storage information;
extracting the original data storage information to obtain an original data chain;
judging whether the digital economic data information is matched with the original data storage information or not;
and if so, screening the digital economic data information.
4. The multi-channel data acquisition method based on digital economy according to claim 1, wherein after the step of screening the digital economic data information to obtain screening information, the method comprises:
acquiring product unit price corresponding to the product information to be acquired;
acquiring current market sales corresponding to market research information to be acquired;
acquiring historical market sales corresponding to market research information to be acquired;
calculating a market growth estimated value according to the product unit price, the current market sales and the historical market sales, wherein the calculation formula is as follows:
$$A = \frac{b - c}{c} \times e$$
wherein A is the market growth estimated value, b is the historical market sales, c is the current market sales, and e is the product unit price.
5. The multi-channel data acquisition method based on digital economy according to claim 1, wherein the step of performing information extraction on the captured digital economic data information according to the screening information to obtain private data information includes:
acquiring personal identity information corresponding to the digital economic data information, and extracting the personal identity information to obtain an identity number;
obtaining a disturbance base number;
calculating a disturbance value corresponding to the desensitized digital economic data information according to the identity number and the disturbance base, wherein the calculation formula is as follows:
wherein Q is the disturbance value, n is the identity number, j is the disturbance base, and the remaining parameter is a preset tolerable deviation coefficient;
and encrypting the desensitized digital economic data information according to the disturbance value to obtain private data information.
6. A multi-channel data acquisition device based on digital economy, comprising:
the first acquisition module is used for acquiring digital economic data information captured by a web crawler, screening the digital economic data information to obtain screening information, wherein the screening information comprises product information, market research information and market value information, inputting the digital economic data information into a classification model, and extracting the digital economic data information through a support vector machine to obtain product region range information;
the first extraction module is used for carrying out logistic regression processing on the product region range information and extracting to obtain product subdivision information;
the second extraction module is used for inputting the product subdivision information into a clustering model, and carrying out grouping extraction on the product subdivision information through K-means clustering to obtain grouping product information;
the first coding module is used for coding the grouping product information to obtain a product coding value;
the first judging module is used for judging whether the product coding value is matched with a preset characteristic value or not;
if it matches, passing the product coding value to a mapping function and generating product information through designated logic;
if it does not match, defining the product coding value as a data coding value corresponding to first data information to be determined;
the third extraction module is used for extracting the information of the captured digital economic data information according to the screening information to obtain private data information;
the second acquisition module is used for acquiring a data link corresponding to the private data information and performing desensitization processing on the digital economic data information corresponding to the data link to obtain desensitized digital economic data information;
the second judging module is used for judging whether the desensitized digital economic data information is legal data or not;
if the desensitized digital economic data information is legal data, the desensitized digital economic data is used as digital economic data information;
if the desensitized digital economic data information is illegal data, deleting the desensitized digital economic data.
7. The multi-channel data acquisition device based on digital economy according to claim 6, wherein the first judging module comprises:
the first acquisition unit is used for acquiring a preset data coding value corresponding to the digital economic data information;
the first calculation unit is used for inputting the data coding value and the preset data coding value into a market similarity model and outputting a similarity value, wherein the function of the market similarity model is as follows:
wherein the function outputs the similarity value, a_i is the data coding value, and b_i is the preset data coding value;
the first judging unit is used for judging whether the calculated similarity value is equal to a preset value;
and if the similarity value is equal to the preset value, calibrating the information corresponding to the first data coding value as market research information.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 5 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 5.
CN202311203721.8A 2023-09-19 2023-09-19 Multi-channel data acquisition method and device based on digital economy Active CN116956353B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311203721.8A CN116956353B (en) 2023-09-19 2023-09-19 Multi-channel data acquisition method and device based on digital economy

Publications (2)

Publication Number Publication Date
CN116956353A CN116956353A (en) 2023-10-27
CN116956353B true CN116956353B (en) 2024-01-12

Family

ID=88454873

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311203721.8A Active CN116956353B (en) 2023-09-19 2023-09-19 Multi-channel data acquisition method and device based on digital economy

Country Status (1)

Country Link
CN (1) CN116956353B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927400A (en) * 2014-05-07 2014-07-16 重庆邮电大学 Web site product detailed information classification crawling and product information base establishing method
CN109583226A (en) * 2018-10-26 2019-04-05 平安科技(深圳)有限公司 Data desensitization process method, apparatus and electronic equipment
CN110087237A (en) * 2019-04-30 2019-08-02 苏州大学 Method for secret protection, device and associated component based on disturbance of data
CN115481434A (en) * 2022-09-15 2022-12-16 重庆长安汽车股份有限公司 Private data protection method, device, equipment and storage medium of cloud platform

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210141929A1 (en) * 2019-11-12 2021-05-13 Pilot Travel Centers Llc Performing actions on personal data stored in multiple databases




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant