CN106709805B - User income data acquisition method and system - Google Patents

Publication number
CN106709805B
Authority
CN
China
Prior art keywords
data
user
information
account
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610493459.9A
Other languages
Chinese (zh)
Other versions
CN106709805A (en)
Inventor
麦金凯
何锐邦
戴云峰
罗谚君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610493459.9A
Publication of CN106709805A
Application granted
Publication of CN106709805B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06: Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a user income data acquisition method and system. The method comprises: acquiring, through a plurality of distributed servers, current data published by a plurality of data sources; if the data published by the data sources conflict, correcting the data according to the weight of each data source; acquiring account data information of a user; and obtaining all current income data of the user account from the account data information of the user and the current data published by the plurality of data sources. Because the current data are acquired through a plurality of distributed servers, large volumes of data can be captured quickly from multiple data sources, which speeds up the updating of user income data. When the data published by different data sources conflict, the data are corrected automatically, and the overall income of the user is calculated automatically.

Description

User income data acquisition method and system
Technical Field
The invention relates to the technical field of the Internet, and in particular to a user income data acquisition method and system.
Background
In the prior art, particularly when a user's income is calculated on a terminal such as a mobile phone, the user generally enters the income data manually, and the daily income is then updated from what the user entered. This has two main disadvantages: manual entry is tedious and error-prone, and the income data may be incorrect or out of date. Moreover, prior-art financial product management systems each cover a single kind of product and usually return income information for that product only, so the total income across all of a user's financial products cannot be obtained.
Disclosure of Invention
In view of this, the present invention provides a user income data acquisition method that obtains all current income data of a user in a timely manner. The invention is realized as follows:
A user income data acquisition method, comprising:
acquiring, through a plurality of distributed servers, current data published by a plurality of data sources;
if the data published by the data sources conflict, correcting the data according to the weight of each data source;
acquiring account data information of a user;
and obtaining all current income data of the user account according to the account data information of the user and the current data published by the plurality of data sources.
The invention also provides a user income data acquisition system, comprising:
a current data acquisition module, configured to acquire current data published by a plurality of data sources;
the current data acquisition module comprising:
a data acquisition unit, configured to acquire, through a plurality of distributed servers, the current data published by the plurality of data sources;
a conflict early-warning unit, configured to determine whether the data published by the plurality of data sources conflict;
a data correction unit, configured to correct the data published by the data sources according to the weight of each data source when the data conflict;
a user information management module, configured to acquire account data information of a user;
and an income data acquisition module, configured to obtain all current income data of the user account according to the account data information of the user and the current data published by the data sources.
The implementation of the invention has the following beneficial effects:
(1) The invention provides a user income data acquisition method. First, current data published by a plurality of data sources are acquired; these data include the prices of various financial products. Second, account data information of the user is acquired; this covers the user's finance-related account data, such as account types and account amounts. Finally, all current income data of the user's accounts are calculated from the account data information and the current data. The invention thus integrates the management of multiple financial products and automatically calculates the overall income across them, which addresses the incomplete and unsystematic income data that result from prior-art tools each covering a single kind of product. The invention can automatically update the user's daily income information and give the user complete income information.
(2) The invention acquires the current data published by a plurality of data sources through a plurality of distributed servers. Using several distributed servers allows large volumes of data to be captured quickly from multiple sources, which speeds up the updating of user income data, and allows comprehensive data to be gathered, so that user income can be calculated systematically and completely.
(3) Several data sources may publish inconsistent information about the same financial product; that is, the financial data captured from different sources may conflict. When such a conflict occurs, the invention corrects the data acquired from the data sources with a multi-data-source cross-correction algorithm based on the weight of each data source, which safeguards the correctness of the captured data.
(4) According to the correction result, the invention dynamically adjusts the weight of each data source, increasing the weight of sources whose data are adopted more often and reducing the weight of sources whose data are adopted less often, thereby improving the reliability of the acquired current data.
(5) Because each financial product of the user corresponds to one account, the invention first obtains the information flow associated with the user's client and derives the user's account data information from it, the account data information including SMS bills and e-mail bills. The user's financial product information is thus obtained automatically from the user's bills, without manual entry, which simplifies obtaining the income of financial products and improves the user experience.
(6) The invention adopts an automatic bill parsing method based on same-root backtracking positioning. An extraction expression is set according to the account data information to be extracted; an element matching the extraction expression is searched for in the information flow, and an element that shares an ancestor with the content matched by the extraction expression and has an identifying feature is set as the reference point; the nearest ancestor of the reference point is located in the information flow; within the scope of that ancestor, information related to the user's account data is located with a CSS selector; and the account data information is extracted from the located information with a regular expression. Determining the ancestor scope narrows the search range, the CSS selector narrows it further, and the regular expression then pinpoints the account data information. The method locates the user's account data information quickly and accurately and improves extraction efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flowchart of the method of embodiment 1 of the present invention;
FIG. 2 is a flowchart of step S102 of the method of embodiment 1 of the present invention;
FIG. 3 is a flowchart of step S103 of the method of embodiment 1 of the present invention;
FIG. 4 is a schematic diagram of a system of embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of a data acquisition unit of the system of embodiment 2 of the present invention;
FIG. 6 is a schematic diagram of the current data acquisition module of the system of embodiment 2 of the present invention;
FIG. 7 is a schematic diagram of a weight adjustment unit of the system of embodiment 2 of the present invention;
FIG. 8 is a schematic diagram of an information parsing unit of the system of embodiment 2 of the present invention;
FIG. 9 is another schematic diagram of the system of embodiment 2 of the present invention;
FIG. 10 is a block diagram of a computer terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Example 1:
As shown in FIG. 1, embodiment 1 of the present invention provides a user income data acquisition method. It should be noted that the steps shown in the flowchart may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps may be performed in a different order.
The method of the invention comprises the following steps:
S101, obtaining current data published by a plurality of data sources through a plurality of distributed servers.
As an alternative implementation, step S101 includes:
S1011, obtaining current raw data from a plurality of preset data sources through a plurality of distributed servers;
A data source here is a source that publishes financial product data, such as a website that publishes financial products and financial data. The current data in the present invention refers to current data, such as prices and exchange rates, for financial products such as funds, bonds, stocks, foreign exchange, futures, and P2P products.
S1012, deleting data irrelevant to the account data information from the raw data.
Data irrelevant to the account data information includes garbled characters, advertisements, junk content, and the like in a data source (such as a web page).
The invention acquires the current data published by a plurality of data sources through a plurality of distributed servers. Using several distributed servers allows large volumes of data to be captured quickly from multiple sources, which speeds up the updating of financial product data, and allows comprehensive financial product data to be gathered, so that the user's financial income can be calculated systematically and completely.
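Steps S1011 and S1012 can be sketched roughly as follows. This is only an illustration under stated assumptions: a local thread pool stands in for the distributed servers, `fetch_raw` is a hypothetical stub that returns canned lines instead of crawling a website, and the `key=value` record layout and the source names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical canned pages; a real collector would download them from each source.
RAW_PAGES = {
    "source_a": ["fund=000001;unit_nav=1.0680", "AD: open an account today", "\ufffd\ufffd"],
    "source_b": ["fund=000001;unit_nav=1.0680"],
    "source_c": ["AD: banner", "fund=000001;unit_nav=1.0681"],
}

def fetch_raw(source):
    """S1011 stand-in: return the raw lines published by one data source."""
    return RAW_PAGES[source]

def clean(lines):
    """S1012: keep only lines that parse as fund records; drop ads and garbled text."""
    records = []
    for line in lines:
        fields = dict(part.split("=", 1) for part in line.split(";") if "=" in part)
        if "fund" in fields and "unit_nav" in fields:
            records.append({"fund": fields["fund"], "unit_nav": fields["unit_nav"]})
    return records

def collect(sources, workers=3):
    """Fetch all sources concurrently and return the cleaned records per source."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        raw = list(pool.map(fetch_raw, sources))
    return {src: clean(lines) for src, lines in zip(sources, raw)}

current = collect(["source_a", "source_b", "source_c"])
```

Fetching the sources in parallel is what lets the collection time stay roughly flat as sources are added; the cleaning pass runs afterwards on each source's raw lines independently.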
S102, determining whether the data published by the data sources conflict, and if so, correcting the data according to the weight of each data source.
As shown in FIG. 2, as an alternative embodiment, step S102 includes:
S1021, acquiring a preset initial weight of each data source. The initial weight of each data source may be set before steps S101 and S102 and is not particularly limited.
S1022, grouping the plurality of data sources according to the published data, and adding up the weights of the data sources in each group.
Specifically, grouping the plurality of data sources according to the published data means placing data sources that published the same data in the same group.
S1023, setting the data of the group with the largest total weight as the final data.
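Steps S1021 to S1023 amount to a weighted vote. As a compact restatement, with symbols introduced here for illustration only (they do not appear in the original): let $w_i$ be the weight of data source $i$, $x_i$ the value it published, and $\hat{x}$ the final data.

```latex
\hat{x} \;=\; \arg\max_{v \in \{x_1,\dots,x_n\}} \; \sum_{i=1}^{n} w_i \,\mathbf{1}[x_i = v]
```

The indicator $\mathbf{1}[x_i = v]$ performs the grouping of S1022, the sum adds the weights within each group, and the $\arg\max$ is the selection of S1023.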
For example, assume that fund data are obtained mainly from three data sources: Tiantian Fund, Howbuy Fund, and Shumi Fund. The preset data credibility weights of the three sources are 40% for Tiantian Fund, 30% for Howbuy Fund, and 30% for Shumi Fund. As shown in Table 1, the net values of the Huaxia Growth fund (000001) captured from the three data sources on a certain day are:
Table 1
Data source             Tiantian Fund   Howbuy Fund   Shumi Fund
Unit net value          1.0680          1.0680        1.0681
Cumulative net value    3.3690          3.3691        3.3691
The data sources are grouped according to the data they published, and the weights of the data sources in each group are added up.
Specifically, data sources that report the same net value are placed in one group. For the unit net value, Tiantian Fund and Howbuy Fund both report 1.0680 and form one group with a combined weight of 40% + 30% = 70%; Shumi Fund reports 1.0681 and forms a group by itself with a weight of 30%.
The data of the group with the largest total weight is set as the final data.
Unit net value: 1.0680 (40% + 30%) > 1.0681 (30%);
Cumulative net value: 3.3690 (40%) < 3.3691 (30% + 30%).
Therefore, as shown in Table 2, the final unit net value is 1.0680 and the final cumulative net value is 3.3691.
Table 2
[Table image in the original publication: Figure BDA0001034077820000051]
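The worked example above can be reproduced with a few lines of code. The sketch below is one possible reading of steps S1022 and S1023, not the patent's own implementation: the three sources are labelled A, B, and C in the order listed in Table 1, net values are kept as strings so that grouping is an exact comparison, and ties (which the text does not address) simply go to the first group encountered.

```python
from collections import defaultdict

def correct_by_weight(published, weights):
    """Group sources by published value, sum the weights per group,
    and return the value of the heaviest group (S1022-S1023)."""
    totals = defaultdict(float)
    for source, value in published.items():
        totals[value] += weights[source]
    return max(totals, key=totals.get)

# A, B, C are the three sources of Table 1, in the order listed there.
weights = {"A": 0.40, "B": 0.30, "C": 0.30}
unit_nav = {"A": "1.0680", "B": "1.0680", "C": "1.0681"}
cumulative_nav = {"A": "3.3690", "B": "3.3691", "C": "3.3691"}

final_unit = correct_by_weight(unit_nav, weights)              # group weights: 0.70 vs 0.30
final_cumulative = correct_by_weight(cumulative_nav, weights)  # group weights: 0.40 vs 0.60
```

Note that the highest-weighted single source (A) loses the cumulative net value vote because the other two sources agree with each other.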
As an optional implementation, to improve the reliability of the data obtained from the data sources, the method may further include:
S1024, adjusting the weight of each data source according to the correction result.
Specifically, if the data acquired from a data source is set as the final data, the weight of that data source is increased; otherwise, the weight of that data source is reduced.
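The text does not say by how much a weight is raised or lowered. The sketch below therefore makes two assumptions of its own: a fixed step of 0.05 per correction, and renormalization so that the weights still sum to 1. Sources are again labelled A, B, and C for illustration.

```python
def adjust_weights(weights, published, final_value, step=0.05):
    """S1024 sketch: raise the weight of sources whose value was adopted,
    lower the others, then renormalize so the weights sum to 1.
    The step size and the renormalization are assumptions, not from the source."""
    adjusted = {}
    for source, weight in weights.items():
        if published[source] == final_value:
            adjusted[source] = weight + step
        else:
            adjusted[source] = max(weight - step, 0.0)
    total = sum(adjusted.values())
    return {source: weight / total for source, weight in adjusted.items()}

weights = {"A": 0.40, "B": 0.30, "C": 0.30}
published = {"A": "1.0680", "B": "1.0680", "C": "1.0681"}
new_weights = adjust_weights(weights, published, final_value="1.0680")
```

Applied after every correction, an update of this kind gradually shifts trust toward the sources that are adopted most often, which is the behaviour the description calls for.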
S103, acquiring account data information of the user.
The account data information of the user comprises the user's finance-related account data, such as account type and account amount. The account type covers the names and categories of the financial products purchased by the user, such as funds, bonds, stocks, foreign exchange, futures, P2P products, and interest-bearing deposits. The account amount covers the amount the user holds in each financial product.
The account data information of the user may be information manually input by the user.
As an alternative implementation, step S103 includes:
obtaining the information flow associated with the user's client, and obtaining the user's account data information from the information flow, the account data information including SMS bills and e-mail bills.
The information flow associated with the user's client refers to all information received or sent by the user's client, including short messages, e-mails, and messages sent and received through instant messaging software.
Bill pages are usually machine-generated; their structure is complex and lacks distinguishing features, so bill information is difficult to extract by conventional regular-expression matching. To solve this problem, as shown in FIG. 3, in an alternative embodiment, step S103 further includes:
S1031, setting an extraction expression according to the account data information of the user to be extracted;
S1032, searching the information flow for an element matching the extraction expression, and setting an element that shares an ancestor with the content matched by the extraction expression and has an identifying feature as the reference point;
S1033, finding the nearest ancestor of the reference point in the information flow;
S1034, within the scope of that ancestor, locating information related to the account data information of the user with a CSS selector;
S1035, extracting the account data information of the user from the located information with a regular expression.
For example, a bill from a certain bank shows the user's minimum payment due for the current period: 50.00. The task is to extract this minimum payment amount, 50.00, from the bill.
The inventors tried to extract the "minimum payment due for the current period" by regular-expression parsing alone, but because the HTML structure of the e-mail bill is so complex, the regular expression grew ever more complicated and the approach proved infeasible. The inventors also tried extraction with a CSS selector alone, but because the HTML of the e-mail bill is laid out entirely with tables, the cells are highly similar and carry no distinguishing identifiers, so this approach was not feasible either. The method of the invention solves this problem as follows:
First, determine the extraction expression;
Second, find an element that shares an ancestor with the minimum payment amount and has an identifying feature, and use it as the reference point; here the reference point is the label "minimum payment due for the current period", and the ancestor is the table that encloses the minimum payment amount;
Third, find the nearest common ancestor of the two; because they are in the same table, the nearest common ancestor is that table;
Fourth, home in on the minimum payment amount data with a CSS (Cascading Style Sheets) selector; the selector may be an adjacent sibling selector or another CSS selector such as a descendant selector;
Finally, extract the minimum payment amount data with a regular expression.
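The five steps above can be sketched with the Python standard library alone. Everything concrete here is an assumption made for illustration: the bill fragment is invented and deliberately well-formed so that `xml.etree.ElementTree` can parse it (real e-mail HTML would need a tolerant HTML parser), and "the next cell in the same row" stands in for a CSS adjacent sibling selector such as `td + td`, since the standard library has no CSS selector engine.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment of a table-based e-mail bill.
BILL = """
<html><body>
  <table><tr><td>Statement date</td><td>2016-06-01</td></tr></table>
  <table>
    <tr><td>Total amount due</td><td>CNY 1,000.00</td></tr>
    <tr><td>Minimum payment due this period</td><td>CNY 50.00</td></tr>
  </table>
</body></html>
"""

def extract_amount(html, label):
    root = ET.fromstring(html.strip())
    parent = {child: node for node in root.iter() for child in node}
    # Steps 1-2: the reference point is the element whose text matches the label.
    ref = next(el for el in root.iter() if (el.text or "").strip() == label)
    # Step 3: backtrack to the nearest enclosing table (the shared ancestor).
    ancestor = ref
    while ancestor.tag != "table":
        ancestor = parent[ancestor]
    # Step 4: inside that table only, take the cell right after the reference
    # point (the role a CSS adjacent sibling selector would play).
    for row in ancestor.iter("tr"):
        cells = list(row)
        if ref in cells:
            target = cells[cells.index(ref) + 1]
            break
    # Step 5: pull the number out of the narrowed-down text with a regex.
    return re.search(r"\d[\d,]*\.\d{2}", target.text).group(0)

amount = extract_amount(BILL, "Minimum payment due this period")
```

Because the search is confined to the table that contains the label, the similar-looking cells in the other table never reach the regular expression, which can therefore stay very simple.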
The configuration expression of the present invention may be as follows:
Figure BDA0001034077820000071
Because each financial product of the user corresponds to one account, the invention first obtains the information flow associated with the user's client and derives the user's account data information from it, the account data information including SMS bills and e-mail bills. The user's financial product information is thus obtained automatically from the user's bills, without manual entry, which simplifies obtaining the income of financial products and improves the user experience.
The invention adopts an automatic bill parsing method based on same-root backtracking positioning. An extraction expression is set according to the account data information to be extracted; an element matching the extraction expression is searched for in the information flow, and an element that shares an ancestor with the content matched by the extraction expression and has an identifying feature is set as the reference point; the nearest ancestor of the reference point is located in the information flow; within the scope of that ancestor, information related to the user's account data is located with a CSS selector; and the account data information is extracted from the located information with a regular expression. Determining the ancestor scope narrows the search range, the CSS selector narrows it further, and the regular expression then pinpoints the account data information. The method locates the user's account data information quickly and accurately, overcomes the difficulty of extracting bill information from complex bill content, and improves the efficiency of extracting the user's account data information.
S104, obtaining all current income data of the user account according to the account data information of the user and the current data published by the plurality of data sources.
Specifically, step S104 includes: looking up, in the current data obtained in step S102, the current data corresponding to the user's account data information, and calculating the current income data for each of the user's accounts. To give the user a more intuitive view, the current income data of the individual accounts may also be combined into the total income of the user's accounts.
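The text does not give a formula for the income of an account. The sketch below assumes the simplest one, units held times the difference between the current price and the average cost per unit; the second product, all holdings, and all cost figures are hypothetical values chosen only to make the arithmetic visible.

```python
from decimal import Decimal

# Corrected current data (for example, the output of step S102), keyed by product code.
# "BOND-X" and its price are hypothetical.
current_prices = {"000001": Decimal("1.0680"), "BOND-X": Decimal("101.50")}

# Hypothetical account data information: units held and average cost per unit.
accounts = [
    {"product": "000001", "units": Decimal("1000"), "cost": Decimal("1.0000")},
    {"product": "BOND-X", "units": Decimal("10"), "cost": Decimal("100.50")},
]

def account_income(account, prices):
    """Income of one account under the assumed formula: units * (price - cost)."""
    return account["units"] * (prices[account["product"]] - account["cost"])

# Per-account income first, then the combined total across all accounts.
per_account = {a["product"]: account_income(a, current_prices) for a in accounts}
total_income = sum(per_account.values(), Decimal("0"))
```

`Decimal` is used instead of floating point so that sums of monetary amounts stay exact.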
The invention provides a user income data acquisition method. First, current data published by a plurality of data sources are acquired; these data include the prices of various financial products. Second, account data information of the user is acquired; this covers the user's finance-related account data, such as account types and account amounts. Finally, all current income data of the user's accounts are calculated from the account data information and the current data. The invention thus integrates the management of multiple financial products and automatically calculates the overall income across them, which addresses the incomplete and unsystematic income data that result from prior-art tools each covering a single kind of product. The invention can automatically update the user's daily income information and give the user complete income information.
Several data sources may publish inconsistent information about the same financial product; that is, the financial data captured from different sources may conflict. When such a conflict occurs, the raw data of the financial product are corrected with a multi-data-source cross-correction algorithm based on the weight of each data source, which safeguards the correctness of the data captured from the plurality of data sources.
According to the correction result, the invention dynamically adjusts the weight of each data source, increasing the weight of sources whose data are adopted more often and reducing the weight of sources whose data are adopted less often, thereby improving the reliability of the acquired current data.
Example 2:
As shown in FIG. 4, the present invention provides a user income data acquisition system, comprising:
a current data acquisition module, configured to acquire current data published by a plurality of data sources;
the current data acquisition module including:
a data acquisition unit, configured to acquire, through a plurality of distributed servers, the current data published by the plurality of data sources;
a conflict early-warning unit, configured to determine whether the data published by the plurality of data sources conflict;
a data correction unit, configured to correct the data published by the data sources according to the weight of each data source when the data conflict;
a user information management module, configured to acquire account data information of a user;
and an income data acquisition module, configured to obtain all current income data of the user account according to the account data information of the user and the current data published by the data sources.
As an alternative embodiment, FIG. 5 is a schematic diagram of a data acquisition unit of the present invention. As shown in FIG. 5, the data acquisition unit includes:
a raw data acquisition subunit, configured to obtain current raw data from a plurality of preset data sources through a plurality of distributed servers;
and a cleaning subunit, configured to delete data irrelevant to the account data information from the raw data.
As an alternative embodiment, FIG. 6 is another schematic structural diagram of the current data acquisition module of the present invention. As shown in FIG. 6, the current data acquisition module includes:
a data acquisition unit, configured to acquire, through a plurality of distributed servers, the current data published by a plurality of data sources;
a conflict early-warning unit, configured to determine whether the data published by the plurality of data sources conflict;
a data correction unit, configured to correct the data published by the data sources according to the weight of each data source when the data conflict;
and a weight adjusting unit, configured to adjust the weight of each data source according to the correction result.
As an alternative embodiment, FIG. 7 is a schematic diagram of a weight adjusting unit of the present invention. As shown in FIG. 7, the weight adjusting unit includes:
an initial value setting subunit, configured to acquire the preset initial weight of each data source;
a grouping calculation subunit, configured to group the plurality of data sources according to the published data and add up the weights of the data sources in each group;
and a correction subunit, configured to set the data of the group with the largest total weight as the final data.
As an alternative embodiment, the user information management module includes an information parsing unit, configured to obtain the information flow associated with the user's client and to obtain the user's account data information from the information flow, the account data information including SMS bills and e-mail bills.
As an alternative embodiment, as shown in FIG. 8, the information parsing unit includes:
an expression setting subunit, configured to set an extraction expression according to the account data information of the user to be extracted;
a reference point setting subunit, configured to search the information flow for an element matching the extraction expression, and to set an element that shares an ancestor with the content matched by the extraction expression and has an identifying feature as the reference point;
an ancestor finding subunit, configured to find the nearest ancestor of the reference point in the information flow;
an approximation subunit, configured to locate, within the scope of that ancestor, information related to the account data information of the user with a CSS selector;
and an extracting subunit, configured to extract the account data information of the user from the located information with a regular expression.
FIG. 9 is a block diagram of the system of the present invention in a specific application scenario.
The system of the invention can be applied in terminal management software, such as Tencent Mobile Manager, making it convenient to calculate the income of all of a user's financial products.
When the number of client terminals is very large, the kinds of financial products held by the users of those clients are also diverse. If the server fetched raw data from the data sources every time the income data of one client was requested, the server would come under heavy load. To solve this problem, the invention uses a server cluster consisting of a plurality of distributed servers to obtain the current data published by the data sources and stores the obtained data in a database. When the income data of a client is requested, only the required data need to be read from the database; the data are thus shared, the efficiency of the system is improved, and the burden on the server is greatly reduced.
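The data-sharing arrangement described above can be illustrated as follows. This is a sketch under assumptions: an in-memory SQLite database stands in for the shared database, and the table name, column names, and function names are all invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stands in for the shared financial database
db.execute("CREATE TABLE current_data (product TEXT PRIMARY KEY, unit_nav TEXT)")

def store_current_data(rows):
    """Called by the collection cluster once per refresh, not once per client."""
    db.executemany("INSERT OR REPLACE INTO current_data VALUES (?, ?)", rows)
    db.commit()

def lookup(product):
    """Called for each client request; reads the shared copy instead of the source."""
    row = db.execute(
        "SELECT unit_nav FROM current_data WHERE product = ?", (product,)
    ).fetchone()
    return row[0] if row else None

store_current_data([("000001", "1.0680")])
# Three client requests are served from the database with no further fetches.
client_views = [lookup("000001") for _ in range(3)]
```

The cost of contacting the data sources is paid once per refresh, while the per-client cost is reduced to a keyed database read.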
The data source is a data source for issuing financial product data, such as a website for issuing financial products and financial data; the current data in the present invention refers to the current data related to the price, exchange rate, etc. of fund, bond, stock, foreign exchange, future, P2P, etc. financing products.
The current data acquisition module of the invention, which may also be called the financial data acquisition background, comprises a data cleaning module and a data normalization module. These first clean and normalize the acquired data, for example by removing useless data such as advertisements and garbled characters, so as to obtain valid data.
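The cleaning and normalization step can be illustrated with a minimal Python sketch. The record layout (`product` and `price` fields), the sample records and the function name `clean_and_normalize` are assumptions made purely for illustration; the description does not specify how the data is represented or which rules are applied.

```python
import re

# Hypothetical raw records collected from one data source. The field names
# "product" and "price" are assumptions, not taken from the description.
RAW = [
    {"product": " fund-a ", "price": "1.0230"},
    {"product": "AD", "price": "Click here for a free gift!"},  # advertisement
    {"product": "fund-b", "price": "\ufffd\ufffd1.5"},           # garbled characters
    {"product": "Fund-C", "price": "CNY 2.50"},
]

PRICE_RE = re.compile(r"\d+(?:\.\d+)?")


def clean_and_normalize(records):
    """Drop unusable records and bring the rest into one canonical form."""
    cleaned = []
    for rec in records:
        text = rec.get("price", "")
        # Cleaning: discard garbled text (U+FFFD marks undecodable bytes).
        if "\ufffd" in text:
            continue
        # Cleaning: discard records without any numeric price, such as ads.
        match = PRICE_RE.search(text)
        if match is None:
            continue
        # Normalization: canonical product code and a numeric price.
        cleaned.append({
            "product": rec["product"].strip().upper(),
            "price": float(match.group()),
        })
    return cleaned
```

With the sample input above, only the `fund-a` and `Fund-C` records survive, with upper-cased product codes and numeric prices.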
The current data acquisition module can also comprise a database, data processed by the data cleaning module and the data normalization module can be stored in the database, and the database can be called a financial management database.
The current data acquisition module further comprises a conflict early-warning machine, which reads data from the database, compares the data to detect conflicts, issues a conflict early warning, and stores the corrected data in the financial management database to await use. Specifically, the conflict early-warning machine judges whether the data issued by the plurality of data sources conflict; if so, it issues a conflict early warning, and the financial data acquisition background automatically corrects the conflicting data after receiving the warning. The automatic correction corrects the data issued by the data sources according to the weight of each data source, as follows. First, a preset initial weight of each data source is acquired. Next, the data sources are grouped according to the data they issued, with data sources that issued the same data placed in the same group, and the weights of the data sources in each group are added together. The data of the group with the largest summed weight is then set as the corrected data, and the corrected data is stored in the financial management database.
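The weight-based correction described above amounts to a weighted vote. The following Python sketch shows one way it could look; the function name `correct_by_weight`, the source names and the weight values are hypothetical, and the tie-breaking behaviour (the first group encountered wins) is an assumption, since the description does not say how ties are resolved.

```python
from collections import defaultdict


def correct_by_weight(published, weights):
    """Resolve conflicting values by summed source weight.

    published: mapping of source name -> value that source issued
    weights:   mapping of source name -> current weight of that source
    Returns (final_value, conflict_detected, summed_weight_per_value).
    """
    totals = defaultdict(float)
    # Group the sources by the value they issued and add up each group's weight.
    for source, value in published.items():
        totals[value] += weights[source]
    # More than one group means the sources disagree (conflict early warning).
    conflict = len(totals) > 1
    # The value backed by the largest summed weight becomes the corrected data.
    final_value = max(totals, key=totals.get)
    return final_value, conflict, dict(totals)


# Hypothetical example: two sources agree, one heavier source disagrees.
published = {"site_a": 1.02, "site_b": 1.02, "site_c": 1.05}
weights = {"site_a": 1.0, "site_b": 1.0, "site_c": 1.5}
final_value, conflict, totals = correct_by_weight(published, weights)
```

Here the two agreeing sources together outweigh the single heavier source, so their value is the one kept.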
To improve the reliability of the data obtained from the data sources, the conflict early-warning machine may further include a weight adjustment module configured to adjust the weight of each data source according to the correction result. Specifically, if the data acquired from a data source is set as the final data, the weight of that data source is increased; if the data acquired from a data source is not set as the final data, the weight of that data source is reduced.
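A minimal Python sketch of the weight adjustment follows. The description only states that weights are increased or reduced; the fixed `step`, the lower bound `floor` and the function name `adjust_weights` are assumptions introduced for illustration.

```python
def adjust_weights(published, weights, final_value, step=0.1, floor=0.1):
    """Return new weights after one correction round.

    Sources whose value was chosen as the final data gain `step`;
    sources whose value was rejected lose `step`, but never drop below `floor`.
    Sources that issued nothing in this round keep their weight.
    (`step` and `floor` are hypothetical parameters.)
    """
    adjusted = dict(weights)
    for source, value in published.items():
        if value == final_value:
            adjusted[source] = weights[source] + step
        else:
            adjusted[source] = max(floor, weights[source] - step)
    return adjusted


# Hypothetical example continuing a round in which 1.02 was chosen.
published = {"site_a": 1.02, "site_b": 1.02, "site_c": 1.05}
weights = {"site_a": 1.0, "site_b": 1.0, "site_c": 1.5, "site_d": 1.0}
new_weights = adjust_weights(published, weights, final_value=1.02)
```

The lower bound is one possible design choice: it keeps a repeatedly wrong source from reaching a zero or negative weight, so it can still recover if it later proves reliable.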
After the conflict early-warning machine has finished correcting the data and adjusting the weights of the data sources, the updated data is stored in the financial management database.
The current data acquisition module may acquire the current data issued by the plurality of data sources at predetermined time intervals, for example once a day; for income data that is updated more quickly, the acquisition frequency may be increased. Because users generally only need to check their income data periodically, acquiring data at a preset time interval is sufficient to provide the income data that users require. At the same time, the server only needs to fetch data from the data sources periodically, which reduces server overhead.
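The idea of fetching on a schedule and serving every client from the shared database can be sketched in Python as follows. The class name `SharedQuoteStore`, the injected `fetch` and `clock` callables and the in-memory dictionary standing in for the database are all assumptions for illustration; a real deployment would use the server cluster and database described above.

```python
import time


class SharedQuoteStore:
    """In-memory stand-in for the shared database of current data."""

    def __init__(self, fetch, interval_seconds=86400, clock=time.time):
        self._fetch = fetch                # callable returning {product: value}
        self._interval = interval_seconds  # e.g. once a day
        self._clock = clock                # injectable so the sketch is testable
        self._data = {}
        self._fetched_at = None

    def refresh_if_due(self):
        """Contact the data sources only when the interval has elapsed."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._interval:
            self._data = dict(self._fetch())
            self._fetched_at = now
            return True
        return False

    def get(self, product):
        """Client requests read the stored copy and never hit the sources."""
        return self._data.get(product)
```

Any number of `get` calls between two refreshes is answered from the stored copy, which is the data sharing the description relies on to reduce server load.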
The user information management module, which may also be called the user financial data management system, is implemented in hardware as one or more servers. It can receive account data information entered by the user: for example, the user logs in to the account system of Tencent Mobile Manager, enters information about the financial products he or she has purchased, and authorizes Tencent Mobile Manager to manage them. In addition, with the user's authorization, the user information management module can acquire the user's mail bills and short message bills and, through an automatic parsing algorithm based on the same-root backtracking positioning method, automatically import information on all the financial products the user has purchased, including stocks, funds, P2P products and the like. The user only needs to log in to Tencent Mobile Manager and authorize it to manage the financial products; Tencent Mobile Manager can then update the user's income data regularly through the Tencent financial income calculation system.
Specifically, the user information management module comprises a user financial information database for storing the account information of the user. The account data information of the user comprises the user's account data information related to financial management, such as the account category and the account amount. The account category includes the names and categories of the financial or wealth-management products purchased by the user, such as funds, bonds, stocks, foreign exchange, futures, P2P products, interest-bearing deposits, and the like. The account amount includes the amount data for each financial and wealth-management product in which the user participates.
The account data information of the user can be acquired in two ways. The first is user input: the user information management module provides an interface for entering the information, and the user enters the account data information through that interface. The second is automatic acquisition by the system with the user's authorization.
The user information management module comprises an information analysis unit and is used for acquiring information flow associated with a user client after the user is authorized, and acquiring account data information of the user according to the information flow, wherein the account data information comprises a short message bill and a mail bill.
The information flow related to the user client in the invention refers to all information received or sent by the user client, including short messages, mails, messages received and sent by instant messaging software, and the like.
The information analysis unit includes:
the expression setting subunit is used for setting an extraction expression according to the account data information of the user that needs to be extracted;
the reference point setting subunit is used for searching the information flow for an element matching the extraction expression, and setting, as a reference point, an element that has the same ancestor as the matched element and has an identifying feature;
the ancestor finding subunit is used for finding, in the information flow, the nearest ancestor of the reference point;
the approximation subunit is used for searching, within the range of that ancestor, for information related to the account data information of the user by means of a CSS selector;
and the extracting subunit is used for extracting the account data information of the user from the found information by means of a regular expression.
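The five subunits above can be read as a small extraction pipeline. The Python sketch below is one possible interpretation, not the patented implementation: the sample mail bill, the function name `extract_amount` and the class names are hypothetical, and because the Python standard library has no CSS selector engine, ElementTree's path expression `.//td[@class='value']` stands in for the CSS selector `td.value`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed mail-bill fragment. Note the advertisement, which
# also contains a number that a document-wide regular expression would hit first.
BILL = """
<html><body>
  <div class="ad">Open a new account today and get 100.00 in vouchers</div>
  <table id="bill">
    <tr><td class="label">Product</td><td class="value">Fund A</td></tr>
    <tr><td class="label">Amount</td><td class="value">CNY 1,250.50</td></tr>
  </table>
</body></html>
"""

AMOUNT_RE = re.compile(r"\d[\d,]*\.\d{2}")


def extract_amount(markup, label_pattern=r"Amount"):
    root = ET.fromstring(markup)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: node for node in root.iter() for child in node}
    # Reference point: an element whose text matches the extraction expression
    # and that carries an identifying feature (here, a class attribute).
    ref = next(
        (el for el in root.iter()
         if el.text and re.search(label_pattern, el.text) and el.get("class")),
        None,
    )
    if ref is None:
        return None
    # Start at the nearest ancestor and backtrack upward until a value is found.
    ancestor = parent.get(ref)
    while ancestor is not None:
        # Stand-in for the CSS selector "td.value", scoped to this ancestor.
        for cell in ancestor.findall(".//td[@class='value']"):
            match = AMOUNT_RE.search(cell.text or "")
            if match:
                # Final step: the regular expression extracts the figure itself.
                return float(match.group().replace(",", ""))
        ancestor = parent.get(ancestor)
    return None
```

Scoping the search to the ancestor of the reference point is what keeps the figure in the advertisement from being mistaken for the account amount.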
The profit data acquisition module, which may also be called the financial profit calculation engine, is likewise implemented in hardware as a server. It calculates the current income data for each account of the user from the account data information of the user and the corresponding current data acquired from the data sources. To give the user a more intuitive view, the current income data of the individual accounts may be aggregated into a total income for the user's accounts. Specifically, the profit data acquisition module calculates the profit of each financial product purchased by the user using a corresponding calculation formula, consolidates all the results, and pushes them to the user together, so that the user can clearly see the profit of all of his or her financial products. The income data may be pushed to the user at a preset time interval or at a time interval set by the user, which improves the user experience.
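The description leaves the calculation formula open. As a hedged illustration only, the Python sketch below assumes the simplest case, in which the income of an account is the number of units held multiplied by the difference between the current price and the purchase price. The field names (`units`, `cost_price`), the function names and the sample figures are all hypothetical.

```python
from decimal import Decimal


def account_income(holding, current_price):
    """Income of one account under the assumed formula:
    units * (current price - purchase price)."""
    return (current_price - holding["cost_price"]) * holding["units"]


def total_income(holdings, prices):
    """Return (income per product, total income across all accounts)."""
    per_account = {
        h["product"]: account_income(h, prices[h["product"]])
        for h in holdings
    }
    return per_account, sum(per_account.values(), Decimal("0"))


# Hypothetical account data information and current data.
holdings = [
    {"product": "FUND-A", "units": Decimal("1000"), "cost_price": Decimal("1.00")},
    {"product": "STOCK-B", "units": Decimal("10"), "cost_price": Decimal("50.00")},
]
prices = {"FUND-A": Decimal("1.02"), "STOCK-B": Decimal("48.50")}
per_account, total = total_income(holdings, prices)
```

`Decimal` is used instead of floating point so that monetary amounts are not affected by binary rounding. Products such as deposits or bonds would need their own formulas, which this sketch does not attempt to cover.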
Example 3
The embodiment of the invention also provides a computer terminal, which can be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, Fig. 10 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in Fig. 10, the computer terminal A may include: one or more processors 101 (only one is shown), a memory 103, and a transmission device 105.
The memory 103 may be used to store software programs and modules, such as the program instructions/modules corresponding to the user income data acquisition method and system in the embodiments of the present invention. The processor 101 executes various functional applications and data processing by running the software programs and modules stored in the memory 103, that is, it implements the user income data acquisition method. The memory 103 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 103 may further include memory located remotely from the processor 101, which may be connected to the computer terminal A via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 105 is used to receive or transmit data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 105 includes a network adapter that can be connected to a router via a network cable to communicate with the internet or a local area network. In one example, the transmission device 105 is a radio frequency module that is used to communicate with the internet by wireless means.
Specifically, the memory 103 is used for storing preset action conditions, information of preset authorized users, and application programs.
The processor 101 may call, through the transmission device, the information and application programs stored in the memory 103 to execute program code for the following steps:
acquiring current data issued by a plurality of data sources;
judging whether the data issued by the plurality of data sources conflict;
when the data issued by the data sources conflict, correcting the data issued by the data sources according to the weight of each data source;
acquiring account data information of a user;
and obtaining all current income data of the user account according to the account data information of the user and the current data issued by the plurality of data sources.
Optionally, the specific examples in this embodiment may refer to the examples described in embodiments 1 to 2 above, and this embodiment is not described herein again.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle, spirit and scope of the present invention, and such modifications and refinements shall also fall within the protection scope of the present invention.

Claims (11)

1. A user revenue data acquisition method, comprising:
acquiring current data issued by a plurality of data sources through a plurality of distributed servers;
if the data issued by the data sources have conflict, correcting the data issued by the data sources according to the weight of each data source;
acquiring information flow associated with a user client, and setting an extraction expression according to account data information of a user needing to be extracted;
searching the information flow for an element matching the extraction expression, and setting, as a reference point, an element that has the same ancestor as the matched element and has an identifying feature;
finding the nearest ancestor of the reference point in the information flow;
within the range of the ancestor, searching for information related to the account data information of the user by means of a CSS selector;
extracting the account data information of the user from the found information by means of a regular expression, wherein the account data information comprises a short message bill and a mail bill;
and obtaining all current income data of the user account according to the account data information of the user and the current data issued by the plurality of data sources.
2. The method of claim 1, wherein the correcting the data issued by the data sources according to the weight of each data source comprises:
acquiring a preset initial weight of each data source;
grouping a plurality of data sources according to the issued data, and adding the weight of each group of corresponding data sources;
the set of data having the largest weight after the addition is set as final data.
3. The method for obtaining user revenue data according to claim 2, wherein after correcting the data issued by the data sources according to the weight of each data source, the method further comprises:
and adjusting the weight of each data source according to the correction result.
4. The method of claim 3, wherein said adjusting the weight of each data source according to the correction result comprises:
if the data acquired by the data source is set as final data, increasing the weight of the data source; if the data acquired by the data source is not set as the final data, the weight of the data source is reduced.
5. The method for obtaining user revenue data according to claim 1, wherein the obtaining current data published by a plurality of data sources through a plurality of distributed servers comprises:
acquiring current original data from a plurality of preset data sources through a plurality of distributed servers;
and deleting data irrelevant to the account data information in the original data.
6. A user revenue data acquisition system, comprising:
the current data acquisition module is used for acquiring current data issued by a plurality of data sources;
the current data acquisition module comprises:
the data acquisition unit is used for acquiring current data issued by a plurality of data sources through a plurality of distributed servers;
the conflict early warning unit is used for judging whether the data issued by the plurality of data sources conflict;
the data correction unit is used for correcting the data issued by the data sources according to the weight of each data source when the data issued by the data sources conflict;
the user information management module is used for acquiring account data information of a user;
the user information management module includes: the information analysis unit is used for acquiring information flow associated with a user client and acquiring account data information of a user according to the information flow, wherein the account data information comprises a short message bill and a mail bill;
the information analysis unit includes: the expression setting subunit, used for setting an extraction expression according to the account data information of the user that needs to be extracted; the reference point setting subunit, used for searching the information flow for an element matching the extraction expression, and setting, as a reference point, an element that has the same ancestor as the matched element and has an identifying feature; the ancestor finding subunit, used for finding, in the information flow, the nearest ancestor of the reference point; the approximation subunit, used for searching, within the range of the ancestor, for information related to the account data information of the user by means of a CSS selector; and the extracting subunit, used for extracting the account data information of the user from the found information by means of a regular expression;
and the profit data acquisition module is used for acquiring current all profit data of the user account according to the account data information of the user and the current data issued by the data sources.
7. The user revenue data acquisition system of claim 6, wherein the current data collection module further includes a weight adjustment unit for adjusting the weight of each data source according to the correction result.
8. The user revenue data acquisition system of claim 6, wherein the weight adjustment unit includes:
the initial value setting subunit is used for acquiring the preset initial weight of each data source;
the grouping calculation subunit is used for grouping the plurality of data sources according to the issued data and adding the weight of each group of corresponding data sources;
and the correction subunit is used for setting the data of the group having the largest summed weight as final data.
9. The user revenue data acquisition system of claim 7, wherein the weight adjustment unit is further configured to: if the data acquired by the data source is set as final data, increasing the weight of the data source; if the data acquired by the data source is not set as the final data, the weight of the data source is reduced.
10. The user revenue data acquisition system of claim 6, wherein the data collection unit includes:
the original data acquisition subunit is used for acquiring current original data from a plurality of preset data sources through a plurality of distributed servers;
and the cleaning subunit is used for deleting data irrelevant to the account data information in the original data.
11. A computer-readable storage medium having stored therein at least one instruction for execution by a computer device to perform the method of any one of claims 1-5.
CN201610493459.9A 2016-06-29 2016-06-29 User income data acquisition method and system Active CN106709805B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610493459.9A CN106709805B (en) 2016-06-29 2016-06-29 User income data acquisition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610493459.9A CN106709805B (en) 2016-06-29 2016-06-29 User income data acquisition method and system

Publications (2)

Publication Number Publication Date
CN106709805A CN106709805A (en) 2017-05-24
CN106709805B true CN106709805B (en) 2020-09-25

Family

ID=58939748

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610493459.9A Active CN106709805B (en) 2016-06-29 2016-06-29 User income data acquisition method and system

Country Status (1)

Country Link
CN (1) CN106709805B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant