CN112737925B - Data acquisition method, device and system - Google Patents

Info

Publication number
CN112737925B
Authority
CN
China
Prior art keywords
task list
request
account
subtasks
partition module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011607392.XA
Other languages
Chinese (zh)
Other versions
CN112737925A (en)
Inventor
彭纪钢
谢波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202011607392.XA
Publication of CN112737925A
Application granted
Publication of CN112737925B
Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04: Real-time or near real-time messaging, e.g. instant messaging [IM]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533: Hypervisors; Virtual machine monitors
    • G06F9/45558: Hypervisor-specific management and integration aspects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806: Task transfer initiation or dispatching
    • G06F9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/08: Network architectures or network communication protocols for network security for authentication of entities
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/06: Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of the application provide a data acquisition method, device and system. Based on a reasonable configuration of the virtual machine resources and network resources in the data acquisition system, the method automatically acquires statistical data of the public numbers and/or applets over which a plurality of accounts have management authority. This improves the efficiency and speed with which enterprises handle public number and/or applet operations, related product marketing, advertisement delivery, government supervision and the like.

Description

Data acquisition method, device and system
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a data acquisition method, device and system.
Background
With the continuous development of technology, the WeChat application has grown into a mainstream instant messaging tool. The WeChat public platform, built on the WeChat application, is a self-media platform and brand promotion platform with a wide audience. An enterprise, company, organization or individual can apply for and operate a WeChat public number or WeChat applet through the WeChat public platform, and communicate and interact with users through text, pictures, voice, video and other forms. Users can subscribe to the WeChat public numbers or WeChat applets that interest them to obtain relevant information.
Currently, in the field of financial technology (Fintech), some businesses have applied for a vast number of WeChat public numbers and/or WeChat applets. Therefore, how to timely, accurately and comprehensively acquire the statistics data of the WeChat public numbers and/or WeChat applets from the WeChat public platform is of great significance to the operations of the WeChat public numbers and/or WeChat applets, related product marketing, advertisement delivery, government supervision and the like.
Disclosure of Invention
The embodiment of the application provides a data acquisition method, a device and a system, which can realize automatic acquisition of statistical data of a plurality of accounts and reasonably utilize virtual machine resources and network resources.
In a first aspect, an embodiment of the present application provides a data acquisition method.
The method comprises the following steps:
The first partition module receives a first request, where the first request triggers the first partition module to collect statistical data of the public numbers and/or applets over which a first account and a second account have management authority, the first account being different from the second account.
The first partition module sends a second request to the second partition module in response to receiving the first request.
The first partition module receives a third request from the second partition module. The third request carries a first task list and first authentication information corresponding to the first account, and a second task list and second authentication information corresponding to the second account. Each subtask in the first task list is a public number or applet over which the first account has management authority, and each subtask in the second task list is a public number or applet over which the second account has management authority. The second partition module acquires the first task list, the first authentication information, the second task list and the second authentication information from a third party platform, in response to receiving the second request, after the first account and the second account have logged in.
The first partition module sends a fourth request to the third partition module. The fourth request is obtained based on the first task list, the first authentication information, the second task list and the second authentication information, and notifies the third partition module to collect the statistical data of all subtasks in a target task list. Every subtask in the target task list belongs to the first task list or the second task list, and the total number of subtasks in the target task list is less than or equal to the total number of idle requests corresponding to the idle network segment containers in the third partition module. When the target task lists sent so far have not covered every subtask in the first task list and the second task list, the first partition module updates the target task list to subtasks of the first task list and the second task list that have not yet appeared in a target task list, again numbering no more than the total number of idle requests corresponding to the idle network segment containers, and sends a further fourth request. Once the target task lists have covered every subtask in the first task list and the second task list, the first partition module stops sending fourth requests to the third partition module.
The first partition module receives a fifth request from the third partition module. The fifth request carries the statistical data of all subtasks in the target task list, which the third partition module obtains from the third party platform by traversing all subtasks in the target task list in response to receiving each fourth request.
After determining that the statistical data of all subtasks in the first task list and the second task list has been collected, the first partition module aggregates that statistical data.
With the method provided by the first aspect, the first partition module obtains, by scheduling the second partition module, the authentication information needed to request an account's statistical data from the third party platform, and obtains, by scheduling the third partition module, the statistical data of a plurality of accounts from the third party platform. The first partition module can therefore collect the statistical data of a plurality of accounts automatically, which improves the efficiency and speed with which enterprises handle public number and/or applet operations, related product marketing, advertisement delivery, government supervision and the like, and reduces the cost to enterprises of obtaining the statistical data.
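The batching behaviour of the first aspect can be illustrated with a minimal sketch. It is not the patented implementation: the names `collect_all`, `idle_requests` and `send_fourth_request` are hypothetical, the third partition module is replaced by a fake function, and the return value of `send_fourth_request` stands in for the fifth request.

```python
from typing import Callable, Dict, List, Tuple

Subtask = Tuple[str, str]  # (account, name of a public number or applet)


def collect_all(
    task_lists: Dict[str, List[str]],
    idle_requests: Callable[[], int],
    send_fourth_request: Callable[[List[Subtask]], Dict[Subtask, dict]],
) -> Dict[str, Dict[str, dict]]:
    """Send target task lists until every subtask has been covered."""
    pending: List[Subtask] = [
        (account, name) for account, names in task_lists.items() for name in names
    ]
    summary: Dict[str, Dict[str, dict]] = {account: {} for account in task_lists}
    while pending:
        capacity = idle_requests()
        if capacity <= 0:
            # Assumption: a real scheduler would wait for a container to free up.
            raise RuntimeError("no idle network segment container")
        # The target task list never exceeds the idle request total.
        target, pending = pending[:capacity], pending[capacity:]
        # The returned mapping plays the role of the fifth request.
        for (account, name), stats in send_fourth_request(target).items():
            summary[account][name] = stats
    return summary


sent_batches: List[List[Subtask]] = []


def fake_third_partition(target: List[Subtask]) -> Dict[Subtask, dict]:
    """Hypothetical stand-in that records each batch and returns dummy statistics."""
    sent_batches.append(target)
    return {subtask: {"followers": len(subtask[1])} for subtask in target}


summary = collect_all(
    {"account_a": ["oa_1", "oa_2", "mini_1"], "account_b": ["oa_3", "mini_2"]},
    idle_requests=lambda: 2,
    send_fourth_request=fake_third_partition,
)
```

With five subtasks and two idle requests per round, the sketch sends three target task lists, and a single list may mix subtasks of both accounts.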
In one possible design, the method specifically includes: when an idle network segment container appears in the third partition module, the first partition module takes a target statistical task out of a priority queue. The priority queue is initially constructed from the statistical task corresponding to the first task list and the statistical task corresponding to the second task list, and the target statistical task is initially the statistical task corresponding to the first task list or the statistical task corresponding to the second task list. When the total number of idle requests corresponding to the idle network segment container is greater than or equal to the total number of subtasks of the target statistical task, the first partition module takes the task list corresponding to the target statistical task as the target task list. Alternatively, when the total number of idle requests corresponding to the idle network segment container is smaller than the total number of subtasks of the target statistical task, the first partition module takes as the target task list a number of subtasks of the target statistical task equal to that total number of idle requests, forms a new statistical task from the remaining subtasks of the target statistical task, and adds the new statistical task to the priority queue. The first partition module keeps sending fourth requests to the third partition module until no statistical task remains in the priority queue, and then stops.
In this way, the first partition module takes the idle state of the network segment containers in the third partition module into account. As soon as a network segment container becomes idle, it is given as many subtasks as it has idle requests, so that the idle network segment container is used to its full capacity and its virtual machine resources are not wasted.
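The split-and-requeue step described in this design can be sketched with Python's `heapq`. This is only an illustration under stated assumptions: `take_target_list` is a hypothetical name, weights are negated because `heapq` is a min-heap, and the leftover statistical task is assumed to keep the weight of the task it was split from, which the text does not specify.

```python
import heapq
import itertools
from typing import Iterator, List, Tuple

# Heap entry: (negated weight, tie-breaker, account, remaining subtasks).
Entry = Tuple[float, int, str, List[str]]


def take_target_list(
    queue: List[Entry], idle_requests: int, tie: Iterator[int]
) -> Tuple[str, List[str]]:
    """Pop the heaviest statistical task and cut it to the idle capacity."""
    if idle_requests < 1:
        raise ValueError("an idle network segment container is required")
    neg_weight, _, account, subtasks = heapq.heappop(queue)
    if idle_requests >= len(subtasks):
        return account, subtasks  # the whole task fits
    batch, rest = subtasks[:idle_requests], subtasks[idle_requests:]
    # Assumption: the new statistical task inherits the original weight.
    heapq.heappush(queue, (neg_weight, next(tie), account, rest))
    return account, batch


tie = itertools.count()
queue: List[Entry] = []
heapq.heappush(queue, (-5.0, next(tie), "account_a", ["a1", "a2", "a3"]))
heapq.heappush(queue, (-2.0, next(tie), "account_b", ["b1", "b2"]))

dispatched = []
while queue:  # stop once no statistical task remains
    dispatched.append(take_target_list(queue, 2, tie))
```

The tie-breaker counter keeps heap entries comparable when two weights are equal, so the subtask lists themselves are never compared.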
In one possible design, the method further comprises: the first partition module determines that the statistical task with the largest weight value in the priority queue is a target statistical task. Therefore, the first partition module can process the statistical task with the largest weight value preferentially so as to acquire the statistical data of the corresponding account in time.
In one possible design, the weight value of a statistical task is related to the priority of an account corresponding to the statistical task, the queuing order of an account corresponding to the statistical task, and the total subtask amount of an account corresponding to the statistical task. Thus, the weight value of a statistical task can be set based on the omnibearing consideration of the statistical task.
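The text names three inputs to the weight value but gives no formula. The sketch below is therefore purely an assumption: a linear combination in which a higher account priority raises the weight, while a later queuing position and a larger subtask total lower it. The function name, the coefficients and the sign of each term are all hypothetical.

```python
def task_weight(
    priority: int,
    queue_position: int,
    subtask_total: int,
    w_priority: float = 10.0,
    w_order: float = 1.0,
    w_size: float = 0.1,
) -> float:
    """Hypothetical weight: favours high priority, early arrival, small tasks."""
    return w_priority * priority - w_order * queue_position - w_size * subtask_total


# Account priority dominates with the default coefficients.
high = task_weight(priority=2, queue_position=3, subtask_total=50)
low = task_weight(priority=1, queue_position=0, subtask_total=5)
```

Other combinations, for example favouring large tasks so that they start earlier, would be equally consistent with the description.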
In one possible design, a first partition module receives a first request, including: the first partition module receives a first request from a device that manages an account. Or the first partition module receives a first request sent according to a preset period from equipment to which the first partition module belongs. Thus, the first partition module is provided with a plurality of possibilities for receiving the first request.
In one possible design, when the first request carries the first account and the name of the first subtask corresponding to the first account, the subtasks in the first task list include subtasks corresponding to the names of the first subtasks. Or when the first request carries the first account, the subtasks in the first task list comprise public numbers and applets with management authorities preset in the first account.
In one possible design, when the first request carries the second account and the name of the second subtask corresponding to the second account, the subtasks in the second task list include subtasks corresponding to the names of the second subtasks. Or when the first request carries the second account, subtasks in the second task list comprise public numbers and applets with management authorities preset in the second account.
Thus, through the first request, the administrator may tell the first partition module which public numbers and/or applets in an account need statistical data collected; that is, the first partition module may determine the public numbers and/or applets according to the administrator's wishes. Alternatively, the first partition module may use a default setting and collect the statistical data of all public numbers and applets in the account without the administrator specifying them; that is, the first partition module may determine the public numbers and/or applets based on the default setting.
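The two ways of determining a task list can be sketched as follows. The function name `resolve_subtasks` and the sample names are hypothetical; `managed` stands for the public numbers and applets over which the account has management authority.

```python
from typing import List, Optional


def resolve_subtasks(requested_names: Optional[List[str]], managed: List[str]) -> List[str]:
    """Build an account's task list from the first request.

    If the request names specific subtasks, keep only those the account manages;
    otherwise fall back to every managed public number and applet.
    """
    if requested_names:
        wanted = set(requested_names)
        return [name for name in managed if name in wanted]
    return list(managed)


managed = ["oa_news", "oa_loans", "mini_pay"]
explicit = resolve_subtasks(["mini_pay", "oa_news"], managed)
default = resolve_subtasks(None, managed)
```

Filtering against `managed` is a design choice of this sketch: a name the account has no authority over is silently dropped rather than reported.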
In one possible design, the first partition module sends a statistical response to the device managing the account, where the statistical response carries statistics of all subtasks in the first task list and/or statistics of all subtasks in the second task list. Therefore, the interactivity between the first partition module and the administrator is improved, so that the administrator can timely acquire the statistical data of a plurality of accounts, and the subsequent processing of the administrator is facilitated.
In a second aspect, an embodiment of the present application provides a data acquisition device, which is applied to a first partition module.
The device comprises:
the receiving module is used for receiving a first request, the first request is used for triggering the first partition module to collect statistics data of public numbers and/or applets with management authorities in the first account and the second account, and the first account is different from the second account.
And the sending module is used for responding to the received first request and sending a second request to the second partition module.
The receiving module is further configured to receive a third request from the second partition module, where the third request carries a first task list and first authentication information corresponding to the first account, and a second task list and second authentication information corresponding to the second account, each subtask in the first task list is a public number or an applet of the first account having management authority, each subtask in the second task list is a public number or an applet of the second account having management authority, and the first task list, the first authentication information, the second task list and the second authentication information are acquired from the third party platform after the first account and the second account have been logged in by the second partition module in response to receiving the second request.
The sending module is further configured to send a fourth request to the third partition module. The fourth request is obtained based on the first task list, the first authentication information, the second task list and the second authentication information, and notifies the third partition module to collect the statistical data of all subtasks in a target task list. Every subtask in the target task list belongs to the first task list or the second task list, and the total number of subtasks in the target task list is less than or equal to the total number of idle requests corresponding to the idle network segment containers in the third partition module. When the target task lists sent so far have not covered every subtask in the first task list and the second task list, the sending module updates the target task list to subtasks of the first task list and the second task list that have not yet appeared in a target task list, again numbering no more than the total number of idle requests corresponding to the idle network segment containers, and stops sending fourth requests to the third partition module once the target task lists have covered every subtask in the first task list and the second task list.
The receiving module is further configured to receive a fifth request from the third partition module, where the fifth request carries the statistical data of all subtasks in the target task list, which the third partition module obtains from the third party platform by traversing all subtasks in the target task list in response to receiving each fourth request.
And the summarizing module is used for summarizing the statistical data of all the subtasks in the first task list and the second task list after the statistical data of all the subtasks in the first task list and the second task list are determined to be collected.
In one possible design, the sending module is specifically configured to take out a target statistical task from the priority queue when an idle network segment container appears in the third partition module, where an initial state of the priority queue is constructed by a statistical task corresponding to the first task list and a statistical task corresponding to the second task list, and an initial state of the target statistical task is a statistical task corresponding to the first task list or a statistical task corresponding to the second task list;
the sending module is further specifically configured to determine, when the total amount of idle requests corresponding to the idle network segment containers is greater than or equal to the total amount of subtasks of the target statistical task, that a task list corresponding to the target statistical task is a target task list;
Or the sending module is further specifically configured to determine that, when the total amount of idle requests corresponding to the idle network segment container is smaller than the total amount of subtasks of the target statistical task, the same number of subtasks as the total amount of idle requests corresponding to the idle network segment container in the target statistical task are the target task list, and form a new statistical task from the remaining subtasks of the target statistical task except for the same number of subtasks as the total amount of idle requests corresponding to the idle network segment container, and add the new statistical task to the priority queue;
the sending module is further specifically configured to send a fourth request to the third partition module, until no statistical task exists in the priority queue, and stop sending the fourth request to the third partition module.
In one possible design, the apparatus further comprises: and the determining module is used for determining the statistical task with the largest weight value in the priority queue as the target statistical task.
In one possible design, the weight value of a statistical task is related to the priority of an account corresponding to the statistical task, the queuing order of an account corresponding to the statistical task, and the total subtask amount of an account corresponding to the statistical task.
In one possible design, the receiving module is specifically configured to receive a first request from a device that manages an account, or to receive a first request sent according to a preset period from the device to which the first partition module belongs.
In one possible design, when the first request carries the first account and the name of the first subtask corresponding to the first account, the subtasks in the first task list include subtasks corresponding to the names of the first subtasks. Or when the first request carries the first account, the subtasks in the first task list comprise public numbers and applets with management authorities preset in the first account.
In one possible design, when the first request carries the second account and the name of the second subtask corresponding to the second account, the subtasks in the second task list include subtasks corresponding to the names of the second subtasks. Or when the first request carries the second account, subtasks in the second task list comprise public numbers and applets with management authorities preset in the second account.
In one possible design, the sending module is further configured to send a statistical response to the device for managing accounts, where the statistical response carries statistical data of all subtasks in the first task list and/or statistical data of all subtasks in the second task list.
For the advantages of the data acquisition device provided by the second aspect and its possible designs, refer to the advantages of the first aspect and its possible designs, which are not repeated here.
In a third aspect, an embodiment of the present application provides a data acquisition system. The data acquisition system comprises a first partition module, a second partition module and a third partition module, which the data acquisition system obtains by partitioning its virtual machine resources;
The first partition module is used for receiving a first request, the first request is used for triggering the first partition module to collect statistics data of public numbers and/or applets with management rights in the first account and the second account, and the first account is different from the second account.
The first partition module is further configured to send a second request to the second partition module in response to receiving the first request.
And the second partitioning module is used for responding to the second request and sending a sixth request to the third party platform after the first account number and the second account number are logged in.
The second partition module is further configured to receive a seventh request from the third party platform, where the seventh request carries a first task list and first authentication information corresponding to the first account, and a second task list and second authentication information corresponding to the second account, each subtask in the first task list is a public number or an applet of the first account having management authority, and each subtask in the second task list is a public number or an applet of the second account having management authority.
The second partition module is further configured to send a third request to the first partition module in response to receiving the seventh request, where the third request carries the first task list, the first authentication information, the second task list and the second authentication information.
The first partition module is further configured to send a fourth request to the third partition module. The fourth request is obtained based on the first task list, the first authentication information, the second task list and the second authentication information, and notifies the third partition module to collect the statistical data of all subtasks in the target task list. Every subtask in the target task list belongs to the first task list or the second task list, and the total number of subtasks in the target task list is less than or equal to the total number of idle requests corresponding to the idle network segment containers in the third partition module. When the target task lists sent so far have not covered every subtask in the first task list and the second task list, the first partition module updates the target task list to subtasks of the first task list and the second task list that have not yet appeared in a target task list, again numbering no more than the total number of idle requests corresponding to the idle network segment containers in the third partition module, and stops sending fourth requests to the third partition module once the target task lists have covered every subtask in the first task list and the second task list.
And the third partitioning module is used for traversing all the subtasks in the target task list in response to receiving each fourth request and acquiring the statistical data of all the subtasks in the target task list from the third party platform.
The third partition module is further configured to send a fifth request to the first partition module, where the fifth request carries statistics data of all subtasks in the target task list.
The first partitioning module is further configured to aggregate the statistics data of all the subtasks in the first task list and the second task list after determining that the statistics data of all the subtasks in the first task list and the second task list are collected based on the fifth request.
With the system provided in the third aspect, virtual machine resources of the data acquisition system may be divided into a first partition module (i.e., a dispatch center mentioned below), a second partition module (i.e., a WeChat login area mentioned below), and a third partition module (i.e., a data request area mentioned below). Therefore, based on the partition processing of the virtual machine resources of the data acquisition system, the virtual machine resources of different partitions correspond to different functions in the statistical data acquisition process, the virtual machine resources in the statistical data acquisition process are balanced, and the virtual machine resources are reasonably utilized.
In one possible design, when the first network segment container and the second network segment container in the third partition module are deployed in multiple network segments of different operators,
The first network segment container is used for traversing partial subtasks in the target task list in response to receiving the fourth request and acquiring statistical data of the partial subtasks in the target task list from the third party platform;
the first network segment container is further configured to send a first sub-request to the first partition module, where the first sub-request carries statistics data of part of sub-tasks in the target task list;
the second network segment container is used for traversing the rest subtasks in the target task list in response to receiving the fourth request and acquiring the statistical data of the rest subtasks in the target task list from the third party platform;
the second network segment container is further configured to send a second sub-request to the first partition module, where the second sub-request carries statistics data of remaining sub-tasks in the target task list;
wherein the first sub-request and the second sub-request constitute a fifth request.
Therefore, the plurality of network segment containers in the third partition module are deployed in a plurality of network segments of different operators, so that virtual machine resources and network resources are reasonably utilized. This avoids both the blocking of an IP range by a security mechanism when a large number of requests come from the same network segment container, and the waste of virtual machine resources and network resources caused by piling a large number of statistical tasks onto the same network segment container or the same virtual machine. The statistical data of a plurality of WeChat accounts can thus be acquired in a timely, comprehensive and accurate manner.
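One way to divide a target task list between network segment containers, and to merge their sub-requests back into a fifth request, is sketched below. The allocation rule (fill each container up to its idle request count, in order) is an assumption, as are the names `split_across_containers`, `segment_a` and `segment_b`.

```python
from typing import Dict, List


def split_across_containers(
    target_list: List[str], idle_by_container: Dict[str, int]
) -> Dict[str, List[str]]:
    """Give each container at most as many subtasks as it has idle requests."""
    assignment: Dict[str, List[str]] = {}
    start = 0
    for container, idle in idle_by_container.items():
        share = target_list[start:start + max(idle, 0)]
        if share:
            assignment[container] = share
        start += len(share)
    if start < len(target_list):
        raise ValueError("target task list exceeds the idle request total")
    return assignment


assignment = split_across_containers(
    ["oa_1", "oa_2", "mini_1"], {"segment_a": 2, "segment_b": 2}
)

# Each container answers with a sub-request; together they form the fifth request.
sub_requests = [
    {name: {"container": container} for name in names}
    for container, names in assignment.items()
]
fifth_request: Dict[str, dict] = {}
for sub_request in sub_requests:
    fifth_request.update(sub_request)
```

Because each subtask is assigned to exactly one container, merging the sub-requests with `update` cannot overwrite any statistics.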
In one possible design, the second partition module is further configured to send, when the first account and/or the second account are not logged in, an eighth request to the device for managing accounts, where the eighth request carries login information of the account that is not logged in, and the eighth request is used to remind that the first account and/or the second account are not logged in.
Therefore, when the second partition module receives the second request from the first partition module, the login condition of the first account and the second account can be detected, so that timely acquisition of the first task list and the first authentication information corresponding to the first account and the second task list and the second authentication information corresponding to the second account is facilitated, and the fault tolerance and the availability of the statistical data of the plurality of accounts are improved.
In one possible design, the specific process by which the third partition module obtains the statistical data of one subtask in the target task list from the third party platform is as follows:
The third partition module is specifically configured to send a ninth request to the third party platform, where the ninth request carries an identifier of a subtask and authentication information corresponding to the subtask, and the ninth request is used to request statistical data of the subtask; the third partition module is further specifically configured to receive a tenth request from the third party platform, where the tenth request carries statistics data of one subtask.
Thus, upon receiving the fourth request from the first partition module, the third partition module may traverse each subtask in the plurality of accounts to obtain the corresponding statistics from the third party platform.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a memory and a processor; the memory is used for storing program instructions; the processor is configured to invoke the program instructions in the memory to cause the electronic device to perform the data collection method of the first aspect and any of the possible designs of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the data acquisition method of the first aspect and any one of the possible designs of the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product which, when run on a computer, causes the computer to perform the data acquisition method of the first aspect and any one of the possible designs of the first aspect.
In a seventh aspect, an embodiment of the present application provides a chip system, including: a processor; the electronic device performs the data collection method of the first aspect and any of the possible designs of the first aspect when the processor executes computer instructions stored in the memory.
Drawings
FIG. 1 is a schematic flow chart of a WeChat public number data acquisition method;
FIG. 2 is a schematic diagram of virtual machine resources of a data acquisition system according to an embodiment of the present application;
FIG. 3 is a signaling interaction schematic diagram of a data acquisition method according to an embodiment of the present application;
FIG. 4 is a flowchart of a data collection method according to an embodiment of the present application;
FIG. 5 is a flowchart of a data acquisition method according to an embodiment of the present application;
FIG. 6 is a flowchart of a data acquisition method according to an embodiment of the present application;
FIG. 7 is a flowchart of a data collection method according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a data acquisition device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a data acquisition device according to an embodiment of the present application.
Detailed Description
Referring to fig. 1, fig. 1 is a flow chart of a method for collecting WeChat public number data. As shown in fig. 1, the method for collecting WeChat public number data comprises the following steps:
Step 1, collecting WeChat public account numbers needing to be concerned;
Step 2, logging in more than one WeChat account (shown as WeChat account 1, WeChat account 2 and WeChat account 3 in fig. 1) on a virtual machine at the same time, and adding the collected WeChat public number accounts to the task queues of the WeChat accounts, so that the virtual machine follows them through key-macro simulation operations;
Step 3, the virtual machine actively requests the WeChat public number data by clicking the WeChat public number through key-macro simulation operations, and while the virtual machine exchanges data with the Internet, the WeChat public number interaction data packets exchanged between the virtual machine and the Internet are monitored and downloaded;
Step 4, scanning the WeChat public number interaction data packets on the virtual machine and extracting the required WeChat related information;
Step 5, parsing the WeChat public number index information from the extracted WeChat related information, and accessing the WeChat public number index information to obtain the specific WeChat information of the WeChat public number.
In summary, the method for collecting the WeChat public number data shown in fig. 1 collects the content and the reading number of the WeChat public number by means of packet capturing, and predicts the number of users concerned about the WeChat public number according to the reading number. Typically, the number of users of interest is about 15 times the reading number.
However, the WeChat public number data acquisition method shown in FIG. 1 has the following problems:
Problem 1, scalability and resource utilization are poor. When the WeChat client and the application program interface (API) of the WeChat public platform limit the request frequency, it is difficult to add virtual machines to cope with the statistical tasks corresponding to a plurality of WeChat accounts:
A. The WeChat client limits the number of processes corresponding to the WeChat application program and only allows one process corresponding to the WeChat application program to be opened. Every time a WeChat account number needing to be counted is added, a virtual machine is needed to be added. The statistics tasks corresponding to the single WeChat account numbers are concentrated on one virtual machine, the WeChat account numbers with more statistics tasks need to be executed for a long time, and the WeChat account numbers with fewer statistics tasks also need to occupy one virtual machine.
B. The API of the WeChat public platform may limit the number of requests and the Internet Protocol (IP) segments from which requests are made. In general, requests from the same IP segment are denied if their volume is too large. If a large number of virtual machines are placed in the same IP segment, that IP segment will inevitably be blocked or forced to wait for a long time, resulting in a large number of statistical tasks being blocked.
Problem 2, data cannot be accurately acquired. The number of users of interest of a WeChat public number can only be estimated from the reading number. For WeChat public numbers whose user population is relatively stable, accurate data cannot be acquired. In addition, statistics of WeChat applets cannot be obtained.
In order to solve the above problems, embodiments of the present application provide a data collection method, device, system, and computer storage medium. The data collection method is executed by a data collection system (e.g., composed of virtual machine resources provided by one or more servers) and can be applied to fields such as financial technology. Based on a reasonable configuration of the virtual machine resources and network resources in the data collection system, statistical data of a plurality of WeChat accounts can be collected automatically. This helps improve the processing efficiency and processing speed of enterprises in aspects such as the operation of WeChat public numbers and/or WeChat applets, related product marketing, advertisement delivery, and government supervision, and reduces the cost pressure on enterprises of obtaining statistical data.
The WeChat accounts mentioned in the embodiment of the application can be represented by numbers, letters, characters, symbols and the like. In addition, the application does not limit the specific content of the statistics data of a WeChat account. For example, statistics of WeChat accounts may include statistics of WeChat public numbers and statistics of WeChat applets. Statistics of WeChat public numbers may include, but are not limited to, the number of users of interest, the reading amount of articles, and the like. Statistics of WeChat applets may include, but are not limited to, page views (PV), unique visitors (UV), number of users, etc.
Referring to fig. 2, fig. 2 is a schematic diagram of virtual machine resources of a data acquisition system according to an embodiment of the application. As shown in fig. 2, the data acquisition system may partition virtual machine resources of the data acquisition system such that the data acquisition system may include: the first partition module (i.e. a dispatch center), the second partition module (i.e. a WeChat login area) and the third partition module (i.e. a data request area).
1. WeChat login area
The virtual machines in the WeChat login area are used for logging in the WeChat accounts whose statistical data are needed, and a client service is built into each virtual machine in the WeChat login area. The client service can log in a WeChat account through robotic process automation (RPA), operate the WeChat account to access the WeChat public number data assistant or the WeChat applet data assistant, trigger the WeChat public number data assistant or the WeChat applet data assistant to initiate a request to the API of the WeChat public platform, capture the authentication information corresponding to the WeChat account through proxy request software, and return the captured authentication information to the dispatching center.
2. Data request area
The data request area deploys containers (referred to here as network segment containers) into multiple network segments of different operators. A plurality of request services are built into the network segment containers in the data request area, and each request service corresponds to one network segment container, so that the request services can be automatically scaled to adapt to statistical tasks of different sizes. Each request service can initiate a request to the API of the WeChat public platform by using the authentication information captured by the WeChat login area, pull the statistical data of each WeChat account in batches, and return the statistical data of each WeChat account to the dispatching center.
For ease of illustration, in fig. 2, the data request area is illustrated with a plurality of segment containers located in segment a and a plurality of segment containers located in segment B. The network segment A and the network segment B belong to different network segments, and the network segment container corresponding to the network segment A is different from the network segment container corresponding to the network segment B.
3. Dispatching center
The dispatching center is divided into two parts of a management platform and a dispatching service. The management platform is used for providing an administrator with registration and management of WeChat account numbers, administrators and other information. The dispatch service mainly provides the following three functions:
1. Reasonably distributing and arranging the statistical tasks corresponding to the WeChat accounts.
The scheduling service can calculate the total idle request amount of each network segment container per unit time according to the idle condition of each network segment container in the data request area, and allocate the statistical tasks corresponding to one or more WeChat accounts to the network segment container with the larger idle request amount.
2. Calculating the priority order of the WeChat accounts.
The scheduling service can construct a priority queue, place the statistical tasks to be executed into the queue, and, when an idle network segment container appears, take out the corresponding statistical tasks for distribution according to the priority order of the plurality of WeChat accounts.
The specific process of the priority queue algorithm is as follows: the scheduling service can maintain the priority queue by using a heap structure, with the queue elements stored in an array. On each dequeue, the largest element is taken out from the head of the array for allocation of the statistical task; on each enqueue, the element is added at the tail of the array, after which the heap structure must be adjusted according to the inserted element.
Therefore, the scheduling service can combine the priority queue algorithm with a max-heap to calculate the weight value of the statistical task corresponding to each WeChat account, so that each statistical task is moved to the corresponding position based on its weight value when it is enqueued, ensuring that the statistical task with the maximum weight value is located at the top of the priority queue.
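The heap-backed priority queue described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `StatTaskQueue`, the account labels and the weight values are hypothetical, and the application does not specify how weights are computed. Python's `heapq` module implements a min-heap, so the weight is negated to obtain max-heap behaviour.

```python
import heapq
import itertools


class StatTaskQueue:
    """Priority queue of statistical tasks; the largest weight is dequeued first."""

    def __init__(self):
        self._heap = []                    # array that stores the heap
        self._counter = itertools.count()  # tie-breaker: equal weights keep insertion order

    def push(self, account, weight):
        # Enqueue: append at the tail, then heapq sifts the element to its position.
        heapq.heappush(self._heap, (-weight, next(self._counter), account))

    def pop(self):
        # Dequeue: remove the element at the head, i.e. the one with the largest weight.
        neg_weight, _, account = heapq.heappop(self._heap)
        return account, -neg_weight

    def __len__(self):
        return len(self._heap)


queue = StatTaskQueue()
queue.push("account_1", weight=3)
queue.push("account_2", weight=7)
queue.push("account_3", weight=5)
order = [queue.pop()[0] for _ in range(len(queue))]
# order is ["account_2", "account_3", "account_1"]
```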
For example, where the first and second segment containers in the third partition module are deployed in multiple segments of different operators, the dispatch service may assign all sub-tasks in the target task list to the first and second segment containers.
The allocation process can be flexibly allocated according to preset configuration or actual conditions. For example, when the total amount of idle requests corresponding to the first segment container is greater than the total amount of idle requests corresponding to the second segment container, the scheduling service may assign a majority of the subtasks in the target task list to the first segment container, and the scheduling service may assign a minority of the subtasks in the target task list to the second segment container.
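One possible reading of this allocation is a split in proportion to each container's total amount of idle requests. The sketch below assumes that proportional rule, which the application leaves open; the function name `allocate_by_idle` and the container labels are hypothetical.

```python
def allocate_by_idle(subtasks, idle_totals):
    """Split subtasks among containers in proportion to their idle request totals.

    idle_totals maps a container name to its total amount of idle requests.
    The last container receives whatever remains after integer division.
    """
    total_idle = sum(idle_totals.values())
    if total_idle <= 0:
        raise ValueError("no idle request capacity available")
    names = list(idle_totals)
    allocation = {}
    start = 0
    for index, name in enumerate(names):
        if index == len(names) - 1:
            end = len(subtasks)
        else:
            end = start + len(subtasks) * idle_totals[name] // total_idle
        allocation[name] = subtasks[start:end]
        start = end
    return allocation


subtasks = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"]
allocation = allocate_by_idle(subtasks, {"first": 30, "second": 10})
# The first container, with three times the idle capacity, receives 6 of the 8 subtasks.
```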
3. Initiating statistical tasks according to the configuration of the management platform, monitoring and recording the execution condition of the statistical task corresponding to each WeChat account, and sorting and summarizing the statistical data of each WeChat account.
In the embodiment of the application, the virtual machine resources of the data acquisition system (such as the virtual machine resources provided by one or more servers) are divided into a dispatching center, a WeChat login area and a data request area. The dispatching center can trigger the WeChat login area, which is responsible for logging in each WeChat account and acquiring the task list and authentication information of each WeChat account from the WeChat public platform. The WeChat login area transmits the task list and authentication information of each WeChat account to the dispatching center, and the dispatching center can uniformly manage the task list and authentication information of each WeChat account. The dispatching center can trigger the data request area based on the task condition of each WeChat account and the idle condition of the network segment containers, and the data request area obtains the statistical data of each WeChat account from the WeChat public platform. The network segment containers in the data request area are distributed among a plurality of network segments of different operators. The data request area transmits the statistical data of each WeChat account to the dispatching center. In addition, the WeChat login area can confirm in real time whether a WeChat account has gone offline. Once a WeChat account is offline or not logged in, the WeChat login area can push a login two-dimensional code to an administrator, and the administrator remotely restores the login state of the WeChat account, which improves the availability of the collection of the statistics data of the WeChat accounts.
Next, a specific implementation process of the data acquisition method provided by the embodiment of the present application is described with reference to fig. 3 by using the first partition module, the second partition module, and the third partition module in the data acquisition system shown in fig. 2 as execution bodies.
Referring to fig. 3, fig. 3 is a signaling interaction schematic diagram of a data acquisition method according to an embodiment of the application. As shown in fig. 3, the data acquisition method provided by the embodiment of the present application may include:
S101, a first partition module receives a first request, wherein the first request is used for triggering the first partition module to collect statistics data of public numbers and/or applets with management rights in a first account and a second account, and the first account is different from the second account.
When the first partition module (i.e. the dispatching center) receives the first request, the dispatching center can be triggered to collect the statistics data of public numbers and/or applets with management authorities in the first account and the second account, namely, the statistics tasks corresponding to the first account and the statistics tasks corresponding to the second account are initiated.
The embodiment of the application does not limit the specific content and the sending mode of the first request. The implementation manner of the first account and the second account can be specifically referred to the description of the WeChat account mentioned in the embodiment of the present application, which is not repeated here.
In some embodiments, when the administrator needs to collect statistics of the first account and the second account, the administrator may send a first request to the dispatch center through a device that manages the accounts. The device for managing the account number in the embodiment of the application can be a terminal device or a device forming a data acquisition system. In addition, the device that manages the account number may be one or more devices. When there are multiple devices managing the account, the dispatch center may receive a first request from each device managing the account.
Therefore, the first partition module can determine the public number and/or the applet corresponding to the account number which needs to collect the statistical data based on the wish of the administrator.
In other embodiments, the first partition module may be pre-configured with a software module. The software module may send a first request to the dispatch center according to a preset period, so that the dispatch center periodically collects statistical data of the first account number and the second account number. The specific value of the preset period is not limited in the embodiment of the application.
Therefore, the first partition module can determine the public number and/or the applet corresponding to the account number needing collecting statistical data based on the default setting.
S102, the first partition module sends a second request to the second partition module in response to receiving the first request.
S103, the second partition module, in response to the second request, sends a sixth request to the third party platform when the first account and the second account are logged in.
The first partition module (i.e., the dispatch center) may send a second request to the second partition module (i.e., the WeChat login area) after receiving the first request. After receiving the second request, the WeChat login area can judge whether the first account and the second account are logged in, namely determine the login states of the first account and the second account. When the first account and/or the second account are not logged in, the WeChat login area can create a virtual machine, install a client, configure a network environment, and log in the accounts that are not logged in (such as the first account and/or the second account) on the client through RPA operations. In addition, the WeChat login area can record the login state of each account, so that the login state of each account can be judged subsequently.
The way in which the WeChat login area logs in an account on the client is not limited. In some embodiments, when the first account and/or the second account are not logged in, the WeChat login area may send an eighth request to the device for managing accounts, where the eighth request carries login information of the account that is not logged in, and the eighth request is used to remind that the first account and/or the second account are not logged in. Therefore, before the data acquisition method provided by the embodiment of the application is executed, the WeChat login area can detect the login condition of each account based on the login states it has stored, which improves the fault tolerance and the availability of the data acquisition method provided by the embodiment of the application.
The embodiment of the application does not limit the specific implementation manner of the login information and the eighth request. For example, the login information may be a login two-dimensional code picture corresponding to the account. An administrator can scan the login two-dimensional code picture through the device for managing accounts. After the two-dimensional code picture is successfully scanned, the device for managing accounts can send a login response to the WeChat login area, so that the WeChat login area knows that the account that was not logged in has now logged in successfully.
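The login check and the eighth request can be sketched as follows, under the assumption that the WeChat login area keeps a simple per-account login flag. The class `LoginArea`, the dictionary layout of the eighth request and the placeholder `login_info` string are hypothetical; a real system would carry an actual login two-dimensional code picture.

```python
from dataclasses import dataclass, field


@dataclass
class LoginArea:
    # Recorded login state: account -> True if currently logged in.
    logged_in: dict = field(default_factory=dict)

    def handle_second_request(self, accounts):
        """Return one eighth request per account that is not logged in.

        An empty list means every account is logged in and the sixth
        request can be sent to the third party platform.
        """
        reminders = []
        for account in accounts:
            if not self.logged_in.get(account, False):
                reminders.append({
                    "type": "eighth_request",
                    "account": account,
                    # Placeholder for the login two-dimensional code picture.
                    "login_info": "login-code-for-" + account,
                })
        return reminders

    def on_login_response(self, account):
        # The device for managing accounts reports a successful scan.
        self.logged_in[account] = True


area = LoginArea(logged_in={"first_account": True})
pending = area.handle_second_request(["first_account", "second_account"])
area.on_login_response("second_account")
after_login = area.handle_second_request(["first_account", "second_account"])
```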
Therefore, when the WeChat login area is logged in the first account and the second account, and the second request indicates that the statistics data of the public numbers with the management authorities in the first account and the second account are collected, the WeChat login area can send a sixth request to the third party platform through the public number data assistant by the WeChat client.
Or when the WeChat login area is logged in the first account and the second account, and the second request indicates that the statistics data of the applet having the management authority in the first account and the second account are collected, the WeChat login area can send a sixth request to the third party platform through the applet data assistant by the WeChat client.
Or when the WeChat login area is logged in the first account and the second account, and the second request indicates that the statistics data of the public numbers and the applets with the management authorities in the first account and the second account are collected, the WeChat login area can send a sixth request to the third party platform through the public number data assistant and the applet data assistant by the WeChat client.
The embodiment of the present application does not limit the specific implementation manner of the sixth request. The implementation manner of the third party platform can be specifically referred to the description of the WeChat public platform mentioned in the embodiment of the present application, and will not be described herein.
S104, the second partition module receives a seventh request from the third party platform.
Based on step S103, the third party platform carries the first task list and the first authentication information corresponding to the first account number, and the second task list and the second authentication information corresponding to the second account number in the seventh request based on the sixth request, and the third party platform may send the seventh request to the WeChat login area, so that the WeChat login area obtains the first task list and the first authentication information corresponding to the first account number, and the second task list and the second authentication information corresponding to the second account number.
Each subtask in the first task list is a public number or an applet for which the first account has management authority. Each subtask in the second task list is a public number or an applet for which the second account has management authority. Typically, the first task list and the second task list store unique identifications of the subtasks.
The first authentication information is a ticket required by the third party platform for authenticating a request carrying a first task list corresponding to the first account number when the data request area accesses an API of the third party platform. The second authentication information is a ticket which is required by the third party platform to authenticate the request carrying the second task list corresponding to the second account number when the data request area accesses the API of the third party platform. Typically, the first authentication information and the second authentication information are a piece of encrypted information.
In addition, the embodiment of the present application does not limit the specific implementation manner of the third request.
In some embodiments, when the first request carries the first account and the identifier of the first subtask corresponding to the first account, the subtasks in the first task list include the subtasks corresponding to the identifier of the first subtask.
The number and types of the subtasks in the first subtask are not limited in the embodiment of the application. And the identification of the first subtask can uniquely identify the subtask in the first subtask, wherein the subtask in the first subtask is a part of the subtasks with the management authority in the first account. In addition, the specific implementation manner of the identification of the first subtask is not limited in the embodiment of the present application.
Correspondingly, when the first request carries the second account and the identifier of the second subtask corresponding to the second account, the subtasks in the second task list include the subtasks corresponding to the identifier of the second subtask.
The number and types of the subtasks in the second subtask are not limited in the embodiment of the application. And, the identifier of the second subtask may uniquely identify a subtask in the second subtask, where the subtasks in the second subtask are a part of the subtasks with management authority in the second account. In addition, the specific implementation manner of the identification of the second subtask is not limited in the embodiment of the application.
In other embodiments, when the first request carries the first account, the subtasks in the first task list include all public numbers and applets with management rights in the first account.
Correspondingly, when the first request carries the second account, the subtasks in the second task list include all public numbers and applets with management authority in the second account.
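The two embodiments above differ only in whether the first request names specific subtasks. A minimal sketch of that selection, assuming subtasks are represented by string identifiers (the function name `build_task_list` and the identifiers are hypothetical):

```python
def build_task_list(managed, requested_ids=None):
    """Build the task list for one account.

    managed: identifiers of all public numbers and applets the account manages.
    requested_ids: identifiers carried in the first request, or None when the
    first request carries only the account.
    """
    if requested_ids is None:
        # Only the account was given: include everything it manages.
        return list(managed)
    wanted = set(requested_ids)
    # Keep only the requested subtasks that the account actually manages.
    return [task_id for task_id in managed if task_id in wanted]


managed_by_first = ["gh_news", "gh_shop", "wx_app"]
subset_list = build_task_list(managed_by_first, ["wx_app", "gh_news"])
full_list = build_task_list(managed_by_first)
```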
S105, the second partition module sends a third request to the first partition module in response to receiving the seventh request.
Based on step S104, the second partition module may obtain the first task list, the first authentication information, the second task list, and the second authentication information. Therefore, after receiving the seventh request, the WeChat login area can carry the first task list and the first authentication information corresponding to the first account, and the second task list and the second authentication information corresponding to the second account in the third request and transmit the third request to the dispatching center.
S106, the first partition module sends a fourth request to the third partition module, wherein the fourth request is used for informing the third partition module to collect statistical data of all subtasks in the target task list; and when the target task list has not traversed all the subtasks in the first task list and the second task list, all the subtasks in the target task list are updated to subtasks in the first task list and the second task list that have not yet appeared in the target task list, in a number less than or equal to the total amount of idle requests corresponding to the idle network segment containers in the third partition module, until the target task list has traversed all the subtasks in the first task list and the second task list, whereupon the first partition module stops sending the fourth request to the third partition module.
Based on the description of step S105, the first partition module (i.e. the scheduling center) may obtain the first task list and the first authentication information corresponding to the first account, and the second task list and the second authentication information corresponding to the second account. And, the dispatching center can determine the total amount of idle requests corresponding to the idle network containers in the third partition module (namely, the data request area).
In summary, the dispatching center may determine a target task list from the first task list or the second task list in combination with the total amount of idle requests corresponding to the idle network segment containers. All subtasks in the target task list belong to the first task list or the second task list, and the total number of subtasks in the target task list is less than or equal to the total amount of idle requests corresponding to the idle network segment containers in the third partition module.
Thus, the dispatching center may send a fourth request to the data request area, informing the data request area to collect statistics of all subtasks in the target task list. The fourth request carries the first task list, the first authentication information, the second task list and the second authentication information, so that the data request area can send to the third party platform, in sequence, requests that the third party platform can successfully authenticate.
After a preset time period, the dispatching center continues to wait for idle network segment containers in the data request area. When the target task list has not traversed all the subtasks in the first task list and the second task list, the dispatching center can update all the subtasks in the target task list to subtasks in the first task list and the second task list that have not yet appeared in the target task list, in a number less than or equal to the total amount of idle requests corresponding to the idle network segment containers in the data request area at the current moment. This continues until the target task list has traversed all the subtasks in the first task list and the second task list, at which point the dispatching center can stop sending the fourth request to the data request area.
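The repeated update of the target task list can be sketched as a batching loop. The sketch assumes that a single target task list may mix subtasks from both task lists and that the idle request total is sampled once per dispatch; the function name `dispatch_batches` and the sample values are hypothetical.

```python
def dispatch_batches(first_list, second_list, idle_totals):
    """Form successive target task lists, one per fourth request.

    idle_totals: idle request totals observed at successive moments.
    Each target task list holds at most as many subtasks as the idle total
    observed at that moment. Returns the batches and any leftover subtasks.
    """
    pending = list(first_list) + list(second_list)
    batches = []
    for idle in idle_totals:
        if not pending:
            break        # every subtask has been traversed: stop sending
        if idle <= 0:
            continue     # no idle network segment container yet: keep waiting
        batches.append(pending[:idle])
        pending = pending[idle:]
    return batches, pending


batches, leftover = dispatch_batches(
    ["a1", "a2", "a3"], ["b1", "b2"], idle_totals=[2, 0, 2, 5]
)
# batches is [["a1", "a2"], ["a3", "b1"], ["b2"]] and leftover is []
```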
S107, the third partition module traverses all the subtasks in the target task list in response to receiving each fourth request, and acquires the statistical data of all the subtasks in the target task list from the third party platform.
Based on step S106, the third partition module (i.e., the data request area) may obtain one or more fourth requests. Thus, the data request area may traverse all the subtasks in the target task list received each time, i.e., traverse all the subtasks in the first task list and the second task list, so that the data request area receives statistics of all the subtasks in the first task list and the second task list from the third party platform.
The specific process of receiving statistical data of a subtask in a target task list from a third party platform by an idle network container in the data request area is as follows:
S1071, the third partition module sends a ninth request to the third party platform, wherein the ninth request carries an identification of one subtask and authentication information corresponding to the subtask, and the ninth request is used for requesting statistical data of the subtask.
S1072, the third partition module receives a tenth request from the third party platform, wherein the tenth request carries statistical data of a subtask.
The data request area may send a ninth request to the third party platform to request the statistical data of one subtask from the third party platform. After the third party platform receives the ninth request, it may authenticate the ninth request, which carries the identification of the subtask, based on the authentication information corresponding to that subtask (such as the first authentication information or the second authentication information), so as to determine whether the ninth request meets the verification condition. When the ninth request meets the verification condition, the third party platform acquires the statistical data of the subtask based on the identification of the subtask, carries the statistical data in a tenth request, and sends the tenth request to the data request area.
The embodiment of the present application is not limited to the specific implementation manner of the ninth request and the tenth request.
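Purely as a sketch of steps S1071-S1072, the exchange can be modelled with two message types and a stand-in for the third party platform. The class names NinthRequest, TenthRequest and FakeThirdPartyPlatform, the field names, and the verification rule (the authentication information must equal the value on record) are all assumptions made for illustration; the real platform's interface is not described in this application.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class NinthRequest:
    subtask_id: str  # identification of one subtask
    auth_info: str   # authentication information for that subtask


@dataclass(frozen=True)
class TenthRequest:
    subtask_id: str
    statistics: Dict[str, int]  # statistical data of the subtask


class FakeThirdPartyPlatform:
    """Hypothetical stand-in for the third party platform."""

    def __init__(self, valid_auth: Dict[str, str], stats: Dict[str, Dict[str, int]]):
        self._valid_auth = valid_auth
        self._stats = stats

    def handle(self, request: NinthRequest) -> Optional[TenthRequest]:
        # Assumed verification condition: the carried authentication
        # information must match the value recorded for the subtask.
        if self._valid_auth.get(request.subtask_id) != request.auth_info:
            return None
        return TenthRequest(request.subtask_id, self._stats[request.subtask_id])
```

A request that fails verification yields no tenth request in this sketch; how the actual platform reports a failed verification is not specified here.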
It should be noted that steps S1071-S1072 are one possible implementation in which the data request area receives the statistical data of one subtask in the target task list from the third party platform. In addition, when the third party platform is able to provide the statistical data of all subtasks in the first task list and the second task list to the data request area at the same time, the data request area may receive the statistical data of all subtasks in the first task list and the second task list from the third party platform together.
S108, the third partition module sends a fifth request to the first partition module, wherein the fifth request carries statistics data of all subtasks in the target task list.
Based on step S107, the data request area may obtain statistics of all subtasks in the first task list and the second task list from the third party platform. Thus, the data request area carries statistics of all subtasks in the first task list and the second task list in a fifth request and sends the fifth request to the dispatch center.
The embodiment of the present application does not limit the specific implementation manner of the fifth request.
For example, the fifth request carries statistics of all subtasks in the first task list and the second task list.
For another example, the fifth request may include a plurality of sub-requests, each sub-request carrying the statistical data of a portion of the subtasks, and the statistical data carried in all of the sub-requests together constituting the statistical data of all subtasks in the first task list and the second task list.
In some embodiments, the first partition module may send a fourth request to the third partition module when the total subtask amount of the target task list is less than or equal to the total idle request amount corresponding to the first network segment container and the second network segment container.
When the first network segment container and the second network segment container in the third partition module are deployed in network segments of different operators, the third partition module may, in response to receiving the fourth request, assign subtasks to the first network segment container and the second network segment container.
The first network segment container may traverse part of the subtasks in the target task list, obtain statistics of the part of the subtasks in the target task list from the third party platform, and send a first sub-request to the first partition module, where the first sub-request carries statistics of the part of the subtasks in the target task list.
The second network segment container may traverse the remaining subtasks in the target task list, obtain statistics of the remaining subtasks in the target task list from the third party platform, and send a second sub-request to the first partition module, where the second sub-request carries statistics of the remaining subtasks in the target task list.
Wherein the first sub-request and the second sub-request constitute a fifth request.
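By way of a hedged example, the division of one target task list between two network segment containers and the recombination of their results can be sketched as below. The helper names split_between_containers and merge_sub_requests, the first_capacity parameter, and the representation of a sub-request as a dictionary keyed by subtask are assumptions for illustration only.

```python
from typing import Dict, List, Tuple


def split_between_containers(
    target_task_list: List[str], first_capacity: int
) -> Tuple[List[str], List[str]]:
    """Give the first container up to first_capacity subtasks, the second the rest."""
    return target_task_list[:first_capacity], target_task_list[first_capacity:]


def merge_sub_requests(
    first_sub_request: Dict[str, dict], second_sub_request: Dict[str, dict]
) -> Dict[str, dict]:
    """Combine the two sub-requests into the payload of one fifth request."""
    overlap = first_sub_request.keys() & second_sub_request.keys()
    if overlap:
        # A subtask should be handled by exactly one container.
        raise ValueError(f"subtasks reported twice: {sorted(overlap)}")
    return {**first_sub_request, **second_sub_request}
```

The overlap check reflects the statement that the second container handles only the remaining subtasks, so no subtask is expected in both sub-requests.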
S109, after determining, based on the fifth request, that the statistical data of all the subtasks in the first task list and the second task list have been collected, the first partition module summarizes the statistical data of all the subtasks in the first task list and the second task list.
After receiving the fifth request, the dispatching center can judge whether the statistics data of all subtasks in the first task list and the second task list are collected. Therefore, after the statistical data of all the subtasks in the first task list and the second task list are determined to be collected, the dispatching center can collect, sort and record the statistical data of all the subtasks in the first task list and the second task list.
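A minimal sketch of this completeness check and summary, under the assumption that collected statistical data are kept in a dictionary keyed by subtask, might look as follows. The function names missing_subtasks and summarize and the grouping of the summary by task list are hypothetical.

```python
from typing import Dict, List, Set


def missing_subtasks(
    first_task_list: List[str], second_task_list: List[str], collected: Dict[str, dict]
) -> Set[str]:
    """Return the subtasks whose statistical data have not been received yet."""
    expected = set(first_task_list) | set(second_task_list)
    return expected - set(collected)


def summarize(
    first_task_list: List[str], second_task_list: List[str], collected: Dict[str, dict]
) -> Dict[str, Dict[str, dict]]:
    """Group the collected statistical data by task list once everything has arrived."""
    missing = missing_subtasks(first_task_list, second_task_list, collected)
    if missing:
        raise ValueError(f"statistics still missing for: {sorted(missing)}")
    return {
        "first_task_list": {task: collected[task] for task in first_task_list},
        "second_task_list": {task: collected[task] for task in second_task_list},
    }
```

In this sketch the summary is refused until every subtask of both task lists has reported, which mirrors the condition stated for step S109.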
S110, the first partition module sends a statistical response to the equipment for managing the account, wherein the statistical response carries statistical data of all subtasks in the first task list and/or statistical data of all subtasks in the second task list.
The first partition module (i.e., the dispatch center) may carry statistics of all subtasks in the first task list and the second task list in a statistics response and send the statistics response to the device managing the account. The statistical data of all subtasks in the first task list and the second task list can be represented in the form of folders, messages or pictures. And, the embodiment of the application does not limit the specific implementation manner of the statistical response.
For example, the scheduling center may send a statistical response to the device that manages the first account based on the first account, so that an administrator can timely obtain statistical data of all subtasks in the first task list when logging in the first account in the device that manages the first account.
It should be noted that step S109 is an optional step.
In summary, the data acquisition method provided by the embodiment of the application can be divided into two stages: a registration phase and a data acquisition phase.
1. Registration phase (or data entry phase)
In connection with fig. 4, a specific implementation of the registration phase is described.
Referring to fig. 4, fig. 4 is a flow chart of a data acquisition method according to an embodiment of the application. As shown in fig. 4, before statistical data is collected by using the data collection method provided by the embodiment of the present application, an administrator needs to register data on a management platform, such as entering a public number and/or name of an applet to be collected and an account number having management authority, which corresponds to step S101.
The dispatch center may notify the WeChat login area to invoke an automated script in the WeChat login area to create a virtual machine, configure a network environment, and install an application and a collection client service. After the virtual machine is created, the network environment is configured, and the application and the collection client service are installed, the application can be started through RPA and the login two-dimensional code picture can be captured. The WeChat login area sends the login two-dimensional code picture to the dispatch center. Based on the account registered by the administrator, the dispatch center sends a request to the device managing the account to remind the administrator to log in to the account.
When the administrator does not agree to log in or the login fails, the WeChat login area may determine that the login has failed and may send a message to the dispatch center indicating that the account cannot be logged in. After the administrator scans the login two-dimensional code picture and the account is successfully logged in, the WeChat login area can receive the corresponding information from the device managing the account and can send a message indicating that the account has been successfully logged in to the dispatch center, so that the dispatch center can determine that the account is logged in.
2. Data collection phase
With reference to fig. 5 and 6, a specific implementation of the data collection phase is described.
Referring to fig. 5 and fig. 6, fig. 5 and fig. 6 are schematic flow diagrams of a data acquisition method according to an embodiment of the application. As shown in fig. 5 and fig. 6, the collection behavior of the statistical data is initiated by the dispatching center, and the management platform can configure the execution time of the statistical task corresponding to each account, so that the situation that a large number of virtual machine resources and network resources are consumed due to accumulation of a large number of statistical tasks in the data request area because all the statistical tasks are executed together in the same time can be avoided.
The scheduling center sends a second request carrying the identification of the first account number and the identification of the second account number to the WeChat login area, which corresponds to step S102. When the first account and the second account are logged in, the WeChat login area may open an applet data assistant and/or a public number data assistant (abbreviated as a PC in FIG. 6) by an RPA simulation operation through a client service program (abbreviated as a client in FIG. 6) in the WeChat login area, and send a sixth request to the third party platform, which corresponds to step S103. Thus, the third party platform may send a seventh request to the WeChat login area, so that the WeChat login area captures the list of public numbers and/or applets to be counted and authentication information from the seventh request through the proxy request software, corresponding to step S104. And, the WeChat login area analyzes the response data of the list of public numbers and/or applets to be counted and the authentication information, and carries the processed data in the third request through the client service program in the WeChat login area to return to the dispatching center, corresponding to step S105.
The scheduling center gradually takes out a target task list from the first task list or the second task list based on the total idle request amount corresponding to the idle network container, and distributes statistical tasks corresponding to the target task list to the idle network segment containers in the data request area, which corresponds to step S106.
Therefore, after the request service in the idle network container receives the statistical task sent by the dispatching center, the idle network container can traverse each subtask in the target task list corresponding to the statistical task, and can initiate a ninth request to the third party platform so as to acquire the statistical data of all subtasks in the target task list corresponding to the statistical task from the third party platform. The idle network container may return statistics of all subtasks in the target task list to the dispatch center, corresponding to step S107.
After a preset time length, the dispatch center continues to wait for an idle network container to appear in the data request area. When the target task list has not yet covered every subtask in the first task list and the second task list, the dispatch center may update the target task list so that it consists of subtasks in the first task list and the second task list that have not yet appeared in the target task list, the number of subtasks in the updated target task list being less than or equal to the total idle request amount corresponding to the idle network containers in the data request area at the current moment, which corresponds to step S106.
Therefore, the dispatching center continues to distribute the statistical tasks corresponding to the updated target task list to the idle network segment containers in the data request area. Thus, the idle network container may acquire the statistics data of all the subtasks in the target task list corresponding to the statistics task from the third party platform, and return the statistics data of all the subtasks in the target task list to the dispatch center, corresponding to step S107.
When the target task list has covered every subtask in the first task list and the second task list, the scheduling center may stop allocating statistical tasks corresponding to the target task list to the idle network segment containers in the data request area, corresponding to step S106.
In summary, the network container in the data request area may respond to the fourth request each time, traverse all the subtasks in the target task list, and obtain the statistical data of all the subtasks in the target task list from the third party platform, which corresponds to step S107.
The network containers responding to each fourth request, and the number of such network containers, may be the same or different from one fourth request to the next, which is not limited by the embodiment of the present application.
When the idle network container acquires the statistical data of all the subtasks in the target task list each time, the statistical data of all the subtasks in the target task list can be returned to the dispatching center, so that the dispatching center can collect the statistical data of all the subtasks in the first task list and the second task list after determining that the statistical data of all the subtasks in the first task list and the second task list are collected, and the statistical data corresponds to the steps S108 and S109.
According to the embodiment of the application, based on the partition configuration of the virtual machine resources in the data acquisition system, the virtual machine resources in different partitions correspond to different functions in the statistical data acquisition process, and the virtual machine resources in the statistical data acquisition process are balanced, so that the data acquisition system can automatically acquire the statistical data of public numbers and/or small programs with management authorities in a plurality of accounts, and the public numbers and/or small programs do not need to be manually logged in one by one to check the statistical data, thereby being beneficial to the processing efficiency and processing speed of enterprises in the aspects of operation of the public numbers and/or small programs, related product marketing, advertisement delivery, government supervision and the like, and reducing the cost pressure of the enterprises for acquiring the statistical data.
In addition, the data acquisition system can distribute network segment containers in a plurality of network segments of different operators, and virtual machine resources and network resources are reasonably utilized. Therefore, the statistical data of a plurality of WeChat account numbers are timely, comprehensively and accurately acquired.
In step S106, the scheduling center may package the related information of the same account into a statistical task. For example, the scheduling center encapsulates the first task list and the first authentication information corresponding to the first account into statistical task 1, encapsulates the second task list and the second authentication information corresponding to the second account into statistical task 2, encapsulates the subtasks not yet executed in the first task list corresponding to the first account together with the first authentication information into statistical task 3, and encapsulates the subtasks not yet executed in the second task list corresponding to the second account together with the second authentication information into statistical task 4.
Thus, the scheduling center may sort the initiation order of all the statistical tasks to obtain a priority queue (including statistical task 1 and statistical task 2, or including statistical task 1 and statistical task 4, or including statistical task 2 and statistical task 3), so that the scheduling center takes out one statistical task at a time from the priority queue and distributes it to a suitable idle network segment container (once a statistical task has been taken out, it is no longer in the priority queue). After receiving the statistical task sent by the scheduling center, the request service in the network segment container can traverse the corresponding subtasks in the target task list corresponding to the statistical task, and acquire the statistical data of those subtasks by initiating requests to the third party platform. The request service in the network segment container can return the statistical data of the corresponding subtasks to the scheduling center, so that the scheduling center can obtain the statistical data of all the subtasks in the task list corresponding to the statistical task.
Therefore, the scheduling center can schedule an idle network segment container to execute one statistical task in the priority queue, so that the idle network segment container traverses all subtasks in the target task list corresponding to the statistical task each time. The scheduling center can also judge whether the target task list has covered all the subtasks in the first task list and the second task list. When the target task list has not yet covered all the subtasks in the first task list and the second task list, the scheduling center can update the target task list and add the statistical task corresponding to the updated target task list to the priority queue. After the target task list has covered all the subtasks in the first task list and the second task list, the scheduling center can stop distributing statistical tasks to the idle network segment containers.
Wherein the dispatch center may simultaneously assign one or more statistical tasks to one or more free segment containers. In some embodiments, the scheduling center may order the statistical tasks based on the weight values of the statistical tasks corresponding to the accounts. For example, the dispatch center may determine the statistical task with the greatest weight value in the priority queue as the target statistical task.
The embodiment of the application does not limit the specific implementation mode of the weight value of the statistical task. In some embodiments, the weight value of a statistical task may be related to the priority of an account corresponding to the statistical task, the queuing order of an account corresponding to the statistical task, and the total subtask amount of an account corresponding to the statistical task.
For example, the calculation formula of the weight value of the statistical task is as follows:
Weight value of statistical task = priority of account × 0.5 + queuing order of account × 0.3 + total subtask amount of account × 0.2
The priority of the account is configured by the management platform, and the numerical range of the priority of the account is 0 to 1.
Wherein, queuing order of account = (worst queuing length - enqueue order of statistical task) / worst queuing length. The worst queuing length is the total number of accounts registered in the management platform. Total subtask amount of account = (number of public numbers + number of applets) / maximum binding number of a single account. Typically, the maximum binding number of a single account is 50.
Assume that 10 accounts have been registered in the management platform. The priority of account A is configured to be 0.6; 2 statistical tasks are already queued in the priority queue when its statistical task is enqueued, so its enqueue order is 3; and it has 40 public numbers and applets in total to be counted. The priority of account B is configured to be 0.4; the priority queue is empty when its statistical task is enqueued, so its enqueue order is 1; and it has 40 public numbers and applets in total to be counted.
The scheduling center can calculate the weight value of the statistical task corresponding to the A account and the weight value of the statistical task corresponding to the B account based on the calculation formula:
The weight value of the statistical task corresponding to the a account number=0.6×0.5+7/10×0.3+40/50×0.2=0.3+0.21+0.16=0.67
Weight value of statistical task corresponding to B account number=0.4×0.5+9/10×0.3+40/50×0.2=0.2+0.27+0.16=0.63
Therefore, the scheduling center can determine that the weight value of the statistical task corresponding to the A account is larger than the weight value of the statistical task corresponding to the B account, and can then determine to execute the statistical task corresponding to the A account first and the statistical task corresponding to the B account afterwards.
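The worked example above can be reproduced with a short calculation. The function names queuing_score, subtask_score and task_weight are hypothetical, and the enqueue orders 3 and 1 are read off the terms 7/10 and 9/10 in the example rather than stated as a general rule.

```python
def queuing_score(worst_queue_length: int, enqueue_order: int) -> float:
    """Queuing order of account: earlier enqueue order gives a higher score."""
    return (worst_queue_length - enqueue_order) / worst_queue_length


def subtask_score(public_numbers_and_applets: int, max_bindings: int = 50) -> float:
    """Total subtask amount of account, normalised by the maximum binding number."""
    return public_numbers_and_applets / max_bindings


def task_weight(priority: float, queuing: float, subtasks: float) -> float:
    """Weighted sum with the coefficients 0.5, 0.3 and 0.2 from the formula."""
    return priority * 0.5 + queuing * 0.3 + subtasks * 0.2


# Account A: priority 0.6, enqueue order 3 of 10, 40 public numbers and applets.
weight_a = task_weight(0.6, queuing_score(10, 3), subtask_score(40))
# Account B: priority 0.4, enqueue order 1 of 10, 40 public numbers and applets.
weight_b = task_weight(0.4, queuing_score(10, 1), subtask_score(40))

print(round(weight_a, 2), round(weight_b, 2))
```

With these inputs the statistical task of account A receives the larger weight value, consistent with the ordering stated above.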
It should be noted that, in addition to the above manner, the scheduling center may determine the weight value of the statistical task corresponding to the account only based on the priority of the account.
When the total amount of the subtasks of the first task list and the second task list is larger than or equal to a preset threshold, the scheduling center can determine that the collection amount of the statistical data is larger, and the collection of the statistical data of the subtasks cannot be realized by using the idle network segment container at one time. The magnitude of the preset threshold is not limited in the embodiment of the application.
In the following, taking the example that the priority queue includes the statistic task 1 and the statistic task 2, a possible implementation of the first partition module to allocate the statistic task is described with reference to fig. 7.
Referring to fig. 7, fig. 7 is a flowchart of a data acquisition method according to an embodiment of the application. As shown in fig. 7, the data acquisition method according to the embodiment of the present application may include:
S201, the first partition module judges whether an idle network segment container exists in the third partition module.
If yes, the dispatching center executes step S202; if not, the dispatch center performs step S203.
S202, when an idle network segment container appears in the third partition module, the first partition module takes out a target statistical task from the priority queue.
The initial state of the priority queue is constructed by the statistical task corresponding to the first task list and the statistical task corresponding to the second task list, namely the initial state of the priority queue comprises the statistical task 1 and the statistical task 2. The initial state of the target statistical task is the statistical task corresponding to the first task list or the statistical task corresponding to the second task list, namely the initial state of the target statistical task is the statistical task 1 or the statistical task 2.
The dispatch center may determine the target statistical task based on the weight values of the individual statistical tasks in the priority queue. In some embodiments, when statistical task 1 is at the top of the priority queue and statistical task 2 is at the bottom of the priority queue, the weight value of statistical task 1 is greater than the weight value of statistical task 2, so the initial state of the target statistical task is statistical task 1, i.e., the statistical task corresponding to the first task list.
S203, the first partition module judges whether the total idle request amount corresponding to the idle network segment container is greater than or equal to the total subtask amount of the target statistical task.
The idle network segment container mentioned herein may refer to all network segment containers in the data request area, or may be part of network segment containers in the data request area, or may be one network segment container in the data request area, which is not limited in the embodiment of the present application.
If yes, the dispatching center executes step S2041; if not, the dispatch center performs step S2042.
S2041, the first partition module determines that a task list corresponding to the target statistical task is a target task list.
S2042, the first partition module determines that a number of subtasks in the target statistical task equal to the total idle request amount corresponding to the idle network segment containers constitute the target task list, forms a new statistical task from the remaining subtasks of the target statistical task, and adds the new statistical task to the priority queue.
The scheduling center splits all the subtasks in the target statistical task to obtain the same number of subtasks as the total amount of idle requests and the rest of the subtasks except the same number of the subtasks as the total amount of idle requests. And, the scheduling center determines the same number of sub-tasks as the total number of idle requests as a target task list, encapsulates the remaining sub-tasks except the same number of sub-tasks as the total number of idle requests into a new statistical task (i.e., statistical task 3), and adds the statistical task to the priority queue in step S202 to implement updating of the priority queue. At this time, the priority queue is changed from including the statistical task 1 and the statistical task 2 to including the statistical task 2 and the statistical task 3.
S205, the first partition module sends a fourth request to the third partition module, and stops sending fourth requests to the third partition module once no statistical task remains in the priority queue.
By means of the fourth request, the scheduling center can notify the data request area to traverse all subtasks in the target task list, and then continues to execute step S201 until no statistical task remains in the priority queue.
Thus, the data request area may traverse all of the subtasks in the first task list and the second task list. Therefore, network resources are effectively utilized, IP section blocking caused by a security mechanism due to a large number of requests in the same network section container is avoided, and waste of virtual machine resources and network resources caused by accumulation of a large number of statistical tasks in the same network section container or the same virtual machine is avoided.
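For illustration, the loop formed by steps S201 to S205 can be sketched with a heap-based priority queue. The function name dispatch, the representation of a statistical task as a (weight, account, subtasks) tuple, the idle_capacities parameter (idle request amount per polling round), and the choice to let a split-off remainder keep the weight value of its parent task are all assumptions of this sketch, not features stated in the embodiment.

```python
import heapq
import itertools
from typing import Iterable, List, Tuple


def dispatch(
    tasks: Iterable[Tuple[float, str, List[str]]],
    idle_capacities: Iterable[int],
) -> List[Tuple[str, List[str]]]:
    """Return the (account, target task list) pairs sent as fourth requests."""
    counter = itertools.count()  # tie-breaker so heap entries never compare lists
    queue: list = []
    for weight, account, subtasks in tasks:
        # heapq is a min-heap, so the weight is negated to pop the largest first.
        heapq.heappush(queue, (-weight, next(counter), account, list(subtasks)))

    sent = []
    for capacity in idle_capacities:              # S201: one polling round
        if not queue:
            break                                 # S205: queue empty, stop sending
        if capacity <= 0:
            continue                              # no idle container; poll again
        neg_weight, _, account, subtasks = heapq.heappop(queue)  # S202
        if capacity >= len(subtasks):             # S203
            target = subtasks                     # S2041: send the whole task
        else:                                     # S2042: split and re-queue the rest
            target, rest = subtasks[:capacity], subtasks[capacity:]
            heapq.heappush(queue, (neg_weight, next(counter), account, rest))
        sent.append((account, target))            # S205: one fourth request
    return sent
```

Because a task is split whenever it exceeds the idle request amount of the current round, no single round hands a container more subtasks than it has idle capacity for.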
The embodiment of the application also provides a data acquisition device.
Referring to fig. 8-9, fig. 8-9 are schematic structural diagrams of a data acquisition device according to an embodiment of the application.
The data acquisition device 100 according to the embodiment of the present application may be disposed in a server, and may implement the operation of the first partition module in the data acquisition system according to the above-described data acquisition method embodiment. As shown in fig. 8, the apparatus 100 may include: a receiving module 101, a transmitting module 102 and a summarizing module 103.
The receiving module 101 is configured to receive a first request, where the first request is used to trigger the first partition module to collect statistics data of public numbers and/or applets having management rights in the first account and the second account, and the first account is different from the second account;
a sending module 102, configured to send a second request to the second partition module in response to receiving the first request;
The receiving module 101 is further configured to receive a third request from the second partition module, where the third request carries a first task list and first authentication information corresponding to the first account, and a second task list and second authentication information corresponding to the second account, each subtask in the first task list is a public number or an applet of the first account having management authority, each subtask in the second task list is a public number or an applet of the second account having management authority, and the first task list, the first authentication information, the second task list and the second authentication information are acquired from the third party platform after the first account and the second account have been logged in, in response to receiving the second request;
The sending module 102 is further configured to send a fourth request to the third partition module, where the fourth request is obtained based on the first task list, the first authentication information, the second task list, and the second authentication information, and the fourth request is used to notify the third partition module to collect statistics data of all subtasks in the target task list, where all subtasks in the target task list belong to the first task list or the second task list, and a total amount of subtasks corresponding to the target task list is less than or equal to a total amount of idle requests corresponding to idle network segment containers in the third partition module; updating all the subtasks in the target task list into subtasks which do not appear in the target task list and are less than or equal to the total idle request amount corresponding to the idle network segment containers in the first task list and the second task list when the target task list does not traverse all the subtasks in the first task list and the second task list, and stopping sending a fourth request to the third partition module until the target task list traverses all the subtasks in the first task list and the second task list;
The receiving module 101 is further configured to receive a fifth request from the third partition module, where the fifth request carries statistics data of all subtasks in the target task list, and the statistics data of all subtasks in the target task list is obtained from the third party platform by the third partition module traversing all subtasks in the target task list in response to receiving each fourth request.
And the summarizing module 103 is configured to summarize the statistical data of all the subtasks in the first task list and the second task list after determining that the statistical data of all the subtasks in the first task list and the second task list are collected.
In some embodiments, the sending module 102 is specifically configured to, when an idle network segment container appears in the third partition module, take out a target statistical task from the priority queue, where an initial state of the priority queue is constructed by a statistical task corresponding to the first task list and a statistical task corresponding to the second task list, and an initial state of the target statistical task is a statistical task corresponding to the first task list or a statistical task corresponding to the second task list;
The sending module 102 is further specifically configured to determine, when the total amount of idle requests corresponding to the idle network segment containers is greater than or equal to the total amount of subtasks of the target statistical task, that a task list corresponding to the target statistical task is a target task list;
or the sending module 102 is further specifically configured to determine that, when the total amount of idle requests corresponding to the idle network segment container is smaller than the total amount of subtasks of the target statistical task, the same number of subtasks as the total amount of idle requests corresponding to the idle network segment container in the target statistical task are the target task list, and form a new statistical task from the remaining subtasks of the target statistical task except for the same number of subtasks as the total amount of idle requests corresponding to the idle network segment container, and add the new statistical task to the priority queue;
the sending module 102 is further specifically configured to send the fourth request to the third partition module until no statistical task remains in the priority queue, and then to stop sending the fourth request to the third partition module.
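The dispatch procedure described above can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the function names, the `idle_capacity` callable (assumed to block until a network segment container is idle and then return the total amount of idle requests), and the decision to give the remainder the same weight as the original task are all assumptions.

```python
import heapq
from itertools import count

_seq = count()  # tie-breaker so the heap never has to compare subtask lists


def push(queue, weight, subtasks):
    # heapq is a min-heap, so the weight is negated to pop the largest first.
    heapq.heappush(queue, (-weight, next(_seq), list(subtasks)))


def next_target_list(queue, idle_requests):
    """Pop the highest-weight statistical task and cut it to the idle capacity."""
    neg_weight, _, subtasks = heapq.heappop(queue)
    if idle_requests >= len(subtasks):
        # Enough idle requests: the whole task list is the target task list.
        return subtasks
    # Otherwise take as many subtasks as there are idle requests and
    # re-queue the remainder as a new statistical task (same weight assumed).
    target, rest = subtasks[:idle_requests], subtasks[idle_requests:]
    push(queue, -neg_weight, rest)
    return target


def dispatch_all(queue, idle_capacity, send_fourth_request):
    """Send fourth requests until no statistical task remains in the queue."""
    while queue:
        idle = idle_capacity()  # hypothetical: blocks until capacity is free
        if idle <= 0:
            raise ValueError("idle_capacity must return a positive number")
        send_fourth_request(next_target_list(queue, idle))
```

With two queued tasks of weights 5 (three subtasks) and 9 (two subtasks) and a constant idle capacity of 2, this sketch sends the weight-9 task first, then two subtasks of the weight-5 task, then its remaining subtask.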
As shown in fig. 9, based on the device structure shown in fig. 8, the data acquisition device 100 may further include a determining module 104, which is configured to determine that the statistical task with the largest weight value in the priority queue is the target statistical task.
In some embodiments, the weight value of a statistical task is related to the priority of an account corresponding to the statistical task, the queuing order of an account corresponding to the statistical task, and the total subtask amount of an account corresponding to the statistical task.
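The embodiment states only that the weight value is related to these three factors; it does not give a formula. Purely as a hypothetical example, a linear combination could look like the sketch below. The field names, the coefficients, and the signs (higher account priority raises the weight, later queuing and larger subtask totals lower it) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatTask:
    account: str
    account_priority: int  # larger means more important (assumed)
    queue_position: int    # 0 means queued first (assumed)
    subtask_total: int     # total subtask amount of the account


def weight(task, w_priority=100.0, w_order=1.0, w_size=0.1):
    """Hypothetical linear weight; coefficients and signs are illustrative."""
    return (w_priority * task.account_priority
            - w_order * task.queue_position
            - w_size * task.subtask_total)


def pick_target(tasks):
    # The determining module selects the task with the largest weight value.
    return max(tasks, key=weight)
```

Any monotonic function of the three factors would fit the description equally well; the linear form is chosen here only because it is easy to read.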
In some embodiments, the receiving module 101 is specifically configured to receive the first request from a device that manages accounts, or to receive the first request sent according to a preset period from the device to which the first partition module belongs.
In some embodiments, when the first request carries the first account and the name of the first subtask corresponding to the first account, the subtasks in the first task list include subtasks corresponding to the name of the first subtask. Or when the first request carries the first account, the subtasks in the first task list comprise public numbers and applets with management authorities preset in the first account.
In some embodiments, when the first request carries the second account and the name of the second subtask corresponding to the second account, the subtasks in the second task list include subtasks corresponding to the name of the second subtask. Or when the first request carries the second account, subtasks in the second task list comprise public numbers and applets with management authorities preset in the second account.
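The two cases above (a request that names specific subtasks versus a request that carries only the account) can be illustrated with a short sketch. The function name and the rule that named subtasks outside the account's management authority are dropped are assumptions for illustration; in the embodiment the managed public numbers and applets are obtained from the third party platform.

```python
def build_task_list(managed, requested_names=None):
    """Return the subtasks for one account.

    managed: names of the public numbers and applets the account manages.
    requested_names: optional subtask names carried in the first request.
    """
    if requested_names:
        # Request names subtasks: keep those the account actually manages.
        wanted = set(requested_names)
        return [name for name in managed if name in wanted]
    # Request carries only the account: use everything preset for it.
    return list(managed)
```

The same function would be called once for the first account and once for the second account to produce the first and second task lists.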
In some embodiments, the sending module 102 is further configured to send a statistical response to the device that manages the account, where the statistical response carries statistical data of all subtasks in the first task list and/or statistical data of all subtasks in the second task list.
In the embodiments of the present application, the data acquisition device may be divided into functional modules according to the above method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated modules may be implemented in hardware or as software functional modules. It should be noted that the division of modules in the embodiments of the present application is merely a division by logical function, and other divisions may be used in actual implementations.
The data acquisition device of the embodiment of the present application may be used to execute the technical scheme of the first partition module in the aforementioned data acquisition method; its implementation principle and technical effects are similar. For the operations implemented by each module, refer to the relevant description of the method embodiments, which is not repeated here. The modules herein may also be replaced with components or circuits.
The embodiment of the application also provides an electronic device, which comprises: a memory and a processor; the memory is used for storing program instructions; the processor is configured to invoke the program instructions in the memory to cause the electronic device to perform the data acquisition method of the previous embodiment.
Illustratively, the embodiments of the present application also provide a computer storage medium including computer instructions that, when run on an electronic device, cause the electronic device to perform the data acquisition method of the previous embodiments.
Illustratively, the embodiments of the present application also provide a computer program product which, when run on a computer, causes the computer to perform the data acquisition method of the previous embodiments.
Illustratively, an embodiment of the present application provides a chip system including a processor; when the processor executes the computer instructions stored in the memory, the electronic device performs the data acquisition method of the previous embodiments.
In the above-described embodiments, all or part of the functions may be implemented by software, hardware, or a combination of software and hardware. When implemented in software, the functions may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).
Those of ordinary skill in the art will appreciate that all or part of the steps of the above-described method embodiments may be accomplished by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above-described method embodiments. The aforementioned storage medium includes a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.

Claims (15)

1. A method of data acquisition, the method comprising:
The method comprises the steps that a first partition module receives a first request, wherein the first request is used for triggering the first partition module to collect statistics data of public numbers and/or applets with management rights in a first account and a second account, and the first account is different from the second account;
The first partition module sends a second request to a second partition module in response to receiving the first request;
The first partition module receives a third request from the second partition module, wherein the third request carries a first task list and first authentication information corresponding to the first account and a second task list and second authentication information corresponding to the second account, each subtask in the first task list is a public number or an applet over which the first account has management authority, each subtask in the second task list is a public number or an applet over which the second account has management authority, and the first task list, the first authentication information, the second task list and the second authentication information are acquired from a third party platform by the second partition module, in response to receiving the second request, after the first account and the second account are logged in;
The first partition module sends a fourth request to a third partition module, wherein the fourth request is used for notifying the third partition module to collect statistical data of all subtasks in a target task list, all subtasks in the target task list belong to the first task list or the second task list, and the total amount of subtasks corresponding to the target task list is smaller than or equal to the total amount of idle requests corresponding to idle network segment containers in the third partition module; when the target task list does not traverse all the subtasks in the first task list and the second task list, updating all the subtasks in the target task list into subtasks which do not appear in the target task list and the second task list and the number of which is less than or equal to the total number of idle requests corresponding to the idle network segment containers, and stopping sending the fourth request to the third partition module until the target task list traverses all the subtasks in the first task list and the second task list; the first partition module receives a fifth request from the third partition module, wherein the fifth request carries statistics data of all subtasks in the target task list, and the statistics data of all subtasks in the target task list are obtained from the third party platform by the third partition module in response to receiving each fourth request, traversing all subtasks in the target task list;
And the first partition module gathers the statistical data of all the subtasks in the first task list and the second task list after determining that the statistical data of all the subtasks in the first task list and the second task list are collected.
2. The method according to claim 1, characterized in that it comprises in particular:
When an idle network segment container appears in the third partition module, the first partition module takes a target statistical task out of a priority queue, wherein the initial state of the priority queue is constructed from the statistical task corresponding to the first task list and the statistical task corresponding to the second task list, and the initial state of the target statistical task is the statistical task corresponding to the first task list or the statistical task corresponding to the second task list;
The first partition module determines a task list corresponding to the target statistical task as the target task list when the total idle request amount corresponding to the idle network segment container is greater than or equal to the total subtask amount of the target statistical task;
Or when the total idle request amount corresponding to the idle network segment container is smaller than the total subtask amount of the target statistical task, the first partition module determines that the same number of subtasks as the total idle request amount corresponding to the idle network segment container in the target statistical task are the target task list, and forms a new statistical task from the rest subtasks except the same number of subtasks as the total idle request amount corresponding to the idle network segment container in the target statistical task and adds the new statistical task into the priority queue;
and the first partition module sends the fourth request to the third partition module until the statistical task does not exist in the priority queue, and stops sending the fourth request to the third partition module.
3. The method according to claim 2, wherein the method further comprises:
and the first partition module determines that the statistical task with the largest weight value in the priority queue is the target statistical task.
4. A method according to claim 3, wherein the weight value of a statistical task is related to the priority of the account corresponding to the statistical task, the queuing order of the account corresponding to the statistical task, and the total subtask amount of the account corresponding to the statistical task.
5. The method of claim 1, wherein the first partition module receiving the first request comprises:
the first partition module receives the first request from equipment for managing accounts;
Or the first partition module receives the first request sent according to a preset period from equipment to which the first partition module belongs.
6. The method of claim 5, wherein:
When the first request carries the first account and the name of a first subtask corresponding to the first account, the subtasks in the first task list comprise subtasks corresponding to the name of the first subtask;
Or when the first request carries the first account, subtasks in the first task list comprise public numbers and applets with management authorities preset in the first account;
When the first request carries the second account and the name of a second subtask corresponding to the second account, the subtasks in the second task list comprise subtasks corresponding to the name of the second subtask;
Or when the first request carries the second account, subtasks in the second task list comprise public numbers and applets with management authorities preset in the second account.
7. The method according to any one of claims 1-6, further comprising:
the first partition module sends a statistical response to the equipment for managing the account, wherein the statistical response carries statistical data of all subtasks in the first task list and/or statistical data of all subtasks in the second task list.
8. A data acquisition device, characterized in that the device is applied to a first partition module;
The device comprises:
The receiving module is used for receiving a first request, the first request is used for triggering the first partition module to collect statistics data of public numbers and/or applets with management rights in a first account and a second account, and the first account is different from the second account;
the sending module is used for responding to the first request and sending a second request to the second partition module;
The receiving module is further configured to receive a third request from the second partition module, where the third request carries a first task list and first authentication information corresponding to the first account, and a second task list and second authentication information corresponding to the second account, each subtask in the first task list is a public number or an applet over which the first account has management authority, each subtask in the second task list is a public number or an applet over which the second account has management authority, and the first task list, the first authentication information, the second task list, and the second authentication information are acquired from a third party platform by the second partition module, in response to receiving the second request, after the first account and the second account are logged in;
The sending module is further configured to send a fourth request to the third partition module, where the fourth request is used to inform the third partition module to collect statistics data of all subtasks in a target task list, where all subtasks in the target task list belong to the first task list or the second task list, and the total amount of subtasks corresponding to the target task list is less than or equal to the total amount of idle requests corresponding to idle network segment containers in the third partition module; when the target task list does not traverse all the subtasks in the first task list and the second task list, updating all the subtasks in the target task list into subtasks which do not appear in the target task list and the second task list and the number of which is less than or equal to the total number of idle requests corresponding to the idle network segment containers, and stopping sending the fourth request to the third partition module until the target task list traverses all the subtasks in the first task list and the second task list; the receiving module is further configured to receive a fifth request from the third partition module, where the fifth request carries statistics data of all subtasks in the target task list, and the statistics data of all subtasks in the target task list are obtained from the third party platform by the third partition module traversing all subtasks in the target task list in response to receiving each fourth request;
and the summarizing module is used for summarizing the statistical data of all the subtasks in the first task list and the second task list after the statistical data of all the subtasks in the first task list and the second task list are determined to be collected.
9. A data acquisition system, the data acquisition system comprising: the system comprises a first partition module, a second partition module and a third partition module, wherein the first partition module, the second partition module and the third partition module are obtained by partitioning virtual machine resources of the data acquisition system by the data acquisition system;
The first partition module is configured to receive a first request, where the first request is used to trigger the first partition module to collect statistics data of public numbers and/or applets with management rights in a first account and a second account, and the first account is different from the second account;
The first partition module is further configured to send a second request to the second partition module in response to receiving the first request;
The second partition module is used for responding to the second request and sending a sixth request to a third party platform after the first account and the second account are logged in;
The second partition module is further configured to receive a seventh request from the third party platform, where the seventh request carries a first task list and first authentication information corresponding to the first account, and a second task list and second authentication information corresponding to the second account, each subtask in the first task list is a public number or an applet over which the first account has management authority, and each subtask in the second task list is a public number or an applet over which the second account has management authority;
The second partition module is further configured to send a third request to the first partition module in response to receiving the seventh request, where the third request carries the first task list, the first authentication information, the second task list and the second authentication information;
The first partition module is further configured to send a fourth request to the third partition module, where the fourth request is obtained based on the first task list, the first authentication information, the second task list, and the second authentication information, and the fourth request is used to notify the third partition module to collect statistics data of all subtasks in a target task list, where all subtasks in the target task list belong to the first task list or the second task list, and a total amount of subtasks in the target task list is less than or equal to a total amount of idle requests corresponding to idle network containers in the third partition module; when the target task list does not traverse all the subtasks in the first task list and the second task list, updating all the subtasks in the target task list into the subtasks which do not appear in the target task list and the second task list and have the number smaller than or equal to the total idle request amount corresponding to the idle network container in the third partition module, and stopping sending the fourth request to the third partition module until the target task list traverses all the subtasks in the first task list and the second task list;
The third partitioning module is used for traversing all subtasks in the target task list in response to receiving each fourth request, and acquiring statistical data of all subtasks in the target task list from the third party platform;
The third partition module is further configured to send a fifth request to the first partition module, where the fifth request carries statistics data of all subtasks in the target task list;
The first partition module is further configured to aggregate the statistics data of all the subtasks in the first task list and the second task list after determining that the statistics data of all the subtasks in the first task list and the second task list are collected based on the fifth request.
10. The system of claim 9, wherein, when a first network segment container and a second network segment container in the third partition module are deployed in network segments of different operators,
The first network segment container is used for traversing part of subtasks in the target task list and acquiring statistical data of the part of subtasks in the target task list from the third party platform;
the first network segment container is further configured to send a first sub-request to the first partition module, where the first sub-request carries statistics data of part of sub-tasks in the target task list;
The second network segment container is used for traversing the rest subtasks in the target task list and acquiring statistical data of the rest subtasks in the target task list from the third party platform;
the second network segment container is further configured to send a second sub-request to the first partition module, where the second sub-request carries statistics data of remaining sub-tasks in the target task list;
Wherein the first sub-request and the second sub-request constitute the fifth request.
11. The system according to claim 9 or 10, wherein,
The second partition module is further configured to send an eighth request to an account management device when the first account and/or the second account are not logged in, where the eighth request carries login information of the account that is not logged in, and the eighth request is used to remind that the first account and/or the second account are not logged in.
12. The system according to claim 9 or 10, wherein the third partition module obtains the statistical data of one subtask in the target task list from the third party platform as follows:
The third partition module is specifically configured to send a ninth request to the third party platform, where the ninth request carries an identifier of the one subtask and authentication information corresponding to the one subtask, and the ninth request is used to request statistical data of the one subtask;
The third partition module is further specifically configured to receive a tenth request from the third party platform, where the tenth request carries statistics data of the one subtask.
13. An electronic device, comprising: a memory and a processor;
The memory is used for storing program instructions;
The processor is configured to invoke program instructions in the memory to cause the electronic device to perform the data acquisition method of any of claims 1-7.
14. A computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the data acquisition method of any one of claims 1-7.
15. A computer program product, characterized in that the computer program product, when run on a computer, causes the computer to perform the data acquisition method according to any one of claims 1-7.
CN202011607392.XA 2020-12-29 2020-12-29 Data acquisition method, device and system Active CN112737925B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011607392.XA CN112737925B (en) 2020-12-29 2020-12-29 Data acquisition method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011607392.XA CN112737925B (en) 2020-12-29 2020-12-29 Data acquisition method, device and system

Publications (2)

Publication Number Publication Date
CN112737925A CN112737925A (en) 2021-04-30
CN112737925B true CN112737925B (en) 2024-06-14

Family

ID=75610143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011607392.XA Active CN112737925B (en) 2020-12-29 2020-12-29 Data acquisition method, device and system

Country Status (1)

Country Link
CN (1) CN112737925B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105162676A (en) * 2015-04-03 2015-12-16 中国科学院信息工程研究所 Method and system for acquiring WeChat data

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080209031A1 (en) * 2007-02-22 2008-08-28 Inventec Corporation Method of collecting and managing computer device information
CN105577528B (en) * 2015-12-31 2019-01-15 深圳中泓在线股份有限公司 A kind of wechat public platform collecting method and device based on virtual machine
WO2018023657A1 (en) * 2016-08-05 2018-02-08 汤隆初 Method for adjusting wechat public account-based advertisement push technique, and push system
CN110046319B (en) * 2019-04-01 2021-04-09 北大方正集团有限公司 Social media information acquisition method, device, system, equipment and storage medium
CN110297711B (en) * 2019-05-16 2024-01-19 平安科技(深圳)有限公司 Batch data processing method, device, computer equipment and storage medium
CN110266596B (en) * 2019-06-12 2023-04-18 深圳前海微众银行股份有限公司 Message processing method, device, equipment and computer readable storage medium
CN110289975A (en) * 2019-06-25 2019-09-27 苏州梦嘉信息技术有限公司 Public platform message cluster transmition management system and method
US11265328B2 (en) * 2019-09-12 2022-03-01 Snowflake Inc. Private data exchange metrics sharing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
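Design Scheme for a WeChat Official Account Agent-Maintenance Platform (微信公众账号代维平台设计方案); Zhou Xuan; Liao Jianxin; Shen Qiwei; Telecommunications Technology (电信技术); 2015-10-25 (No. 10); full text *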

Also Published As

Publication number Publication date
CN112737925A (en) 2021-04-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant