CN113268518B - Flow statistics method and device and distributed flow statistics system - Google Patents


Info

Publication number
CN113268518B
CN113268518B (application number CN202010097823.6A)
Authority
CN
China
Prior art keywords
user
iam
statistics
time period
user requests
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010097823.6A
Other languages
Chinese (zh)
Other versions
CN113268518A (en)
Inventor
孙晓辉
张鑫
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority claimed from CN202010097823.6A
Publication of CN113268518A
Application granted
Publication of CN113268518B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/245 Query processing
    • G06F 16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F 16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/17 Details of further file system functions
    • G06F 16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/18 File system types
    • G06F 16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/245 Query processing
    • G06F 16/2455 Query execution
    • G06F 16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/245 Query processing
    • G06F 16/2457 Query processing with adaptation to user needs
    • G06F 16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/25 Integrating or interfacing systems involving database management systems

Abstract

The present disclosure provides a flow statistics method, comprising: acquiring user requests of a plurality of IAM users in a current preset time period, wherein the plurality of IAM users are access users of an OpenAPI, and each IAM user is correspondingly configured with at least one application service based on the OpenAPI; creating a current memory concurrency statistics queue in a local memory, and caching the user requests of the plurality of IAM users in the current preset time period to the corresponding current memory concurrency statistics queue; and carrying out flow statistics on the user requests in the current memory concurrency statistics queue according to the application service to which the user requests belong, so as to count the user request amount corresponding to each application service of each IAM user in the current preset time period. The disclosure also provides a flow statistics device, a distributed flow statistics system, an electronic device, and a computer readable medium.

Description

Flow statistics method and device and distributed flow statistics system
Technical Field
The embodiment of the disclosure relates to the technical field of flow statistics, in particular to a flow statistics method and device, a distributed flow statistics system, electronic equipment and a computer readable medium.
Background
Currently, with the rapid development of Internet technology, the open platform (OpenAPI), as a development foundation for Internet online services, has become a necessary choice and a strategic development direction for more and more Internet enterprises developing their services.
An OpenAPI is a way for a large platform to develop and share capabilities: a website service provider encapsulates its own services into a series of application programming interfaces (APIs) and opens them for third-party developers to use, so that developers can build business applications at lower cost. The open platform can also open data access interfaces to third-party institutions such as enterprises and governments, so that the relevant institutions can make data calls and implement third-party services.
Disclosure of Invention
The embodiment of the disclosure provides a flow statistics method and device, a distributed flow statistics system, electronic equipment and a computer readable medium.
In a first aspect, an embodiment of the present disclosure provides a flow statistics method, including:
acquiring user requests of a plurality of IAM users in a current preset time period, wherein the plurality of IAM users are access users of an OpenAPI, and each IAM user is correspondingly configured with at least one application service based on the OpenAPI;
creating a current memory concurrency statistics queue in a local memory, and caching user requests of a plurality of IAM users in a current preset time period to the corresponding current memory concurrency statistics queue;
and carrying out flow statistics on the user requests in the current memory concurrency statistics queue according to the application service to which the user requests belong so as to count the user request quantity corresponding to each application service of each IAM user in the current preset time period.
In some embodiments, after the statistics of the user request amount corresponding to each application service of each IAM user in the current preset period of time, the method further includes:
and caching the user request in the counted current preset time period to a corresponding local database.
In some embodiments, caching the counted user requests in the current preset time period to the corresponding local database includes: using a real-time working thread to cache the counted user requests to the local database at set time intervals.
In some embodiments, after caching the user request in the counted current preset time period to the corresponding local database, the method further includes:
and when the memory size occupied by all the user requests currently cached in the local database is larger than or equal to a preset size, asynchronously storing the user requests cached in the local database to a distributed file system.
In some embodiments, after the statistics of the user request amount corresponding to each application service of each IAM user in the current preset period of time, the method further includes:
copying the counted user requests in the current preset time period to a lock-free queue;
using at least one persistent working thread to persist the user request in the lock-free queue to a remote database;
and deleting the user request which is successfully persisted to the remote database in the local database.
In some embodiments, the remote database is MongoDB.
In some embodiments, the local database is a local RocksDB instance.
In a second aspect, an embodiment of the present disclosure provides a flow statistics apparatus, including:
an acquisition module, configured to acquire user requests of a plurality of IAM users in a current preset time period, wherein the plurality of IAM users are access users of an OpenAPI, and each IAM user is correspondingly configured with at least one application service based on the OpenAPI;
the queue creating and caching module is used for creating a current memory concurrency statistical queue in the local memory and caching user requests of a plurality of IAM users in a current preset time period to the corresponding current memory concurrency statistical queue;
and the statistics module is used for carrying out flow statistics on the user requests in the current memory concurrency statistics queue according to the application service to which the user requests belong so as to count the user request quantity corresponding to each application service of each IAM user in the current preset time period.
In some embodiments, the queue creating and caching module is further configured to cache the counted user requests in the current preset time period to a corresponding local database.
In some embodiments, the queue creating and buffering module is specifically configured to buffer the counted user request to the local database at intervals of a set time by using a real-time work thread.
In some embodiments, the system further includes an asynchronous storage module, configured to asynchronously store the user requests cached in the local database to the distributed file system when the memory size occupied by all user requests currently cached in the local database is greater than or equal to a predetermined size.
In some embodiments, the system further comprises a copy module, a persistence module, and a delete module;
the copying module is configured to copy the counted user requests in the current preset time period to the lock-free queue;
the persistence module is used for persistence of the user request in the lock-free queue to a remote database by utilizing at least one persistence work thread;
and the deleting module is configured to delete, from the local database, the user requests that have been successfully persisted to the remote database.
In some embodiments, the remote database is MongoDB.
In some embodiments, the local database is a local RocksDB instance.
In a third aspect, embodiments of the present disclosure provide a distributed flow statistics system, comprising: a statistics agent layer and a plurality of flow statistics devices;
the statistics agent layer is configured to receive user requests of a plurality of IAM users from an entry layer of the OpenAPI and distribute the user requests of the plurality of IAM users to the plurality of flow statistics devices;
each flow statistics device is configured to implement the flow statistics method according to any one of the foregoing embodiments.
In some embodiments, the statistics agent layer is specifically configured to distribute the user requests of the plurality of IAM users to the plurality of flow statistics devices through a consistent hashing algorithm.
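As an illustrative sketch (not the patent's specific algorithm), a minimal consistent-hash ring of the kind the statistics agent layer could use to route each user request to a fixed flow statistics device might look like this; the node names and replica count are hypothetical:

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Minimal consistent-hash ring for routing IAM user requests
    to flow statistics devices. Each device is placed on the ring at
    several virtual positions (replicas) to smooth the distribution."""

    def __init__(self, nodes, replicas=100):
        self.ring = []       # sorted virtual positions
        self.node_at = {}    # virtual position -> device name
        for node in nodes:
            for i in range(replicas):
                h = self._hash(f"{node}#{i}")
                bisect.insort(self.ring, h)
                self.node_at[h] = node

    @staticmethod
    def _hash(key):
        # MD5 used only as a stable, process-independent hash.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def route(self, request_key):
        # Walk clockwise to the first virtual position >= hash(key).
        h = self._hash(request_key)
        idx = bisect.bisect(self.ring, h) % len(self.ring)
        return self.node_at[self.ring[idx]]
```

Because routing depends only on the key's hash, the same IAM user and service always land on the same statistics device, and adding or removing a device remaps only a fraction of the keys.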
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including:
one or more processors;
a memory having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to implement the flow statistics method as described in any of the embodiments above;
one or more I/O interfaces connected between the processor(s) and the memory, configured to enable information exchange between the processor and the memory.
In a fifth aspect, embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the computer program, when executed, implements the flow statistics method according to any of the above embodiments.
The flow statistics method and device, the distributed flow statistics system, the electronic device, and the computer readable medium provided by the embodiments of the present disclosure use the local memory to create an efficient memory concurrency statistics queue, and realize statistics of OpenAPI traffic from a usage dimension (the IAM user and application service to which each request belongs) and a time dimension (the preset time period). Compared with the traditional statistical method based on offline logs, the flow statistics method provided by the embodiments of the present disclosure offers higher real-time performance and more accurate statistical results.
Drawings
The accompanying drawings are included to provide a further understanding of embodiments of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
fig. 1 is a flowchart of a flow statistics method according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a memory concurrency statistics queue according to a first embodiment of the disclosure;
fig. 3 is a flowchart of a flow statistics method provided in a second embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a local database according to a second embodiment of the disclosure;
fig. 5 is a flowchart of a flow statistics method provided in a third embodiment of the present disclosure;
fig. 6 is a flowchart of a flow statistics method provided in a fourth embodiment of the present disclosure;
fig. 7 is a block diagram of a flow statistics device according to a fifth embodiment of the present disclosure;
fig. 8 is a block diagram of a flow statistics device according to a sixth embodiment of the present disclosure;
fig. 9 is a block diagram of a flow statistics device according to a seventh embodiment of the present disclosure;
FIG. 10 is a block diagram illustrating a distributed flow statistics system according to an eighth embodiment of the present disclosure;
fig. 11 is a block diagram of an electronic device according to a ninth embodiment of the disclosure.
Detailed Description
In order to better understand the technical solutions of the present disclosure, the flow statistics method and apparatus, the distributed flow statistics system, the electronic device, and the computer readable medium provided in the present disclosure are described in detail below with reference to the accompanying drawings.
Example embodiments will be described more fully hereinafter with reference to the accompanying drawings, but may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Example 1
Fig. 1 is a flowchart of a flow statistics method according to the first embodiment of the present disclosure. As shown in fig. 1, the method may be performed by a flow statistics device, which may be implemented by software and/or hardware and integrated into an electronic device, such as a server. The flow statistics method comprises the following steps:
and 11, acquiring user requests of a plurality of IAM users in a current preset time period.
The IAM users are access users of the open platform (OpenAPI). In this embodiment, the open platform associates a unique identity (ID) and corresponding account information with each access user in advance, and uses an Identity and Access Management (IAM) system to authenticate each access user and authorize its use of resources; each such access user is referred to as an IAM user of the open platform (OpenAPI).
In this embodiment, the IAM users are third-party developers and third-party organizations. Each IAM user may create one or more applications (APPs) on the console of the open platform. Each application has a unique application identifier (Key), and each application Key may be granted one or more service rights to facilitate rights management by the IAM user; the service corresponding to each application Key is referred to as an application service. For example, a travel APP development organization creates a travel APP on a map open platform with the application identifier KEY1, and opens the positioning service, geocoding service, path planning service, and so on under that application, so that functions such as driver and passenger position tracking and driver-side navigation are realized by calling these services. Thus, in this embodiment, each IAM user may be correspondingly configured with at least one open-platform-based application service in each of its applications.
When a user accesses an open platform through an Application (APP) of an IAM user to request to use an application service based on the open platform corresponding to the application, it is equivalent to sending a user request to the open platform.
In this embodiment, in order to count the access traffic of the open platform, user requests of a plurality of IAM users in a current preset time period are first obtained in step 11. The preset time period may be a second-level, minute-level, hour-level, or day-level time period; for example, the preset time period is 1 second (s).
Step 12, creating a current memory concurrency statistics queue in the local memory, and caching user requests of a plurality of IAM users in a current preset time period to corresponding current memory concurrency statistics queues.
In this embodiment, in step 12, a current memory concurrency (concurrency) statistics queue is created in the local memory, and after obtaining user requests of a plurality of IAM users in a current preset time period, the user requests of the plurality of IAM users in the current preset time period are cached in the current memory concurrency statistics queue.
Fig. 2 is a schematic structural diagram of a memory concurrency statistics queue according to the first embodiment of the present disclosure. As shown in fig. 2, in some embodiments, the memory concurrency statistics queue includes a plurality (e.g., 16) of bucket units (bucket_0, bucket_1, bucket_2, …, bucket_15); that is, the memory concurrency statistics queue employs a bucket-splitting mechanism to cache the user requests of the plurality of IAM users. Specifically, a hash algorithm may be used to distribute the user requests of the plurality of IAM users across the bucket units for caching. Caching with a bucket-splitting mechanism effectively reduces lock contention.
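A minimal sketch of such a bucket-split concurrency statistics queue is shown below, assuming one lock per bucket and a hash-based bucket choice; the class and method names are illustrative, not taken from the patent:

```python
import threading

NUM_BUCKETS = 16  # the embodiment mentions e.g. 16 bucket units


class BucketedStatsQueue:
    """Memory concurrency statistics queue using a bucket-splitting
    mechanism: each bucket has its own lock, so writers hitting
    different buckets do not contend with each other."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self.buckets = [{} for _ in range(num_buckets)]
        self.locks = [threading.Lock() for _ in range(num_buckets)]

    def record(self, user_id, service_name):
        key = (user_id, service_name)
        # Python's built-in hash is stable within one process, which is
        # all bucket selection needs here.
        idx = hash(key) % len(self.buckets)
        with self.locks[idx]:
            self.buckets[idx][key] = self.buckets[idx].get(key, 0) + 1

    def snapshot(self):
        # Merge all buckets into one (user, service) -> count view.
        merged = {}
        for bucket, lock in zip(self.buckets, self.locks):
            with lock:
                for key, count in bucket.items():
                    merged[key] = merged.get(key, 0) + count
        return merged
```

Under contention, only writers that happen to hash to the same bucket block each other, which is the lock-conflict reduction the bucket-splitting mechanism aims at.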
It can be understood that, in each preset time period, after the user requests of the plurality of IAM users in that time period are obtained, a new memory concurrency statistics queue is created in the local memory and used to cache the user requests of the plurality of IAM users in that preset time period.
Step 13: carrying out flow statistics on the user requests in the current memory concurrency statistics queue according to the application service to which the user requests belong, so as to count the user request amount corresponding to each application service of each IAM user in the current preset time period.
In this embodiment, the user request carries an Identity (ID) of the IAM user to which the user belongs, a service name (service_name) of the application service to be used, and a timestamp (timestamp) of the request.
In step 13, all user requests in the current memory concurrency statistics queue are classified according to the belonging IAM users and the belonging application services, so as to count the user request quantity corresponding to each application service of each IAM user in the current preset time period.
For example, the plurality of IAM users include a user a1 and a user a2, the application service corresponding to the user a1 includes a service1 and a service2, and the application service corresponding to the user a2 includes a service1 and a service2. It is assumed that there are 4 user requests corresponding to the user a1 and 5 user requests corresponding to the user a2 in the current preset time period. For the user a1, 2 application services to which the user requests belong are service1, and the remaining 2 application services to which the user requests belong are service2; for user a2, there are 2 application services to which the user request belongs that are service services 1, and the remaining 3 application services to which the user request belongs that are service services 2. In step 13, the user request amount corresponding to the service1 of the user a1 is counted as 2, the user request amount corresponding to the service2 of the user a1 is counted as 2, the user request amount corresponding to the service1 of the user a2 is counted as 2, and the user request amount corresponding to the service2 of the user a2 is counted as 3.
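The worked example above can be reproduced in a few lines; the request tuples below are hypothetical stand-ins for the user requests described:

```python
from collections import Counter

# Hypothetical user requests as (user_id, service_name) pairs, matching
# the worked example: 4 requests from user a1 and 5 from user a2.
requests = [
    ("a1", "service1"), ("a1", "service1"),
    ("a1", "service2"), ("a1", "service2"),
    ("a2", "service1"), ("a2", "service1"),
    ("a2", "service2"), ("a2", "service2"), ("a2", "service2"),
]

# Classify by (IAM user, application service) to obtain the user request
# amount per application service of each IAM user.
counts = Counter(requests)
```

The resulting counts match the text: 2 for (a1, service1), 2 for (a1, service2), 2 for (a2, service1), and 3 for (a2, service2).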
In this embodiment, the preset time period may be 1 second. Through steps 11 to 13, the per-second user request amount (QPS) corresponding to each application service of each IAM user can be counted, and the page view count (PV) corresponding to each application service of each IAM user can further be obtained, where the page view count equals the accumulation of the per-second user request amounts.
According to the flow statistics method provided by this embodiment, user requests of a plurality of IAM users in a current preset time period are obtained, a current memory concurrency statistics queue is created, the user requests of the plurality of IAM users in the current preset time period are cached to the corresponding current memory concurrency statistics queue, and flow statistics are performed on the user requests in the current memory concurrency statistics queue according to the application services to which they belong, so as to count the user request amount corresponding to each application service of each IAM user in the current preset time period. In this embodiment, an efficient memory concurrency statistics queue is created using the local memory, and statistics of OpenAPI traffic are realized from a usage dimension (the IAM user and application service to which each request belongs) and a time dimension (the preset time period). Compared with the traditional statistical method based on offline logs, the flow statistics method provided by this embodiment offers higher real-time performance and more accurate statistical results.
Example two
Fig. 3 is a flowchart of a flow statistics method provided in a second embodiment of the present disclosure, and as shown in fig. 3, the flow statistics method provided in the second embodiment of the present disclosure is different from the flow statistics method in the first embodiment, in that: step 13 is further followed by step 14, and only step 14 is described below, and other specific descriptions can be referred to the description of the flow statistics method in the first embodiment, which is not repeated here.
Step 14: caching the counted user requests in the current preset time period to a corresponding local database.
In step 14, after the user request amount corresponding to each application service of each IAM user in the current preset time period is counted, the counted user requests in the current preset time period are written into the corresponding local database for caching. Specifically, in step 14, a dedicated real-time working thread (real-time-worker) caches the counted user requests to the local database at set time intervals, so that they are cached in real time. The set time may be on the order of milliseconds (ms).
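One way to sketch such a real-time working thread, under the assumption that a dict stands in for the local database and a drain callback hands over the pending statistics, is:

```python
import threading


class RealtimeWorker:
    """Hypothetical real-time working thread (real-time-worker): at a
    millisecond-level interval it drains the in-memory statistics and
    merges them into a local key-value store (a dict stands in for the
    local database here)."""

    def __init__(self, drain_stats, local_db, interval_ms=50):
        self.drain_stats = drain_stats  # callable: returns and clears pending stats
        self.local_db = local_db
        self.interval_s = interval_ms / 1000.0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def flush_once(self):
        # Merge the drained counts into the local database.
        for key, count in self.drain_stats().items():
            self.local_db[key] = self.local_db.get(key, 0) + count

    def _run(self):
        while not self._stop.is_set():
            self.flush_once()
            self._stop.wait(self.interval_s)
```

Draining (rather than copying) the pending statistics keeps each count from being written twice across consecutive flushes.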
In this embodiment, the local database may use RocksDB, a storage engine based on the log-structured merge tree (LSM-Tree) that provides a persistent key-value store.
In this embodiment, because a local memory is used, once the machine restarts or fails, the data in the local memory concurrency statistics queue risks being lost, making the traffic statistics inaccurate. To solve this problem, the LSM-Tree-based RocksDB is adopted as the local database, and its efficient write performance is used to write the counted data (user requests) into RocksDB.
Fig. 4 is a schematic structural diagram of a local database in the second embodiment of the present disclosure. As shown in fig. 4, in this embodiment, the local database stores the counted user requests along both a user dimension and an application (APP) dimension. Along the user dimension, the local database is divided into a plurality of sub-libraries according to hash(user_id + service_name) % 100, where user denotes an IAM user, user_id denotes the identity of the IAM user, and service_name denotes the service name of an application service corresponding to the IAM user. For example, slot_0 and slot_1 in the user dimension denote two user-dimension sub-libraries. The user requests (real-time data) corresponding to each service of each IAM user are stored along the user dimension, for example, the requests of the geocoding service used by the travel APP development organization.
Along the application dimension, the local database is divided into a plurality of sub-libraries according to hash(user_id + app_id + service_name) % 100, where user_id denotes the identity of an IAM user, app_id denotes the application identifier of an application created by the IAM user, and service_name denotes the service name of an application service. For example, slot_0 and slot_1 in the APP dimension denote two application-dimension sub-libraries. The user requests (real-time data) corresponding to each application service of each application of each IAM user are stored along the application dimension, for example, the request volume of the path planning service corresponding to the travel APP of the travel APP development organization.
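The two sub-library hash formulas above can be sketched directly; the patent does not name a specific hash function, so MD5 is used here only as a stable placeholder:

```python
import hashlib

NUM_SLOTS = 100  # the embodiment shards each dimension modulo 100


def _stable_hash(s: str) -> int:
    # A process-independent hash; the concrete hash function is an
    # assumption, as the patent does not specify one.
    return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16)


def user_dim_slot(user_id: str, service_name: str) -> int:
    """User-dimension sub-library: hash(user_id + service_name) % 100."""
    return _stable_hash(user_id + service_name) % NUM_SLOTS


def app_dim_slot(user_id: str, app_id: str, service_name: str) -> int:
    """Application-dimension sub-library:
    hash(user_id + app_id + service_name) % 100."""
    return _stable_hash(user_id + app_id + service_name) % NUM_SLOTS
```

The same (user, service) pair always maps to the same slot, so all statistics for one service of one IAM user stay in one sub-library.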
In addition, the local database manages the local sub-library identification data, i.e., which sub-libraries have been opened in total, through metadata storage, so that all local sub-library identifiers can be obtained. Splitting the local database into sub-libraries effectively prevents the data in any single library from growing too large, ensuring the accuracy of the statistical data while maintaining extremely high overall statistical performance.
Example III
Fig. 5 is a flowchart of a flow statistics method provided in the third embodiment of the present disclosure, and as shown in fig. 5, the flow statistics method provided in the third embodiment of the present disclosure is different from the flow statistics method of the second embodiment in that: step 14 is followed by step 15 and step 16. The following description is only directed to step 15 and step 16, and other specific descriptions can be made to the description of the flow statistics method in the second embodiment, which is not repeated here.
Step 15: determining whether the memory size occupied by all user requests currently cached in the local database is smaller than a predetermined size; if so, no further processing is performed; otherwise, step 16 is executed.
The predetermined size may be set according to actual needs; for example, the predetermined size is 2 MB or 15 MB.
Step 16: asynchronously storing the user requests currently cached in the local database to the distributed file system.
In step 16, when the memory size occupied by all the user requests currently cached in the local database is greater than or equal to the predetermined size, the user requests currently cached in the local database are asynchronously stored in the set distributed file system. Therefore, data loss caused by single-point faults (such as faults of a local database) is effectively prevented, and the safety of the data is further guaranteed.
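A minimal sketch of this threshold check, with a hypothetical upload callable standing in for the distributed file system client, might be:

```python
import threading

PREDETERMINED_SIZE = 2 * 1024 * 1024  # e.g. a 2 MB threshold, per the embodiment


def maybe_archive(cached_bytes, upload_to_dfs):
    """If the locally cached user requests occupy at least the
    predetermined size, upload them to the distributed file system on a
    background thread (upload_to_dfs is a hypothetical callable standing
    in for the DFS client)."""
    if len(cached_bytes) < PREDETERMINED_SIZE:
        return None                          # below threshold: no further processing
    t = threading.Thread(target=upload_to_dfs, args=(cached_bytes,))
    t.start()                                # asynchronous: the caller is not blocked
    return t
```

Running the upload on its own thread keeps the statistics path from stalling on DFS latency, which is the point of storing the cache asynchronously.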
Example IV
Fig. 6 is a flowchart of a flow statistics method provided in a fourth embodiment of the present disclosure, and as shown in fig. 6, the flow statistics method provided in the fourth embodiment of the present disclosure is different from the flow statistics methods in the second and third embodiments described above in that: step 13 is followed by steps 17 to 19. The following description is only directed to steps 17 to 19, and other specific descriptions may refer to the descriptions of the flow statistics methods in the second and third embodiments, which are not repeated herein.
Step 17: copying the counted user requests in the current preset time period to a lock-free queue.
In step 17, when the statistical time slice arrives, the user requests within the current preset time period are copied to a configured lock-free queue. The statistical time slice may equal the preset time period; that is, when the current preset time period ends, the counted user requests within that period are copied to the lock-free queue. For example, if the preset time period is 1 second, then after the traffic of the current second has been counted, the user requests of that second are copied to the lock-free queue.
In some embodiments, the current memory concurrency statistics queue uses a bucketing mechanism to cache the user requests within the current preset time period. In step 17, these requests can be copied to the lock-free queue by pointer transfer (swap): the pointers of all bucket units (buckets) are stored in the lock-free queue, which improves the efficiency of data copying.
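The pointer-transfer copy can be sketched as follows, assuming a bucketed statistics queue. The `Bucket` class and the use of a `deque` as a stand-in for the lock-free queue are illustrative assumptions; a production lock-free queue would be built on atomic compare-and-swap primitives rather than Python objects.

```python
from collections import deque

class Bucket:
    def __init__(self):
        self.requests = []  # user requests cached in this bucket unit

class ConcurrentStatsQueue:
    def __init__(self, num_buckets=8):
        self.buckets = [Bucket() for _ in range(num_buckets)]

    def swap_out(self, lock_free_queue):
        # Pointer transfer (swap): hand the bucket references to the
        # lock-free queue and install fresh empty buckets for the next
        # preset time period. No request is copied element by element,
        # which is what makes the handoff cheap.
        old = self.buckets
        self.buckets = [Bucket() for _ in range(len(old))]
        for bucket in old:
            lock_free_queue.append(bucket)
        return old
```

Because only references change hands, the cost of `swap_out` is proportional to the number of buckets, not the number of cached requests.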
Step 18: persist the user requests in the lock-free queue to a remote database using at least one persistence worker thread.
In this embodiment, user requests in the lock-free queue are periodically persisted to a remote database by at least one configured persistence worker thread (persistence-worker).
In some embodiments, at least one persistence worker thread (e.g., 3) concurrently scans the lock-free queue. Each persistence worker thread scans the data in the lock-free queue at regular intervals (e.g., every few minutes); each time the pointer of a bucket unit is found, the data in that bucket unit is popped (POP) and persisted to the remote database.
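A persistence worker's scan-pop-persist loop can be sketched as follows. The `insert_many` method name on `remote_db` is an illustrative assumption (MongoDB drivers expose a similarly named call, but this is not the disclosure's actual code), and a `deque` again stands in for the lock-free queue.

```python
import time

def persistence_worker(lock_free_queue, remote_db, stop_event,
                       scan_interval=0.05):
    """Repeatedly pop bucket pointers and persist their contents."""
    while not stop_event.is_set():
        try:
            bucket = lock_free_queue.popleft()  # POP one bucket pointer
        except IndexError:
            time.sleep(scan_interval)  # queue empty: scan again later
            continue
        remote_db.insert_many(bucket)  # persist the bucket's requests
```

Several such workers can run concurrently on the same queue; because each bucket pointer is popped exactly once, no bucket is persisted twice.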
In some embodiments, the remote database may be MongoDB, a database based on distributed file storage.
Step 19: delete from the local database the user requests that have been successfully persisted to the remote database.
In step 19, for each user request that has been successfully persisted to the remote database, the corresponding user request in the local database is deleted.
In this embodiment, due to the limitations of the local disk and storage space, the counted user requests are periodically persisted to the remote database, and at the same time the persisted data is cleared from the local database, which ensures the scalability of data storage.
Example five
Fig. 7 is a block diagram of a flow statistics device according to the fifth embodiment of the present disclosure. As shown in fig. 7, the flow statistics device is configured to implement the flow statistics method described above and includes: an acquisition module 201, a queue creation and caching module 202, and a statistics module 203.
The acquisition module 201 is configured to acquire user requests of a plurality of IAM users within a current preset time period, where the plurality of IAM users are access users of the OpenAPI, and each IAM user is correspondingly configured with at least one application service based on the OpenAPI.
The queue creation and caching module 202 is configured to create a current memory concurrency statistics queue in the local memory and to cache the user requests of the plurality of IAM users within the current preset time period to the corresponding current memory concurrency statistics queue.
The statistics module 203 is configured to perform traffic statistics on the user requests in the current memory concurrency statistics queue according to the application services to which the user requests belong, so as to count the user request amount corresponding to each application service of each IAM user within the current preset time period.
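The per-service aggregation performed by the statistics module amounts to counting requests keyed by (IAM user, application service). A minimal sketch follows; the request fields `iam_user` and `service` are illustrative assumptions, not the disclosure's actual data model.

```python
from collections import defaultdict

def count_requests(requests):
    """Count the user request amount per (IAM user, application service)
    within one preset time period."""
    counts = defaultdict(int)
    for req in requests:
        counts[(req["iam_user"], req["service"])] += 1
    return dict(counts)
```

Each entry of the result is the user request amount for one application service of one IAM user within the current preset time period.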
In some embodiments, the queue creation and caching module 202 is further configured to cache the counted user requests within the current preset time period to the corresponding local database; specifically, it uses a real-time worker thread to cache the counted user requests to the local database at set time intervals.
Example six
Fig. 8 is a block diagram of a flow statistics device according to the sixth embodiment of the present disclosure. As shown in fig. 8, this device differs from the flow statistics device of the fifth embodiment in that it further includes an asynchronous storage module 204. Only the asynchronous storage module 204 is described below; for other details, refer to the fifth embodiment, which is not repeated here.
The asynchronous storage module 204 is configured to asynchronously store the user requests cached in the local database to the distributed file system when the memory occupied by all user requests currently cached in the local database is greater than or equal to a predetermined size.
Example seven
Fig. 9 is a block diagram of a flow statistics device according to the seventh embodiment of the present disclosure. As shown in fig. 9, this device differs from the flow statistics device of the sixth embodiment in that it further includes a copy module 205, a persistence module 206, and a deletion module 207. Only these three modules are described below; for other details, refer to the sixth embodiment, which is not repeated here.
The copy module 205 is configured to copy the counted user requests within the current preset time period to the lock-free queue.
The persistence module 206 is configured to persist the user request in the lock-free queue to a remote database using at least one persistence worker thread.
The deletion module 207 is configured to delete from the local database the user requests that have been successfully persisted to the remote database.
In some embodiments, the remote database is MongoDB. In some embodiments, the local database is a local RocksDB.
In addition, the flow statistics device provided in the embodiments of the present disclosure is specifically configured to implement the foregoing flow statistics method; for details, refer to the description of that method, which is not repeated here.
Example eight
Fig. 10 is a block diagram of a distributed flow statistics system according to the eighth embodiment of the present disclosure. As shown in fig. 10, the distributed flow statistics system includes a statistics proxy layer 301 and a plurality of flow statistics devices 302.
The statistics proxy layer 301 is configured to receive user requests of a plurality of IAM users from an ingress layer of the OpenAPI and to distribute the user requests of the plurality of IAM users to the plurality of flow statistics devices 302.
Each flow statistics device 302 is configured to implement the flow statistics method provided in any of the above embodiments.
In this embodiment, the ingress layer of the OpenAPI sends the received user requests of the plurality of IAM users to the statistics proxy layer 301 at predetermined time intervals (e.g., every 10 ms), and the statistics proxy layer 301 in turn distributes them to the plurality of flow statistics devices 302 at predetermined time intervals (e.g., every 10 ms). Note that an OpenAPI typically has multiple ingress layers.
In some embodiments, the statistics proxy layer 301 distributes the user requests of the plurality of IAM users received within each predetermined time interval (e.g., 10 ms) to the plurality of flow statistics devices 302 through a load balancer (Load Balancer) using a consistent hash algorithm. For example, suppose that user requests of 1000 IAM users are received within a predetermined time interval and 10 flow statistics devices 302 are provided; the statistics proxy layer 301 consistently hashes the user requests of the 1000 IAM users onto the 10 flow statistics devices 302, so that each flow statistics device 302 may receive the user requests of several IAM users.
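Consistent hashing of this kind can be sketched with a simple hash ring. The virtual-node count and the `"{node}#{i}"` naming scheme below are illustrative assumptions; the disclosure does not specify the hash function or ring layout.

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes, replicas=100):
        # Each statistics device is placed at `replicas` virtual positions
        # on the ring to smooth the distribution of IAM users across devices.
        self.ring = []
        for node in nodes:
            for i in range(replicas):
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def get_node(self, key):
        # Walk clockwise from the key's hash to the first virtual node,
        # wrapping around the ring if necessary.
        idx = bisect.bisect_left(self.ring, (self._hash(key),))
        return self.ring[idx % len(self.ring)][1]
```

The practical benefit over plain modulo hashing is that adding or removing one statistics device remaps only the IAM users adjacent to its virtual positions, rather than nearly all of them.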
It should be noted that the flow statistics device 302 in this embodiment is configured to implement the flow statistics method of any of the above embodiments; for details, refer to the flow statistics method described above, which is not repeated here. It should be understood that the aforementioned local databases are in one-to-one correspondence with the flow statistics devices 302, and each local database is deployed on the machine where its flow statistics device 302 is located.
In this embodiment, the distributed traffic statistics system further includes the aforementioned remote database and the aforementioned distributed file system.
In this embodiment, to make full use of the local memory, each flow statistics device 302 adopts an efficient memory concurrency statistics queue and uses mechanisms such as bucketing and lock-free operation to guarantee extremely high write and computation performance for the whole system. It adopts RocksDB, which is based on an LSM-Tree, as the local database to store the traffic data of the OpenAPI in real time, preventing data loss and unbounded growth of a single database, and ensuring the accuracy of the statistical data while maintaining extremely high statistical performance. Meanwhile, the distributed file system is used to store data asynchronously, preventing data loss caused by a single point of failure and guaranteeing the safety of the whole system. In addition, the traffic data is persisted to a remote database and the persisted data is cleared from the local database, which ensures the scalability of the system's storage.
In addition, the distributed flow statistics system provided in this embodiment may be deployed at the gateway layer of the OpenAPI and used to count the service usage of all users of the entire OpenAPI.
Example nine
Fig. 11 is a block diagram of an electronic device according to the ninth embodiment of the present disclosure. As shown in fig. 11, the electronic device includes: one or more processors 501; a memory 502 having one or more programs stored thereon which, when executed by the one or more processors 501, cause the one or more processors 501 to implement the flow statistics method described above; and one or more I/O interfaces 503, coupled between the processors 501 and the memory 502 and configured to enable information interaction between them.
Furthermore, the embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed, implements the foregoing flow statistics method.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. 
Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (18)

1. A method of traffic statistics, comprising:
acquiring user requests of a plurality of IAM users in a current preset time period, wherein the plurality of IAM users are access users of an OpenAPI, and each IAM user is correspondingly configured with at least one application service based on the OpenAPI;
creating a current memory concurrency statistics queue in a local memory, and caching user requests of a plurality of IAM users within a current preset time period to the corresponding current memory concurrency statistics queue, wherein the current memory concurrency statistics queue caches the user requests of the plurality of IAM users based on a bucketing mechanism;
performing traffic statistics on the user requests in the current memory concurrency statistics queue according to the application services to which the user requests belong, so as to count the user request amount corresponding to each application service of each IAM user within the current preset time period;
and, for each preset time period, upon acquiring user requests of a plurality of IAM users within that preset time period, creating a new memory concurrency statistics queue in the local memory to cache the user requests of the plurality of IAM users within that preset time period.
2. The traffic statistics method according to claim 1, wherein after the statistics of the user request amount corresponding to each application service of each IAM user in the current preset period of time, the traffic statistics method further comprises:
and caching the user request in the counted current preset time period to a corresponding local database.
3. The traffic statistics method according to claim 2, wherein said caching the counted user requests within the current preset time period to the corresponding local database comprises: caching the counted user requests to the local database at set time intervals by using a real-time worker thread.
4. The traffic statistics method according to claim 2, wherein after caching the user request within the counted current preset time period to the corresponding local database, further comprising:
and when the memory size occupied by all the user requests currently cached in the local database is larger than or equal to a preset size, asynchronously storing the user requests cached in the local database to a distributed file system.
5. The traffic statistics method according to claim 2, wherein after the statistics of the user request amount corresponding to each application service of each IAM user in the current preset period, the traffic statistics method further comprises:
copying the user request in the counted current preset time period to a lock-free queue;
using at least one persistent working thread to persist the user request in the lock-free queue to a remote database;
and deleting the user request which is successfully persisted to the remote database in the local database.
6. The traffic statistics method according to claim 5, wherein the remote database is MongoDB.
7. The traffic statistics method according to any of claims 2-6, wherein the local database is a local RocksDB.
8. A flow statistics apparatus, comprising:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring user requests of a plurality of IAM users in a current preset time period, the plurality of IAM users are access users of an OpenAPI, and each IAM user is correspondingly configured with at least one application service based on the OpenAPI;
the queue creating and caching module is used for creating a current memory concurrency statistical queue in the local memory, caching user requests of a plurality of IAM users in a current preset time period to the corresponding current memory concurrency statistical queue, and caching the user requests of the plurality of IAM users by the current memory concurrency statistical queue based on a barrel dividing mechanism;
the statistics module is used for carrying out flow statistics on the user requests in the current memory concurrency statistics queue according to the application services to obtain statistics on the user request quantity corresponding to each application service of each IAM user in the current preset time period;
wherein the queue creation and caching module is further configured to: for each preset time period, upon acquiring user requests of a plurality of IAM users within that preset time period, create a new memory concurrency statistics queue in the local memory to cache the user requests of the plurality of IAM users within that preset time period.
9. The traffic statistics apparatus according to claim 8, wherein the queue creation and caching module is further configured to cache the user request within the counted current preset time period to the corresponding local database.
10. The traffic statistics device according to claim 9, wherein the queue creation and caching module is specifically configured to cache the counted user requests to the local database at set time intervals using a real-time worker thread.
11. The traffic statistics device according to claim 9, further comprising an asynchronous storage module configured to asynchronously store the user requests cached in the local database to the distributed file system when the memory size occupied by all user requests currently cached in the local database is greater than or equal to a predetermined size.
12. The traffic statistics device according to claim 9, further comprising a copy module, a persistence module, and a delete module;
the copying module is used for copying the user request in the counted current preset time period to the lock-free queue;
the persistence module is used for persistence of the user request in the lock-free queue to a remote database by utilizing at least one persistence work thread;
and the deletion module is configured to delete from the local database the user requests that have been successfully persisted to the remote database.
13. The traffic statistics device of claim 12, wherein the remote database is MongoDB.
14. The traffic statistics device according to any of claims 9-13, wherein the local database is a local RocksDB.
15. A distributed traffic statistics system comprising a statistics proxy layer and a plurality of traffic statistics devices;
the statistics proxy layer is configured to receive user requests of a plurality of IAM users from an ingress layer of an OpenAPI and to distribute the user requests of the plurality of IAM users to the plurality of traffic statistics devices;
wherein each of said traffic statistics devices is configured to implement the traffic statistics method of any one of claims 1-7.
16. The distributed traffic statistics system as recited in claim 15, wherein the statistics proxy layer is operable to distribute user requests of a plurality of IAM users to a plurality of traffic statistics devices via a consistent hashing algorithm.
17. An electronic device, comprising:
one or more processors;
a memory having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to implement the flow statistics method of any of claims 1-7;
one or more I/O interfaces coupled between the processor and the memory configured to enable information interaction of the processor with the memory.
18. A computer readable medium having stored thereon a computer program, wherein the computer program when executed implements the flow statistics method of any of claims 1-7.
CN202010097823.6A 2020-02-17 2020-02-17 Flow statistics method and device and distributed flow statistics system Active CN113268518B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010097823.6A CN113268518B (en) 2020-02-17 2020-02-17 Flow statistics method and device and distributed flow statistics system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010097823.6A CN113268518B (en) 2020-02-17 2020-02-17 Flow statistics method and device and distributed flow statistics system

Publications (2)

Publication Number Publication Date
CN113268518A CN113268518A (en) 2021-08-17
CN113268518B true CN113268518B (en) 2024-03-29

Family

ID=77227533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010097823.6A Active CN113268518B (en) 2020-02-17 2020-02-17 Flow statistics method and device and distributed flow statistics system

Country Status (1)

Country Link
CN (1) CN113268518B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115150171B (en) * 2022-06-30 2023-11-10 北京天融信网络安全技术有限公司 Flow statistics method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2708415A1 (en) * 2010-06-21 2011-12-21 Radian6 Technologies Inc. Referred internet traffic analysis system and method
CN106230662A (en) * 2016-08-01 2016-12-14 北京小米移动软件有限公司 Network flux statistical method and device
CN106303751A (en) * 2015-05-18 2017-01-04 中兴通讯股份有限公司 A kind of realization method and system orienting flow bag
CN108632164A (en) * 2018-08-17 2018-10-09 四川新网银行股份有限公司 Open platform gateway intelligence flow control method based on time series forecasting
CN109039817A (en) * 2018-08-03 2018-12-18 北京京东金融科技控股有限公司 A kind of information processing method and device for traffic monitoring
CN109194584A (en) * 2018-08-13 2019-01-11 中国平安人寿保险股份有限公司 A kind of flux monitoring method, device, computer equipment and storage medium
CN109889401A (en) * 2019-01-22 2019-06-14 金蝶软件(中国)有限公司 Flow statistical method, device, computer equipment and storage medium
CN110087226A (en) * 2018-01-25 2019-08-02 中兴通讯股份有限公司 Flow statistical method, device, storage medium and electronic device
CN110224943A (en) * 2019-05-29 2019-09-10 掌阅科技股份有限公司 Traffic service current-limiting method, electronic equipment and computer storage medium based on URL

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8880524B2 (en) * 2009-07-17 2014-11-04 Apple Inc. Scalable real time event stream processing

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2708415A1 (en) * 2010-06-21 2011-12-21 Radian6 Technologies Inc. Referred internet traffic analysis system and method
CN106303751A (en) * 2015-05-18 2017-01-04 中兴通讯股份有限公司 A kind of realization method and system orienting flow bag
CN106230662A (en) * 2016-08-01 2016-12-14 北京小米移动软件有限公司 Network flux statistical method and device
CN110087226A (en) * 2018-01-25 2019-08-02 中兴通讯股份有限公司 Flow statistical method, device, storage medium and electronic device
CN109039817A (en) * 2018-08-03 2018-12-18 北京京东金融科技控股有限公司 A kind of information processing method and device for traffic monitoring
CN109194584A (en) * 2018-08-13 2019-01-11 中国平安人寿保险股份有限公司 A kind of flux monitoring method, device, computer equipment and storage medium
CN108632164A (en) * 2018-08-17 2018-10-09 四川新网银行股份有限公司 Open platform gateway intelligence flow control method based on time series forecasting
CN109889401A (en) * 2019-01-22 2019-06-14 金蝶软件(中国)有限公司 Flow statistical method, device, computer equipment and storage medium
CN110224943A (en) * 2019-05-29 2019-09-10 掌阅科技股份有限公司 Traffic service current-limiting method, electronic equipment and computer storage medium based on URL

Also Published As

Publication number Publication date
CN113268518A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
US10983868B2 (en) Epoch based snapshot summary
US10129118B1 (en) Real time anomaly detection for data streams
JP6716727B2 (en) Streaming data distributed processing method and apparatus
CN109918382A (en) Data processing method, device, terminal and storage medium
CN109101580A (en) A kind of hot spot data caching method and device based on Redis
US7752625B2 (en) Caching resources requested by applications
CN114625767A (en) Data query method, device, equipment and readable medium
US11178197B2 (en) Idempotent processing of data streams
CN113268518B (en) Flow statistics method and device and distributed flow statistics system
CN106649530B (en) Cloud detail query management system and method
JP2018513454A (en) Efficient performance of insert and point query operations in the column store
CN109062717A (en) Data buffer storage and caching disaster recovery method and system, caching system
JP6406254B2 (en) Storage device, data access method, and data access program
CN115174158B (en) Cloud product configuration checking method based on multi-cloud management platform
CN113835613B (en) File reading method and device, electronic equipment and storage medium
KR102202645B1 (en) Data Sharing Method for Relational Edge Servers
US11321015B2 (en) Aggressive intent write request cancellation
CN113901018A (en) Method and device for identifying file to be migrated, computer equipment and storage medium
CN109976896B (en) Service re-ranking processing method and device
CN113849119A (en) Storage method, storage device, and computer-readable storage medium
CN113051323A (en) Water environment big data exchange method
US20210176215A1 (en) Attribute-based quasi-identifier discovery
CN116010677B (en) Spatial index method and device and electronic equipment thereof
CN112968980B (en) Probability determination method and device, storage medium and server
CN110213393B (en) Message processing method and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant