WO2017133539A1 - Service data processing method, device and system - Google Patents
- Publication number: WO2017133539A1 (PCT/CN2017/072185)
- Authority: WO, WIPO (PCT)
- Prior art keywords: distributed, statistics, users, data, real
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/563—Data redirection of data network streams
Definitions
- the invention relates to the field of information monitoring, in particular to a method, device and system for processing business data.
- In some application scenarios, certain information about service messages must be counted and its geographical distribution determined. For example, when any service may carry messages from malicious sources, the distribution of those malicious messages across regions must be determined so that key services can be monitored. Similarly, when malicious traffic in a service may surge, the regions where the surge occurs must be identified so that the strategy for that service can be adjusted quickly and further spread suppressed. In addition, supporting offline enforcement against malicious sources also requires counting the target information of service messages and determining its geographical distribution.
- an embodiment of the present invention provides a method for processing service data, and a device and system for processing service data for determining a geographical distribution of information related to a service message.
- the distributed computing server receives service messages and their attribute information from the service system; the attribute information includes a user identifier and source geographical location information;
- at each set first time interval, the distributed computing server performs a distributed, deduplicated count of users at different geographical levels, based on the source geographical location information and user identifier of each service message from the service system, and obtains statistics on the number of users in each region; and
- the statistics on the number of users in each region obtained for the first time interval are stored in a database.
- the query server receives a query request from a user, obtains, according to the query request, the statistics on the number of users in each region that the distributed computing server produced for the first time interval, and displays those statistics;
- these statistics are obtained as follows: at each set first time interval, the distributed computing server performs a distributed, deduplicated count of users at different geographical levels, based on the source geographical location information and user identifier of each service message from the service system, and obtains statistics on the number of users in each region.
- the apparatus for processing service data provided in the embodiment of the present invention includes: at least one computing server and at least one summary server; wherein
- Each computing server is configured to receive service messages and their attribute information from the service system, where the attribute information includes a user identifier and source geographical location information; and, at each first time interval, to take the service messages that the source geographical location information assigns to one area at one of the geographical levels, deduplicate them by user identifier, and obtain the user count for that area;
- Each summary server is configured to summarize, at each first time interval, the user counts that different computing servers produced for the same area, and obtain statistics on the number of users in each region.
- a request receiving module configured to receive a query request from a user
- a query module configured to obtain, according to the query request, statistics of a number of users distributed by a distributed computing server according to a first time interval from a database
- a display module configured to display statistics queried by the query module.
- a real-time retrieval analysis server configured to receive service messages and their attribute information from the service system, and to store them using nested column storage and bitmaps, where the attribute information includes a user identifier and source geographical location information; and to determine in real time, from the source geographical location information in the stored service messages and their attribute information, statistics on the message volume in each region; and
- a distributed computing server configured to receive service messages and their attribute information from the service system or from the real-time retrieval analysis server; to perform, at each set first time interval, a distributed, deduplicated count of users at different geographical levels, based on the source geographical location information and user identifier of each service message, obtaining statistics on the number of users in each region; and to store the statistics obtained for the first time interval in a database.
- a deduplicated user count can be computed at different geographical levels from the source geographical location information and user identifier of each service message from the service system, so that statistics on the number of users in each region are determined quickly and then stored in a database for query and display.
- FIG. 1 is a schematic structural diagram of an implementation environment according to various embodiments of the present invention.
- FIGS. 2A and 2B are respectively schematic structural diagrams of a query server according to an embodiment of the present invention.
- FIG. 3 is an exemplary flowchart of a method for determining a geographical distribution of target information of a service message according to an embodiment of the present invention.
- FIG. 4 and FIG. 5 are schematic flowcharts of methods for displaying the geographical distribution of target information of a service message according to an embodiment of the present invention.
- FIG. 1 is a schematic structural diagram of an implementation environment according to various embodiments of the present invention.
- the implementation environment includes a service system 101, a real-time retrieval analysis server 102, a distributed computing server 103, a database 104, and a query server 105.
- the service system 101 is configured to provide a service message and attribute information of the service message.
- the service message may be a malicious message filtered by the service system, or may be a message for setting a service to be monitored.
- the specific type of the service message is not limited herein.
- the attribute information of the service message may include a user identifier and source location information, and the like.
- the real-time retrieval analysis server 102 is configured to receive the service messages and their attribute information from the service system 101 in real time, and to organize and store them using nested column storage and bitmaps.
- This storage structure allows key data to be located quickly when analyzing large-scale complex data, for example at terabyte scale, so that data can be accessed and analyzed within seconds. The real-time retrieval analysis server 102 can therefore determine in real time, from the source geographical location information in the stored service messages and their attribute information, statistics on the message volume in each region.
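The bitmap idea can be illustrated with a toy index. This is only a sketch of the general technique, not the storage layout of any particular product: the class `RegionBitmapIndex` and its methods are hypothetical names, and each region's bitmap is simply held in a Python integer with one bit per stored message row.

```python
from collections import defaultdict


class RegionBitmapIndex:
    """Toy bitmap index: one bitmap per region, one bit per stored message row."""

    def __init__(self):
        self.rows = []                   # message column; row id = list position
        self.bitmaps = defaultdict(int)  # region -> bitmap held in a Python int

    def add(self, message, region):
        # Store the message and set the bit for its row in the region's bitmap.
        row_id = len(self.rows)
        self.rows.append(message)
        self.bitmaps[region] |= 1 << row_id

    def message_count(self, region):
        # The number of set bits equals the number of messages from the region.
        return bin(self.bitmaps.get(region, 0)).count("1")

    def count_any(self, regions):
        # OR the bitmaps to count messages that come from any listed region.
        merged = 0
        for region in regions:
            merged |= self.bitmaps.get(region, 0)
        return bin(merged).count("1")
```

Counting messages for a region then needs only a bit count over one bitmap rather than a scan of every stored message, which is the kind of fast positioning the paragraph above describes.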
- the real-time retrieval analysis server 102 can be a retrieval and analysis platform such as the Hermes real-time retrieval analysis server, which combines retrieval with data analysis on top of search-engine technology to support both targeted and fuzzy retrieval analysis over terabyte-scale data.
- the real-time retrieval analysis server 102 can also be any other real-time retrieval analysis server capable of determining in real time, from the source geographical location information in the stored service messages and their attribute information, statistics on the message volume in each region.
- the distributed computing server 103 is configured to receive service messages and their attribute information from the service system 101 or from the real-time retrieval analysis server 102;
- to perform, at each set first time interval, a distributed, deduplicated count of users at different geographical levels, based on the source geographical location information and user identifier of each service message from the service system 101, obtaining statistics on the number of users in each region;
- and to store the statistics on the number of users in each region obtained for the first time interval in the database 104.
- the first time interval can be 1 hour, meaning that the statistics are computed once every hour. The first time interval may also be another length, such as half an hour, 45 minutes, 1.5 hours, or 2 hours.
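One way to assign a message to its time interval is to round its timestamp down to the start of a fixed-length window. The following is a minimal sketch under assumptions of its own: the function name `interval_start` is hypothetical, timestamps are timezone-aware, and windows are aligned to the Unix epoch (so a 45-minute window does not necessarily start on the hour).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def interval_start(ts, interval):
    """Return the start of the fixed-length interval that contains ts."""
    elapsed = ts - EPOCH
    # timedelta // timedelta yields the number of whole intervals elapsed.
    return EPOCH + (elapsed // interval) * interval
```

With `interval=timedelta(hours=1)` this groups messages by hour; passing `timedelta(days=1)` gives daily windows in UTC.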
- the distributed computing server 103 can be a Hadoop computing cluster.
- A Hadoop computing cluster is a distributed system infrastructure. Users can develop distributed programs without understanding the underlying details of the distributed system, and make full use of the cluster for high-speed computation and storage.
- the distributed computing server 103 can also be other computing clusters that can implement the above functions.
- the distributed computing server 103 can be a server cluster that includes multiple servers. Within one processing run, these servers can be used for single-machine computation or for multi-machine summarization, respectively.
- Each server used for computation (referred to as a computing server) is configured to receive service messages and their attribute information from the service system, where the attribute information includes a user identifier and source geographical location information; and, at each first time interval, to take the service messages that the source geographical location information assigns to one area at one of the geographical levels, deduplicate them by user identifier, and obtain the user count for that area.
- For example, when the geographical levels are city, province, and country, at least one computing server deduplicates the service messages of province A by user identifier to obtain the user count for province A, and at least one computing server deduplicates the service messages of city B by user identifier to obtain the user count for city B.
- Each server used for summarization (referred to as a summary server) summarizes the user counts that different computing servers produced for the same area, and obtains statistics on the number of users in each region.
- For example, one summary server summarizes the province A user counts from different computing servers to obtain the statistics on the number of users in province A, and at least one summary server summarizes the city B user counts from different computing servers to obtain the statistics on the number of users in city B.
- Alternatively, a single summary server can be used.
- the geographical levels can also be divided into four levels: county, region, province, and country. The division is not specifically limited here.
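The split between computing servers and summary servers can be sketched as two functions. This is an illustration under stated assumptions, not the patented implementation: the names `compute_partial` and `summarize`, the message fields `user_id`, `city`, `province`, and `country`, and the choice to have computing servers pass sets of user identifiers to the summary step are all hypothetical. Passing sets (rather than plain counts) is a design choice of this sketch so that a user seen by two computing servers is still counted once.

```python
from collections import defaultdict

LEVELS = ("city", "province", "country")  # assumed three-level hierarchy


def compute_partial(messages):
    """Computing-server step: distinct user ids per (level, area)."""
    partial = defaultdict(set)
    for msg in messages:
        for level in LEVELS:
            partial[(level, msg[level])].add(msg["user_id"])
    return partial


def summarize(partials):
    """Summary-server step: union the sets for the same area, then count."""
    merged = defaultdict(set)
    for partial in partials:
        for key, users in partial.items():
            merged[key] |= users
    return {key: len(users) for key, users in merged.items()}
```

If each area were handled by exactly one computing server, the summary step could add plain counts instead; the set union above only matters when the same area is spread over several servers.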
- the distributed computing server 103 may be further configured to perform, at each set second time interval, a distributed, deduplicated count of users at different geographical levels, based on the source geographical location information and user identifier of each service message from the service system, obtaining statistics on the number of users in each region; to perform a distributed message-volume count at different geographical levels, based on the source geographical location information of each service message, obtaining statistics on the message volume in each region; and to store the user-count and message-volume statistics obtained for the second time interval in the database 104.
- the second time interval is greater than the first time interval.
- the second time interval may be one day, that is, the statistics on the message volume and the number of users in each region are computed once a day.
- the second time interval may also be another length, for example 2 days, 3 days, or 4 days.
- each computing server is further configured, at each second time interval, to take the service messages that the source geographical location information assigns to one area at one of the geographical levels, deduplicate them by user identifier, and obtain the user count for that area; and to count the service messages of that area to obtain its message volume.
- each summary server is further configured, at each second time interval, to summarize the user counts that different computing servers produced for the same area, obtaining statistics on the number of users in each region; and to summarize the message volumes that different computing servers produced for the same area, obtaining statistics on the message volume in each region.
- For example, at least one computing server counts the service messages of province A to obtain the message volume for province A, and at least one computing server counts the service messages of city B to obtain the message volume for city B.
- At least one summary server summarizes the province A message volumes from different computing servers to obtain the message-volume statistics for province A, and at least one summary server summarizes the city B message volumes from different computing servers to obtain the message-volume statistics for city B.
- Alternatively, a single summary server can be used.
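Message volume is simpler to summarize than distinct users, because per-server counts for the same area can be added directly. A minimal sketch, assuming hypothetical function names and the same hypothetical message fields (`city`, `province`, `country`):

```python
from collections import Counter


def count_messages(messages, levels=("city", "province", "country")):
    """Computing-server step: message volume per (level, area)."""
    counts = Counter()
    for msg in messages:
        for level in levels:
            counts[(level, msg[level])] += 1
    return counts


def summarize_counts(partials):
    """Summary-server step: add up the counts reported for the same area."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts, it does not overwrite
    return total
```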
- the database 104 can be a Cloud Database (CDB).
- the main features of the CDB include: high-performance, highly reliable MySQL services; and integration of automated management tools to minimize developer involvement in deployment, monitoring, capacity expansion, and failover.
- database 104 can also be a traditional database or a database integrated on a compute cluster server.
- the query server 105 is configured to receive a query request from the user and determine whether the statistics to be queried are real-time data or historical data.
- If they are real-time data, it obtains from the database 104 the statistics on the number of users in each region for the most recent first time interval, obtains from the real-time retrieval analysis server 102 the statistics on the message volume in each region determined in real time, and displays both.
- If they are historical data, it queries the database 104 for the user-count and message-volume statistics of the corresponding second time interval and displays them.
- the query server 105 may also, by default, obtain from the database 104 the statistics on the number of users in each region for the most recent first time interval, or obtain from the database 104, according to the query request, the statistics that the distributed computing server 103 produced for the corresponding first time interval. And/or, it may obtain from the real-time retrieval analysis server 102 the statistics on the message volume in each region determined in real time.
- the response delay of an initial query request may be less than 10 seconds, and the response delay of subsequent query requests may be less than 5 seconds. That is, the technical solution can display the geographical distribution of the service message volume and the number of users in real time with low delay.
- a processing system for service data proposed in the embodiment of the present invention may include only the distributed computing server 103 described above.
- Still another processing system for service data proposed in the embodiments of the present invention may include only the real-time retrieval analysis server 102 described above.
- a further processing system for service data proposed in the embodiments of the present invention may include the real-time retrieval analysis server 102 and the distributed computing server 103 described above.
- the processing system of each of the foregoing service data may further include a query server 105.
- a processing system for service data proposed in the embodiment of the present invention may include only the above-mentioned query server 105.
- Still another processing system for service data proposed in the embodiments of the present invention may include the distributed computing server 103 and the query server 105 described above.
- a further processing system for service data proposed in the embodiments of the present invention may include the real-time retrieval analysis server 102 and the query server 105 described above.
- a further processing system for service data proposed in the embodiments of the present invention may include the real-time retrieval analysis server 102, the distributed computing server 103, and the query server 105 described above.
- the query server 105 in the embodiment of the present invention may have various specific implementation manners, and FIG. 2A and FIG. 2B respectively show one of them.
- the query server 105 can include a request receiving module 201, a query module 202, and a display module 203.
- the request receiving module 201 is configured to receive a query request from a user.
- the query module 202 is configured to obtain from a database, according to the query request, the statistics on the number of users in each region that the distributed computing server produced for the first time interval.
- the display module 203 is configured to display the statistics queried by the query module.
- the query server 105 can include a request receiving module 201, a determining module 204, a query module 202, and a display module 203.
- the request receiving module 201 is configured to receive a query request from a user.
- the determining module 204 is configured to determine, according to the query request, whether the statistics to be queried are real-time data or historical data.
- the query module 202 is configured to: when the statistics to be queried are real-time data, obtain from a database the statistics on the number of users in each region that the distributed computing server produced for the most recent first time interval; and when the statistics to be queried are historical data, query the database for the user-count and message-volume statistics that the distributed computing server produced for the corresponding second time interval.
- the display module 203 is configured to display the statistics queried by the query module.
- the query module 202 is further configured, when the statistics to be queried are real-time data, to obtain from a real-time retrieval analysis server the statistics on the message volume in each region that the server determines in real time.
- a method for processing service data is also proposed in the embodiment of the present invention, and the method can be implemented in the implementation environment shown in FIG. 1.
- FIG. 3 is an exemplary flowchart of a method for processing service data according to an embodiment of the present invention. This method can be applied to distributed computing servers. As shown in FIG. 3, the method may include the following steps:
- Step 301: The distributed computing server receives service messages and their attribute information from the service system, where the attribute information includes a user identifier and source geographical location information.
- Step 302: At each set first time interval, the distributed computing server performs a distributed, deduplicated count of users at different geographical levels, based on the source geographical location information and user identifier of each service message from the service system, and obtains statistics on the number of users in each region.
- different geographical levels can be divided according to actual needs. For example, they can be divided into three levels: city, province, and country, or can be divided into four levels: county, region, province, and country.
- Step 303: The distributed computing server stores the statistics on the number of users in each region obtained for the first time interval in a database.
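The storage step can be sketched with an in-process database. The example below is an assumption-laden illustration: it uses Python's built-in `sqlite3` as a stand-in for the cloud database, and the function name `store_user_stats`, the table name `user_stats`, and its columns are hypothetical. Keying rows by interval, level, and area means that re-running an interval replaces its earlier result instead of duplicating it.

```python
import sqlite3


def store_user_stats(conn, interval_start, stats):
    """Persist {(level, area): user_count} for one time interval."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_stats ("
        "interval_start TEXT, level TEXT, area TEXT, user_count INTEGER, "
        "PRIMARY KEY (interval_start, level, area))"
    )
    rows = [
        (interval_start, level, area, count)
        for (level, area), count in stats.items()
    ]
    # INSERT OR REPLACE overwrites an existing row with the same primary key.
    conn.executemany(
        "INSERT OR REPLACE INTO user_stats VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()
```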
- the method may further include the following steps 304 and 305.
- Step 304: At each set second time interval, the distributed computing server performs a distributed, deduplicated count of users at different geographical levels, based on the source geographical location information and user identifier of each service message from the service system, obtaining statistics on the number of users in each region; and performs a distributed message-volume count at different geographical levels, based on the source geographical location information of each service message, obtaining statistics on the message volume in each region.
- the second time interval is greater than the first time interval.
- Step 305: The statistics on the number of users and the message volume in each region obtained for the second time interval are stored in a database.
- Performing the distributed, deduplicated count of users at different geographical levels to obtain statistics on the number of users in each region may include: each computing server in the distributed computing server takes the service messages that the source geographical location information assigns to one area at one of the geographical levels, deduplicates them by user identifier, and obtains the user count for that area; each summary server in the distributed computing server summarizes the user counts that different computing servers produced for the same area, and obtains statistics on the number of users in each region.
- Performing the distributed message-volume count at different geographical levels to obtain statistics on the message volume in each region may include: each computing server in the distributed computing server counts the service messages that the source geographical location information assigns to one area at one of the geographical levels, and obtains the message volume for that area; each summary server in the distributed computing server summarizes the message volumes that different computing servers produced for the same area, and obtains statistics on the message volume in each region.
- the foregoing method may further include: the real-time retrieval analysis server receives service messages and their attribute information from the service system, stores them using nested column storage and bitmaps, and determines in real time, from the source geographical location information in the stored service messages and their attribute information, statistics on the message volume in each region.
- FIG. 4 and FIG. 5 are respectively schematic flowcharts of a method for processing service data according to an embodiment of the present invention. This method can be applied to the query server.
- a method for processing service data provided in this embodiment may be as shown in FIG. 4, and includes the following steps:
- Step 401: The query server receives a query request from the user.
- Step 402: According to the query request, the query server obtains from a database the statistics on the number of users in each region that the distributed computing server produced for the first time interval.
- In this step, if the statistics for the most recent first time interval are to be displayed, the statistics that the distributed computing server produced for the most recent first time interval are obtained from the database; if the statistics for an earlier first time interval are to be displayed, the statistics that the distributed computing server produced for the corresponding first time interval are obtained from the database.
- Step 403: Display the statistics on the number of users in each region.
- the method shown in FIG. 4 may further include: obtaining from the real-time retrieval analysis server the statistics on the message volume in each region that it determines in real time, and displaying those statistics.
- the method for displaying the geographical distribution of the target information of the service message provided in this embodiment may be as shown in FIG. 5, and includes the following steps:
- Step 501: The query server receives a query request from the user.
- Step 502: The query server determines whether the statistics to be queried are real-time data or historical data. If they are real-time data, step 503 is performed; otherwise, step 504 is performed.
- Step 503: Obtain from the database the statistics on the number of users in each region that the distributed computing server produced for the most recent first time interval, and display them.
- Step 504: Query the database for the user-count and message-volume statistics that the distributed computing server produced for the corresponding second time interval, and display them.
- the method shown in FIG. 5 may further include: when the statistics to be queried are real-time data, obtaining from the real-time retrieval analysis server the statistics on the message volume in each region that it determines in real time, and displaying them.
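The branching in steps 502 to 504 can be sketched as a single routing function. Everything named here is hypothetical and chosen for illustration: the function `handle_query`, the request keys `kind` and `date`, and the use of plain dictionaries in place of the database and of a callable in place of the real-time retrieval analysis server.

```python
def handle_query(request, hourly_user_stats, daily_stats,
                 realtime_message_counts=None):
    """Route a query to real-time or historical statistics.

    hourly_user_stats: {interval_start: {area: user_count}}, with
        interval_start strings that sort chronologically.
    daily_stats: {date: {"user_counts": {...}, "message_counts": {...}}}.
    realtime_message_counts: optional callable returning {area: count}.
    """
    if request["kind"] == "realtime":
        # Real-time branch: user counts of the most recent first interval.
        latest = max(hourly_user_stats)
        result = {"interval": latest,
                  "user_counts": hourly_user_stats[latest]}
        if realtime_message_counts is not None:
            # Optional extension: message volume determined in real time.
            result["message_counts"] = realtime_message_counts()
        return result
    # Historical branch: user counts and message volume of the requested day.
    day = request["date"]
    return {"interval": day, **daily_stats[day]}
```

A display layer would then render the returned dictionary, for example as a map or table of regions.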
- each of the embodiments of the present invention can be implemented by a data processing program executed by a data processing device such as a computer.
- the data processing program constitutes the present invention.
- a data processing program, usually stored in a storage medium, is executed by reading the program directly from the storage medium or by installing or copying the program to a storage device (such as a hard disk and/or a memory) of the data processing device. Therefore, such a storage medium also constitutes the present invention.
- the storage medium can use any type of recording method, such as paper storage medium (such as paper tape, etc.), magnetic storage medium (such as floppy disk, hard disk, flash memory, etc.), optical storage medium (such as CD-ROM, etc.), magneto-optical storage medium (such as MO, etc.).
- the program code read out from the storage medium may be written into a memory provided on an expansion board inserted into the computer, or into a memory provided in an expansion unit connected to the computer; instructions based on the program code then cause a processor or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations, thereby realizing the functions of any of the above embodiments.
- the processor may include one or more processing cores.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
- Computer And Data Communications (AREA)
Abstract
Provided are a service data processing method, device and system. The method comprises: a distributed computing server receives a service message and attribute information thereof from a service system, wherein the attribute information comprises a user identifier and source geographical location information; in accordance with a specified first time interval, the distributed computing server performs, on the basis of different geographical levels, distributed user quantity deduplication computation on the service message from the service system according to the source geographical location information and the user identifier thereof, so as to obtain statistical data of user quantities in different places; and the distributed computing server stores, to a database, the statistical data of the user quantities in different places obtained according to the first time interval. The technical solution in the present invention can quickly determine geographical distribution of service message-related information.
Description
The invention relates to the field of information monitoring, and in particular to a method, device and system for processing service data.
Background of the invention
In some application scenarios, certain information about service messages, that is, service data such as message volume and/or number of users, must be counted by region and its geographical distribution determined. For example, when a service may contain sources of malicious messages, the distribution of those malicious messages across regions must be determined so that key services can be monitored. As another example, when malicious traffic in a service may surge, the regions in which it is surging must be identified so that the policy for that service can be adjusted quickly and further spread suppressed. In addition, when supporting offline enforcement aimed at eliminating malicious sources, the geographical distribution of the target information of service messages must likewise be counted and determined.
Summary of the invention
In view of this, embodiments of the present invention provide, in one aspect, a method for processing service data and, in another aspect, a device and a system for processing service data, for determining the geographical distribution of information related to service messages.
A method for processing service data provided in an embodiment of the present invention includes:
a distributed computing server receiving service messages and their attribute information from a service system, the attribute information including a user identifier and source geographic location information;
the distributed computing server performing, at a set first time interval, distributed deduplicated user counting on the service messages from the service system at each of several geographical levels, according to their source geographic location information and user identifiers, to obtain statistics on the number of users in each region; and
storing the statistics on the number of users in each region obtained at the first time interval in a database.
A method for processing service data provided in an embodiment of the present invention includes:
a query server receiving a query request from a user, obtaining from a database, according to the query request, the statistics on the number of users in each region that a distributed computing server obtained at a first time interval, and displaying those statistics;
wherein the statistics on the number of users in each region that the distributed computing server obtained at the first time interval are produced as follows: at the set first time interval, the distributed computing server performs distributed deduplicated user counting on service messages from a service system at each of several geographical levels, according to their source geographic location information and user identifiers, to obtain the statistics on the number of users in each region.
A device for processing service data provided in an embodiment of the present invention includes at least one computing server and at least one summary server, wherein:
each computing server is configured to receive service messages and their attribute information from a service system, the attribute information including a user identifier and source geographic location information; and, at a first time interval, for the service messages of one region at one of the geographical levels, selected according to the source geographic location information of each service message, to perform deduplicated user counting according to the user identifier of each service message, obtaining user-count statistics for that region;
each summary server is configured to aggregate, at the first time interval, the user-count results for the same region from different computing servers, obtaining statistics on the number of users in each region.
A device for processing service data provided in an embodiment of the present invention includes:
a request receiving module, configured to receive a query request from a user;
a query module, configured to obtain from a database, according to the query request, the statistics on the number of users in each region that a distributed computing server obtained at a first time interval; and
a display module, configured to display the statistics retrieved by the query module.
A system for processing service data provided in an embodiment of the present invention includes:
a real-time search and analysis server, configured to receive service messages and their attribute information from a service system and to store them using nested columnar storage and bitmaps, the attribute information including a user identifier and source geographic location information; and to determine in real time, from the source geographic location information in the stored service messages and attribute information, statistics on the message volume in each region; and
a distributed computing server, configured to receive the service messages and their attribute information, which originate from the service system, either from the service system or from the real-time search and analysis server; to perform, at a set first time interval, distributed deduplicated user counting on those service messages at each of several geographical levels, according to their source geographic location information and user identifiers, to obtain statistics on the number of users in each region; and to store the statistics obtained at the first time interval in a database.
It can be seen that, in the embodiments of the present invention, a distributed computing server performs distributed deduplicated user counting on service messages from the service system at each of several geographical levels, according to their source geographic location information and user identifiers, so that the statistics on the number of users in each region can be determined conveniently and quickly; the statistics can then be stored in a database for querying and display.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic structural diagram of an implementation environment involved in the embodiments of the present invention;
FIG. 2A and FIG. 2B are each a schematic structural diagram of a query server according to an embodiment of the present invention;
FIG. 3 is an exemplary flowchart of a method for determining the geographical distribution of target information of service messages according to an embodiment of the present invention;
FIG. 4 and FIG. 5 are each a schematic diagram of a method for displaying the geographical distribution of target information of service messages according to an embodiment of the present invention.
Mode for carrying out the invention
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to embodiments.
FIG. 1 is a schematic structural diagram of an implementation environment involved in the embodiments of the present invention. As shown in FIG. 1, the implementation environment includes a service system 101, a real-time search and analysis server 102, a distributed computing server 103, a database 104 and a query server 105.
The service system 101 is configured to provide service messages and the attribute information of those messages. A service message may be a malicious message filtered out by the service system, or a message of a service designated for monitoring; the specific type of service message is not limited here. The attribute information of a service message may include a user identifier, source geographic location information, and so on.
The real-time search and analysis server 102 is configured to receive service messages and their attribute information from the service system 101 in real time, and may organize and store them using nested columnar storage, bitmaps (bit-map) and similar techniques. When large-scale complex data, such as terabyte-scale complex data, is analyzed, this storage structure allows key data to be located quickly, so that data access and analysis can complete within seconds. The real-time search and analysis server 102 can therefore determine, quickly and with low latency, real-time statistics on the message volume in each region from the source geographic location information in the stored service messages and attribute information.
The real-time search and analysis server 102 may be a search and analysis platform such as a Hermes real-time search and analysis server. Built on search engine technology, the Hermes real-time search and analysis server combines search with data analysis and can perform targeted and fuzzy search and analysis over trillion-scale data within seconds. Alternatively, the real-time search and analysis server 102 may be any other real-time search and analysis server that provides the function described above, namely determining, quickly and with low latency, real-time statistics on the message volume in each region from the source geographic location information in the stored service messages and attribute information.
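The text names bitmaps as one of the storage techniques but does not describe their layout. As a rough illustration only, and not a description of how Hermes or server 102 actually stores data, the following Python sketch shows the general idea of a bitmap index: one bitmap per distinct column value, so that a per-region message count becomes a population count. The class `BitmapColumn` and the sample values are hypothetical.

```python
from collections import defaultdict


class BitmapColumn:
    """Hypothetical bitmap index over one column: one integer bitmap per value."""

    def __init__(self):
        # value -> bitmap; bit i is set when row i holds that value
        self.bitmaps = defaultdict(int)
        self.rows = 0

    def append(self, value):
        # Set the bit of the new row in the bitmap that belongs to its value.
        self.bitmaps[value] |= 1 << self.rows
        self.rows += 1

    def count(self, value):
        # The number of rows holding `value` is the popcount of its bitmap.
        return bin(self.bitmaps.get(value, 0)).count("1")

    def count_both(self, value, other, other_value):
        # Rows matching two columns at once: AND the two bitmaps, then popcount.
        combined = self.bitmaps.get(value, 0) & other.bitmaps.get(other_value, 0)
        return bin(combined).count("1")


# Two parallel columns describing four hypothetical service messages.
province = BitmapColumn()
service = BitmapColumn()
for prov, svc in [("A", "chat"), ("B", "chat"), ("A", "mail"), ("A", "chat")]:
    province.append(prov)
    service.append(svc)

message_volume_a = province.count("A")                    # messages from province A
chat_from_a = province.count_both("A", service, "chat")   # chat messages from province A
```

Because each filter is a bitwise operation over compact bitmaps, counts of this kind do not require scanning the stored messages themselves, which is one plausible reason such a layout suits low-latency statistics.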
The distributed computing server 103 is configured to receive the service messages and their attribute information, which originate from the service system 101, either from the service system 101 or from the real-time search and analysis server 102; to perform, at a set first time interval, distributed deduplicated user counting on those service messages at each of several geographical levels, according to their source geographic location information and user identifiers, to obtain statistics on the number of users in each region; and to store the statistics obtained at the first time interval in the database 104. The first time interval may be 1 hour, in which case the calculation runs once every hour. It may also be another period, such as half an hour, 45 minutes, 1.5 hours or 2 hours.
The distributed computing server 103 may be a Hadoop computing cluster. Hadoop is a distributed system infrastructure that lets users develop distributed programs without knowing the underlying details of the distributed system, making full use of the cluster for high-speed computation and storage. The distributed computing server 103 may also be any other computing cluster that can implement the above functions.
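The first time interval described above implies that messages are grouped into fixed-size windows before counting. The text does not say how messages are assigned to a window; the sketch below assumes each message carries an epoch timestamp in seconds (an assumption, since the source lists only a user identifier and source location as attributes) and uses hypothetical helper names.

```python
from collections import defaultdict

FIRST_INTERVAL_SECONDS = 3600  # one hour, the example interval given in the text


def window_start(timestamp, interval=FIRST_INTERVAL_SECONDS):
    """Return the start of the fixed-size window that contains `timestamp`."""
    return timestamp - (timestamp % interval)


def group_by_window(messages, interval=FIRST_INTERVAL_SECONDS):
    """Group (timestamp, payload) pairs by the window they fall into."""
    windows = defaultdict(list)
    for timestamp, payload in messages:
        windows[window_start(timestamp, interval)].append(payload)
    return dict(windows)


# Four hypothetical messages spread over three one-hour windows.
batches = group_by_window([(10, "m1"), (3599, "m2"), (3600, "m3"), (7300, "m4")])
```

Changing `interval` to 1800 or 7200 gives the half-hour and two-hour variants mentioned above without touching the rest of the logic.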
The distributed computing server 103 may be a server cluster comprising multiple servers. Within one processing run, these servers may be used either for single-machine calculation or for multi-machine aggregation. Each server used for calculation (a computing server for short) is configured to receive service messages and their attribute information from the service system, the attribute information including a user identifier and source geographic location information; and, at the first time interval, for the service messages of one region at one of the geographical levels, selected according to the source geographic location information of each service message, to perform deduplicated user counting according to the user identifier of each service message, obtaining user-count statistics for that region. For example, assume the geographical levels are city, province and country. To count the users in province A and the users in city B, at least one computing server performs deduplicated user counting on the service messages of province A according to the user identifier of each service message, obtaining user-count statistics for province A, and at least one computing server does the same for the service messages of city B, obtaining user-count statistics for city B. Each server used for aggregation (a summary server for short) then aggregates the user-count results for the same region from different computing servers, obtaining statistics on the number of users in each region. For example, at least one summary server aggregates the province A user-count results from different computing servers, obtaining the statistics on the number of users in province A; at least one summary server aggregates the city B user-count results from different computing servers, obtaining the statistics on the number of users in city B. In some applications there may be only one summary server. The geographical levels may also be divided differently, for example into the four levels of county, region, province and country; the division is not specifically limited here.
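The division of work between computing servers and summary servers can be sketched as follows. This is a minimal single-process illustration under stated assumptions, not the patented implementation: each "computing server" is modelled as a function over one shard of messages, the partial results are sets of user identifiers rather than bare counts, and the summary step takes their union. The source does not specify how partial results are merged; a union is used here because simply adding per-server distinct counts would count a user twice if that user's messages reached two servers. All field names and sample values are hypothetical.

```python
from collections import defaultdict

LEVELS = ("city", "province", "country")  # the three example levels from the text


def compute_server(messages):
    """One computing server: collect distinct user IDs per (level, region)."""
    users = defaultdict(set)
    for msg in messages:
        for level in LEVELS:
            users[(level, msg[level])].add(msg["user_id"])
    return users


def summary_server(partials):
    """Summary server: union the per-region user sets, then count them."""
    merged = defaultdict(set)
    for partial in partials:
        for key, user_ids in partial.items():
            # Union, not sum: a user seen by two servers is counted once.
            merged[key] |= user_ids
    return {key: len(user_ids) for key, user_ids in merged.items()}


shard_1 = [
    {"user_id": "u1", "city": "B", "province": "A", "country": "X"},
    {"user_id": "u1", "city": "B", "province": "A", "country": "X"},
    {"user_id": "u2", "city": "C", "province": "A", "country": "X"},
]
shard_2 = [
    {"user_id": "u1", "city": "B", "province": "A", "country": "X"},
    {"user_id": "u3", "city": "B", "province": "A", "country": "X"},
]
user_counts = summary_server([compute_server(shard_1), compute_server(shard_2)])
```

An alternative design, also only an assumption, is to route every message of a given user to the same computing server (for example by hashing the user identifier); per-server distinct counts could then be added directly, at the cost of a shuffle step before counting.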
In addition, the distributed computing server 103 may further be configured to perform, at a set second time interval, distributed deduplicated user counting on the service messages from the service system at each geographical level, according to their source geographic location information and user identifiers, obtaining statistics on the number of users in each region; to perform distributed message-volume counting on the service messages at each geographical level according to their source geographic location information, obtaining statistics on the message volume in each region; and to store the user-count and message-volume statistics obtained at the second time interval in the database 104. The second time interval is longer than the first time interval. For example, the second time interval may be one day, so that the per-region message-volume and user-count statistics are calculated once a day. It may also be another interval, such as 2 days, 3 days, 4 days, and so on. Specifically, each computing server is further configured, at the second time interval, to perform deduplicated user counting on the service messages of one region at one of the geographical levels, selected according to the source geographic location information of each service message, according to the user identifier of each service message, obtaining user-count statistics for that region; and to count the message volume of the service messages of one region at one of the geographical levels, according to the source geographic location information of each service message, obtaining message-volume statistics for that region. Each summary server is configured, at the second time interval, to aggregate the user-count results for the same region from different computing servers, obtaining statistics on the number of users in each region, and to aggregate the message-volume results for the same region from different computing servers, obtaining statistics on the message volume in each region. Calculating the geographical distribution of user counts and message volumes again at the second time interval reduces unnecessary repeated calculation when historical data is queried.
Taking the message-volume and user-count statistics for province A and city B as an example, when message volume is counted, at least one computing server counts the message volume of the service messages of province A, obtaining message-volume statistics for province A, and at least one computing server counts the message volume of the service messages of city B, obtaining message-volume statistics for city B. At least one summary server aggregates the province A message-volume results from different computing servers, obtaining the statistics on the message volume in province A; at least one summary server aggregates the city B message-volume results from different computing servers, obtaining the statistics on the message volume in city B. In some applications there may be only one summary server.
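Message-volume counting differs from user counting in that no deduplication is needed, so partial results from different computing servers can simply be added. The following sketch illustrates that under the same hypothetical message layout (fields `city`, `province`, `country`); the function names are illustrative and not taken from the source.

```python
from collections import Counter


def count_messages(messages, levels=("city", "province", "country")):
    """One computing server: message volume per (level, region), no deduplication."""
    volume = Counter()
    for msg in messages:
        for level in levels:
            volume[(level, msg[level])] += 1
    return volume


def summarize_volumes(partials):
    """Summary server: message counts are additive, so partial results are summed."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total


shard_1 = [
    {"city": "B", "province": "A", "country": "X"},
    {"city": "B", "province": "A", "country": "X"},
]
shard_2 = [
    {"city": "B", "province": "A", "country": "X"},
    {"city": "C", "province": "A", "country": "X"},
]
message_volumes = summarize_volumes([count_messages(shard_1), count_messages(shard_2)])
```

Running this once per second time interval (for example daily) and storing the result is what lets a historical query read one precomputed row instead of recounting a day of messages.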
The database 104 may be a Cloud Database (CDB). The main features of CDB are a high-performance, highly reliable MySQL service and integrated automated management tools that minimize the effort developers spend on deployment, monitoring, scaling and failure recovery. The database 104 may also be a conventional database, or a database integrated on the computing cluster servers.
The query server 105 is configured to receive a query request from a user and to determine whether the statistics to be queried are real-time data or historical data. For real-time data, it obtains from the database 104 the statistics on the number of users in each region obtained at the most recent first time interval, obtains from the real-time search and analysis server 102 the statistics on the message volume in each region obtained in real time, and displays both. For historical data, it queries the database 104 for the user-count and message-volume statistics obtained at the corresponding second time interval and displays them.
The query server 105 may also, by default, obtain from the database 104 the statistics on the number of users in each region obtained at the most recent first time interval, or obtain from the database 104, according to the query request, the statistics on the number of users in each region that the distributed computing server 103 obtained at the first time interval. Additionally or alternatively, it may by default obtain from the real-time search and analysis server 102 the statistics on the message volume in each region obtained in real time.
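The routing decision made by the query server can be sketched as a single function. The classes `InMemoryDatabase` and `RealtimeServer` below are hypothetical stand-ins for database 104 and server 102, and the request format (a dictionary with `mode` and `day` keys) is an assumption made for illustration; the source does not define a query interface.

```python
class InMemoryDatabase:
    """Hypothetical stand-in for database 104, filled by the batch jobs."""

    def __init__(self, hourly_users, daily_users, daily_volumes):
        self.hourly_users = hourly_users    # window start -> {region: users}
        self.daily_users = daily_users      # day -> {region: users}
        self.daily_volumes = daily_volumes  # day -> {region: messages}

    def latest_user_counts(self):
        # Result of the most recent first time interval.
        return self.hourly_users[max(self.hourly_users)]

    def daily_user_counts(self, day):
        return self.daily_users[day]

    def daily_message_volumes(self, day):
        return self.daily_volumes[day]


class RealtimeServer:
    """Hypothetical stand-in for real-time search and analysis server 102."""

    def __init__(self, volumes):
        self.volumes = volumes

    def current_message_volumes(self):
        return dict(self.volumes)


def handle_query(request, database, realtime_server):
    """Route a query to the latest first-interval data or to stored history."""
    # Real-time is the default when the request does not name a mode.
    if request.get("mode", "realtime") == "realtime":
        return {
            "user_counts": database.latest_user_counts(),
            "message_volumes": realtime_server.current_message_volumes(),
        }
    day = request["day"]
    return {
        "user_counts": database.daily_user_counts(day),
        "message_volumes": database.daily_message_volumes(day),
    }


db = InMemoryDatabase(
    hourly_users={0: {"A": 5}, 3600: {"A": 7}},
    daily_users={"day-1": {"A": 40}},
    daily_volumes={"day-1": {"A": 900}},
)
rt = RealtimeServer({"A": 12})
live = handle_query({}, db, rt)
past = handle_query({"mode": "history", "day": "day-1"}, db, rt)
```

Note that in the real-time branch the user counts are only as fresh as the last completed first time interval, while the message volumes come straight from the real-time server.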
With the technical solution in the embodiments of the present invention, the response latency of an initial query request can be under 10 seconds (s), and that of subsequent consecutive query requests can be under 5 s. In other words, the solution can display the geographical distribution statistics of service message volume and user count in near real time and with low latency.
A system for processing service data proposed in an embodiment of the present invention may include only the distributed computing server 103 described above.
Another system for processing service data proposed in an embodiment of the present invention may include only the real-time search and analysis server 102 described above.
A further system for processing service data proposed in an embodiment of the present invention may include both the real-time search and analysis server 102 and the distributed computing server 103 described above.
Each of the above systems for processing service data may further include the query server 105.
A system for processing service data proposed in an embodiment of the present invention may include only the query server 105 described above.
Another system for processing service data proposed in an embodiment of the present invention may include the distributed computing server 103 and the query server 105 described above.
A further system for processing service data proposed in an embodiment of the present invention may include the real-time search and analysis server 102 and the query server 105 described above.
Yet another system for processing service data proposed in an embodiment of the present invention may include the real-time search and analysis server 102, the distributed computing server 103 and the query server 105 described above.
The query server 105 in the embodiments of the present invention may be implemented in various ways; FIG. 2A and FIG. 2B each show one of them.
As shown in FIG. 2A, the query server 105 may include a request receiving module 201, a query module 202 and a display module 203.
The request receiving module 201 is configured to receive a query request from a user.
The query module 202 is configured to obtain from a database, according to the query request, the statistics on the number of users in each region that a distributed computing server obtained at a first time interval.
The display module 203 is configured to display the statistics retrieved by the query module.
As shown in FIG. 2B, the query server 105 may include a request receiving module 201, a determining module 204, a query module 202 and a display module 203.
The request receiving module 201 is configured to receive a query request from a user.
The determining module 202 is configured to determine, according to the query request, whether the statistics to be queried are real-time data or historical data.
The query module 203 is configured to: when the statistics to be queried are real-time data, obtain from a database the statistics on the number of users in each region that a distributed computing server obtained at the most recent first time interval; and, when the statistics to be queried are historical data, query the database for the user-count and message-volume statistics for each region that the distributed computing server obtained at the corresponding second time interval.
The display module 204 is configured to display the statistics retrieved by the query module.
In one implementation, the query module 203 may further be configured, when the statistics to be queried are real-time data, to obtain from a real-time search and analysis server the statistics on the message volume in each region that the real-time search and analysis server obtained in real time.
An embodiment of the present invention further provides a method for processing service data, which can be implemented in the implementation environment shown in FIG. 1.
FIG. 3 is an exemplary flowchart of a method for processing service data according to an embodiment of the present invention. The method can be applied in a distributed computing server. As shown in FIG. 3, the method may include the following steps:
Step 301: the distributed computing server receives service messages and their attribute information from the service system; the attribute information includes a user identifier and source geographic location information.
Step 302: at a set first time interval, the distributed computing server performs distributed deduplicated user counting on the service messages from the service system at each of several geographical levels, according to their source geographic location information and user identifiers, obtaining statistics on the number of users in each region.
The geographical levels can be divided according to actual needs, for example into the three levels of city, province and country, or into the four levels of county, region, province and country.
Step 303: the distributed computing server stores the statistics on the number of users in each region obtained at the first time interval in a database.
The method may further include the following steps 304 and 305.
Step 304: At a set second time interval, the distributed computing server performs a distributed, deduplicated user count on the service messages from the service system, based on their source geographic location information and user identifiers, separately for each geographic level, obtaining statistics on the number of users in each region. It also performs a distributed message-volume count on the service messages, based on their source geographic location information, separately for each geographic level, obtaining statistics on the message volume in each region. The second time interval is longer than the first time interval.
Step 305: The statistics on the number of users and the message volume in each region, obtained at the second time interval, are stored in a database.
In the above method, performing a distributed, deduplicated user count on the service message data from the service system, based on its source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region, may include the following. Each computing server in the distributed computing server, based on the source geographic location information of each service message, performs a deduplicated user count, using the user identifiers of the service messages, on the service messages of one region at one of the geographic levels, obtaining user-count statistics for that region. Each aggregation server in the distributed computing server then aggregates the user-count results for the same region from different computing servers, obtaining statistics on the number of users in each region.
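The division of work between computing servers and aggregation servers can be sketched as follows. The description does not say what the computing servers send to the aggregation servers; this sketch assumes they send per-region sets of user identifiers, so that a user seen by two computing servers is still counted once after merging. The function names and message fields are hypothetical.

```python
from collections import defaultdict

def partial_user_sets(messages, level):
    """Computing-server side: user identifiers seen per region at one level."""
    partial = defaultdict(set)
    for msg in messages:
        partial[msg[level]].add(msg["user_id"])
    return partial

def merge_partials(partials):
    """Aggregation-server side: union same-region sets, then count users."""
    merged = defaultdict(set)
    for partial in partials:
        for region, users in partial.items():
            merged[region] |= users
    return {region: len(users) for region, users in merged.items()}

# Two computing servers that each saw part of the traffic (illustrative data).
server_a = partial_user_sets(
    [{"user_id": "u1", "city": "Shenzhen"}, {"user_id": "u2", "city": "Shenzhen"}],
    "city",
)
server_b = partial_user_sets(
    [{"user_id": "u2", "city": "Shenzhen"}, {"user_id": "u3", "city": "Beijing"}],
    "city",
)
totals = merge_partials([server_a, server_b])
```

If messages were instead routed so that each user or each region is handled by exactly one computing server, the aggregation servers could add plain counts. Exchanging sets is the simpler choice to reason about but costs more memory and network traffic.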
In the above method, performing a distributed message-volume count on the service messages, based on their source geographic location information, separately for each geographic level, to obtain statistics on the message volume in each region, may include the following. Each computing server in the distributed computing server, based on the source geographic location information of each service message, counts the service messages of one region at one of the geographic levels, obtaining message-volume statistics for that region. Each aggregation server in the distributed computing server then aggregates the message-volume results for the same region from different computing servers, obtaining statistics on the message volume in each region.
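Message volume needs no deduplication, so partial results can simply be added. A minimal sketch using the standard-library `Counter`, with hypothetical function names and message fields:

```python
from collections import Counter

def partial_message_counts(messages, level):
    """Computing-server side: number of messages per region at one level."""
    return Counter(msg[level] for msg in messages)

def merge_counts(partials):
    """Aggregation-server side: add up same-region counts from all servers."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts instead of replacing them
    return total

# Partial counts from two computing servers (illustrative data).
counts_a = partial_message_counts(
    [{"province": "Guangdong"}, {"province": "Guangdong"}], "province"
)
counts_b = partial_message_counts(
    [{"province": "Guangdong"}, {"province": "Beijing"}], "province"
)
volume = merge_counts([counts_a, counts_b])
```

Unlike the user count, the merged message volume is correct however messages are spread across computing servers, because every message is counted by exactly one of them.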
In addition, the above method may further include: a real-time retrieval and analysis server receives the service messages and their attribute information from the service system and stores them using nested columnar storage and bitmaps; and, based on the source geographic location information in the stored service messages and attribute information, the server determines in real time statistics on the message volume of the service messages in each region.
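The description does not detail the storage layout, so the following is only a sketch of the general idea of a bitmap index over a column, under stated assumptions: each attribute is stored as a column, each distinct value in a column gets a bitmap with one bit per message, and a Python integer stands in for the bitmap. The `service` column and all names are hypothetical, and nesting of columns is not shown.

```python
def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index

def popcount(bits):
    """Number of set bits, i.e. number of matching rows."""
    return bin(bits).count("1")

# Two columns describing the same four messages (illustrative data).
province = ["Guangdong", "Beijing", "Guangdong", "Guangdong"]
service = ["chat", "chat", "pay", "chat"]

province_idx = build_bitmap_index(province)
service_idx = build_bitmap_index(service)

# Message volume per province: count the set bits of each bitmap.
volume = {name: popcount(bits) for name, bits in province_idx.items()}

# Filtering by a second attribute is a bitwise AND of two bitmaps.
chat_in_guangdong = popcount(province_idx["Guangdong"] & service_idx["chat"])
```

Counting and filtering reduce to bit operations over compact bitmaps, which is one reason such layouts suit queries that must be answered in real time.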
FIG. 4 and FIG. 5 are each a schematic flowchart of a method for processing service data according to an embodiment of the present invention. The method can be applied to a query server.
Corresponding to steps 301 to 303, the method for processing service data provided in this embodiment may, as shown in FIG. 4, include the following steps:
Step 401: The query server receives a query request from a user.
Step 402: Based on the query request, the query server obtains from a database the statistics on the number of users in each region that a distributed computing server obtained at the first time interval.
In this step, if the statistics from the most recent first time interval are to be displayed, the statistics on the number of users in each region that the distributed computing server obtained in the most recent first time interval are fetched from the database. If the statistics from an earlier first time interval are to be displayed, the statistics that the distributed computing server obtained in that interval are fetched from the database.
Step 403: The statistics on the number of users in each region are displayed.
In addition, the method shown in FIG. 4 may further include: obtaining from the real-time retrieval and analysis server the statistics on the message volume in each region that the server obtains in real time, and displaying those statistics.
Corresponding to steps 301 to 305, the method provided in this embodiment for displaying the geographic distribution of target information of service messages may, as shown in FIG. 5, include the following steps:
Step 501: The query server receives a query request from a user.
Step 502: Having received the query request from the user, the query server determines whether the statistics to be queried are real-time data or historical data. If they are real-time data, step 503 is performed; otherwise, step 504 is performed.
Step 503: The statistics on the number of users in each region that the distributed computing server obtained in the most recent first time interval are fetched from the database and displayed.
Step 504: The database is queried for the statistics on the number of users and the message volume in each region that the distributed computing server obtained in the corresponding second time interval, and those statistics are displayed.
In addition, the method shown in FIG. 5 may further include: when the statistics to be queried are real-time data, obtaining from the real-time retrieval and analysis server the statistics on the message volume in each region that the server obtains in real time, and displaying those statistics.
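The branching in steps 501 to 504, together with the optional real-time message-volume lookup, can be sketched as below. The two stub classes stand in for the database and the real-time retrieval and analysis server; their names, methods, and the shape of the request dictionary are all hypothetical and are not interfaces defined by the embodiment.

```python
class StatsDatabase:
    """Stub for the database filled by the distributed computing server."""

    def __init__(self, short_interval_users, long_interval_stats):
        self.short_interval_users = short_interval_users  # oldest first
        self.long_interval_stats = long_interval_stats    # keyed by interval

    def latest_user_stats(self):
        return self.short_interval_users[-1]

    def stats_for_interval(self, key):
        return self.long_interval_stats[key]


class RealtimeServer:
    """Stub for the real-time retrieval and analysis server."""

    def __init__(self, message_stats):
        self.message_stats = message_stats

    def current_message_stats(self):
        return self.message_stats


def handle_query(request, database, realtime_server):
    """Return the statistics to display for one query request."""
    if request["kind"] == "realtime":
        # Step 503 plus the optional real-time message-volume lookup.
        return {
            "users": database.latest_user_stats(),
            "messages": realtime_server.current_message_stats(),
        }
    # Step 504: historical data from the longer second time interval.
    return database.stats_for_interval(request["interval"])


db = StatsDatabase(
    short_interval_users=[{"Guangdong": 10}, {"Guangdong": 12}],
    long_interval_stats={
        "2016-02-01": {"users": {"Guangdong": 90}, "messages": {"Guangdong": 400}}
    },
)
rt = RealtimeServer({"Guangdong": 37})

live = handle_query({"kind": "realtime"}, db, rt)
past = handle_query({"kind": "historical", "interval": "2016-02-01"}, db, rt)
```

The real-time branch combines two sources because, in this embodiment, the short-interval job stores only user counts, while message volume for the current moment comes from the real-time retrieval and analysis server.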
In addition, each embodiment of the present invention can be implemented by a data processing program executed by a data processing device such as a computer. Such a data processing program evidently constitutes the present invention. Furthermore, a data processing program, usually stored in a storage medium, is executed either by reading the program directly from the storage medium or by installing or copying it to a storage device (such as a hard disk and/or memory) of the data processing device. Such a storage medium therefore also constitutes the present invention. The storage medium may use any type of recording method, for example a paper storage medium (such as paper tape), a magnetic storage medium (such as a floppy disk, hard disk, or flash memory), an optical storage medium (such as a CD-ROM), or a magneto-optical storage medium (such as an MO disc).
Furthermore, it can be understood that program code read from the storage medium may be written to a memory provided on an expansion board inserted into the computer, or to a memory provided in an expansion unit connected to the computer, after which a processor or the like mounted on the expansion board or expansion unit performs some or all of the actual operations according to the instructions of the program code, thereby implementing the functions of any of the above embodiments. The processor may include one or more processing cores.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (17)
- A method for processing service data, characterized by comprising: a distributed computing server receiving service messages from a service system together with their attribute information, the attribute information including a user identifier and source geographic location information; the distributed computing server, at a set first time interval, performing a distributed, deduplicated user count on the service messages from the service system, based on their source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region; and storing the statistics on the number of users in each region, obtained at the first time interval, in a database.
- The method according to claim 1, characterized in that the method further comprises: a real-time retrieval and analysis server receiving the service messages and their attribute information from the service system and storing them using nested columnar storage and bitmaps; and determining in real time, based on the source geographic location information in the stored service messages and attribute information, statistics on the message volume of the service messages in each region; wherein the distributed computing server receives the service messages and their attribute information, originating from the service system, as exported from the real-time retrieval and analysis server; or the distributed computing server receives the service messages and their attribute information directly from the service system.
- The method according to claim 1 or 2, characterized in that the method further comprises: the distributed computing server, at a set second time interval, performing a distributed, deduplicated user count on the service messages from the service system, based on their source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region, and performing a distributed message-volume count on the service messages, based on their source geographic location information, separately for each geographic level, to obtain statistics on the message volume in each region; and storing the statistics on the number of users and the message volume in each region, obtained at the second time interval, in a database; wherein the second time interval is longer than the first time interval.
- The method according to claim 3, characterized in that the method further comprises: a query server receiving a query request from a user and determining whether the statistics to be queried are real-time data or historical data; if they are real-time data, obtaining from the database the statistics on the number of users in each region obtained in the most recent first time interval, obtaining from the real-time retrieval and analysis server the statistics on the message volume in each region that the server obtains in real time, and displaying the statistics on the number of users in each region and the statistics on the message volume in each region; if they are historical data, querying the database for the statistics on the number of users and the message volume in each region obtained in the corresponding second time interval, and displaying those statistics.
- The method according to claim 4, characterized in that, when the statistics to be queried are real-time data, the statistics on the message volume in each region that the real-time retrieval and analysis server obtains in real time are further obtained from that server and displayed.
- The method according to claim 3, characterized in that performing the distributed, deduplicated user count on the service message data from the service system, based on its source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region, comprises: each computing server in the distributed computing server, based on the source geographic location information of each service message, performing a deduplicated user count, using the user identifiers of the service messages, on the service messages of one region at one of the geographic levels, to obtain user-count statistics for that region; and each aggregation server in the distributed computing server aggregating the user-count results for the same region from different computing servers, to obtain statistics on the number of users in each region; and in that performing the distributed message-volume count on the service messages, based on their source geographic location information, separately for each geographic level, to obtain statistics on the message volume in each region, comprises: each computing server in the distributed computing server, based on the source geographic location information of each service message, counting the service messages of one region at one of the geographic levels, to obtain message-volume statistics for that region; and each aggregation server in the distributed computing server aggregating the message-volume results for the same region from different computing servers, to obtain statistics on the message volume in each region.
- A method for processing service data, characterized by comprising: a query server receiving a query request from a user, obtaining from a database, based on the query request, the statistics on the number of users in each region that a distributed computing server obtained at a first time interval, and displaying the statistics on the number of users in each region; wherein the statistics on the number of users in each region that the distributed computing server obtained at the first time interval are obtained as follows: the distributed computing server, at a set first time interval, performs a distributed, deduplicated user count on the service messages from a service system, based on their source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region.
- The method according to claim 7, characterized in that the method further comprises: determining, based on the query request, whether the statistics to be queried are real-time data or historical data; if they are real-time data, obtaining from the database the statistics on the number of users in each region that a distributed computing server obtained in the most recent first time interval, and displaying the statistics on the number of users in each region; if they are historical data, querying the database for the statistics on the number of users and the message volume in each region that the distributed computing server obtained for the corresponding second time interval, and displaying the statistics on the number of users and the message volume in each region; wherein the statistics on the number of users and the message volume in each region that the distributed computing server obtained for the second time interval are obtained as follows: the distributed computing server, at a set second time interval, performs a distributed, deduplicated user count on the service messages from the service system, based on their source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region, and performs a distributed message-volume count on the service messages, based on their source geographic location information, separately for each geographic level, to obtain statistics on the message volume in each region.
- The method according to claim 8, characterized in that the method further comprises: when the statistics to be queried are real-time data, further obtaining from a real-time retrieval and analysis server the statistics on the message volume in each region that the server obtains in real time, and displaying the statistics on the message volume in each region.
- A device for processing service data, characterized by comprising at least one computing server and at least one aggregation server, wherein: each computing server is configured to receive service messages from a service system together with their attribute information, the attribute information including a user identifier and source geographic location information, and, at a first time interval and based on the source geographic location information of each service message, to perform a deduplicated user count, using the user identifiers of the service messages, on the service messages of one region at one of the geographic levels, to obtain user-count statistics for that region; and each aggregation server is configured to aggregate, at the first time interval, the user-count results for the same region from different computing servers, to obtain statistics on the number of users in each region.
- The device according to claim 10, characterized in that the computing server is further configured, at a second time interval and based on the source geographic location information of each service message, to perform a deduplicated user count, using the user identifiers of the service messages, on the service messages of one region at one of the geographic levels, to obtain user-count statistics for that region, and, based on the source geographic location information of each service message, to count the service messages of one region at one of the geographic levels, to obtain message-volume statistics for that region; and each aggregation server is configured to aggregate, at the second time interval, the user-count results for the same region from different computing servers, to obtain statistics on the number of users in each region, and to aggregate the message-volume results for the same region from different computing servers, to obtain statistics on the message volume in each region.
- A device for processing service data, characterized by comprising: a request receiving module, configured to receive a query request from a user; a query module, configured to obtain from a database, based on the query request, the statistics on the number of users in each region that a distributed computing server obtained at a first time interval; and a display module, configured to display the statistics retrieved by the query module.
- The device according to claim 12, characterized in that the device further comprises: a determining module, configured to determine, based on the query request, whether the statistics to be queried are real-time data or historical data; wherein the query module is further configured, when the statistics to be queried are real-time data, to obtain from the database the statistics on the number of users in each region that a distributed computing server obtained in the most recent first time interval, and, when the statistics to be queried are historical data, to query the database for the statistics on the number of users and the message volume in each region that the distributed computing server obtained for the corresponding second time interval.
- The device according to claim 13, characterized in that the query module is further configured, when the statistics to be queried are real-time data, to obtain from a real-time retrieval and analysis server the statistics on the message volume in each region that the server obtains in real time.
- A system for processing service data, characterized by comprising: a real-time retrieval and analysis server, configured to receive service messages and their attribute information from a service system and store them using nested columnar storage and bitmaps, the attribute information including a user identifier and source geographic location information, and to determine in real time, based on the source geographic location information in the stored service messages and attribute information, statistics on the message volume of the service messages in each region; and a distributed computing server, configured to receive the service messages and their attribute information, originating from the service system, from either the service system or the real-time retrieval and analysis server; to perform, at a set first time interval, a distributed, deduplicated user count on the service messages from the service system, based on their source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region; and to store the statistics on the number of users in each region, obtained at the first time interval, in a database.
- The system according to claim 15, characterized in that the distributed computing server is further configured to perform, at a set second time interval, a distributed, deduplicated user count on the service messages from the service system, based on their source geographic location information and user identifiers, separately for each geographic level, to obtain statistics on the number of users in each region; to perform a distributed message-volume count on the service messages, based on their source geographic location information, separately for each geographic level, to obtain statistics on the message volume in each region; and to store the statistics on the number of users and the message volume in each region, obtained at the second time interval, in a database; wherein the second time interval is longer than the first time interval.
- The system according to claim 16, characterized in that the system further comprises: a query server, configured to receive a query request from a user and determine whether the statistics to be queried are real-time data or historical data; if they are real-time data, to obtain from the database the statistics on the number of users in each region obtained in the most recent first time interval, to obtain from the real-time retrieval and analysis server the statistics on the message volume in each region obtained in real time, and to display the statistics on the number of users and the message volume in each region; if they are historical data, to query the database for the statistics on the number of users and the message volume in each region obtained for the corresponding second time interval, and to display the statistics on the number of users and the message volume in each region.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610071149.8A CN107026881B (en) | 2016-02-02 | 2016-02-02 | Method, device and system for processing service data |
CN201610071149.8 | 2016-02-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017133539A1 true WO2017133539A1 (en) | 2017-08-10 |
Family
ID=59500271
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/072185 WO2017133539A1 (en) | 2016-02-02 | 2017-01-23 | Service data processing method, device and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107026881B (en) |
WO (1) | WO2017133539A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532253A (en) * | 2019-09-05 | 2019-12-03 | 北京博睿宏远数据科技股份有限公司 | A kind of business diagnosis method, system and cluster |
CN112131276A (en) * | 2020-09-27 | 2020-12-25 | 深圳市欢太科技有限公司 | Data statistics method, electronic equipment and readable storage medium |
CN113469741A (en) * | 2021-06-30 | 2021-10-01 | 杭州云深科技有限公司 | APP regional distribution grade determination method and device, computer equipment and storage medium |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107273512B (en) * | 2017-06-21 | 2020-06-16 | 深圳市盛路物联通讯技术有限公司 | Method and device for data deduplication based on device type and geographic position |
CN108427725B (en) * | 2018-02-11 | 2021-08-03 | 华为技术有限公司 | Data processing method, device and system |
CN108491732A (en) * | 2018-03-13 | 2018-09-04 | 山东超越数控电子股份有限公司 | A kind of mass storage data protection system and method based on business isolated storage |
CN110166344B (en) * | 2018-04-25 | 2021-08-24 | 腾讯科技(深圳)有限公司 | Identity identification method, device and related equipment |
CN108764532B (en) * | 2018-05-04 | 2021-07-09 | 金华市智甄通信设备有限公司 | Logistics flow prediction system and method based on router |
CN110347343B (en) * | 2019-07-16 | 2020-09-18 | 珠海格力电器股份有限公司 | Data management method and device |
CN111160975A (en) * | 2019-12-30 | 2020-05-15 | 中国移动通信集团黑龙江有限公司 | Target user determination method, device, equipment and computer storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030037110A1 (en) * | 2001-08-14 | 2003-02-20 | Fujitsu Limited | Method for providing area chat rooms, method for processing area chats on terminal side, computer-readable medium for recording processing program to provide area chat rooms, apparatus for providing area chat rooms, and terminal-side apparatus for use in a system to provide area chat rooms |
CN102760174A (en) * | 2012-08-06 | 2012-10-31 | 吴建辉 | Distributed actual condition search engine based on geographic locations and trading system |
CN102789508A (en) * | 2012-07-27 | 2012-11-21 | 吴建辉 | Distributed practical condition search engine and chat system on basis of geographical position |
CN103092950A (en) * | 2013-01-15 | 2013-05-08 | 重庆邮电大学 | Online public opinion geographical location real time monitoring system and method |
EP2955879A1 (en) * | 2014-06-12 | 2015-12-16 | Geo Communication Group bvba | A method and system for providing electronic information to a virtual mailbox based on a geographical address |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9451401B2 (en) * | 2011-05-27 | 2016-09-20 | Qualcomm Incorporated | Application transport level location filtering of internet protocol multicast content delivery |
CN103310087B (en) * | 2012-03-16 | 2016-03-16 | 腾讯科技(深圳)有限公司 | Business datum statistical analysis technique and device |
CN103227821B (en) * | 2013-04-03 | 2015-07-01 | 腾讯科技(深圳)有限公司 | Method and device for processing position data of target user |
CN104598503A (en) * | 2014-05-14 | 2015-05-06 | 腾讯科技(深圳)有限公司 | Geographic information data inquiry method, device and system |
- 2016-02-02: CN application CN201610071149.8A, published as CN107026881B (status: Active)
- 2017-01-23: WO application PCT/CN2017/072185, published as WO2017133539A1 (status: Application Filing)
Also Published As
Publication number | Publication date |
---|---|
CN107026881B (en) | 2020-04-03 |
CN107026881A (en) | 2017-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017133539A1 (en) | Service data processing method, device and system | |
US20230188452A1 (en) | Performance monitoring in a distributed storage system | |
US9996565B2 (en) | Managing an index of a table of a database | |
US9338594B1 (en) | Processing location information | |
US10848903B2 (en) | Determining timing for determination of applicable geo-fences | |
US10281284B2 (en) | Hybrid road network and grid based spatial-temporal indexing under missing road links | |
US10002170B2 (en) | Managing a table of a database | |
US11681927B2 (en) | Analyzing geotemporal proximity of entities through a knowledge graph | |
US10586245B1 (en) | Push reporting | |
CN111859187A (en) | POI query method, device, equipment and medium based on distributed graph database | |
US9774696B1 (en) | Using a polygon to select a geolocation | |
US8914357B1 (en) | Mapping keywords to geographic features | |
AU2019422010B2 (en) | Intelligent geofence provisioning | |
US11702080B2 (en) | System and method for parking tracking using vehicle event data | |
CN118394800A (en) | Index query method and device, electronic equipment and readable storage medium | |
US9009155B2 (en) | Parallel set aggregation | |
CN115422236A (en) | Data dynamic publishing method and system based on differential privacy |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17746859; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in European phase | Ref document number: 17746859; Country of ref document: EP; Kind code of ref document: A1 |