WO2022082892A1 - Big data analysis method and system, and computer device and storage medium thereof

Info

Publication number
WO2022082892A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
cache server
analysis
big data
cache
Application number
PCT/CN2020/127948
Inventor
彭加山
彭晓芳
Original Assignee
苏州莱锦机电自动化有限公司
Application filed by 苏州莱锦机电自动化有限公司
Publication of WO2022082892A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2291: User-Defined Types; Storage management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2455: Query execution
    • G06F16/24552: Database cache management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/248: Presentation of query results

Definitions

  • the embodiments of the present invention relate to the technical field of big data analysis, in particular to a big data analysis method, system, computer equipment and storage medium thereof.
  • the purpose of the embodiments of the present invention is to provide a big data analysis method, a system, a computer device and a storage medium thereof, so as to solve the problems raised in the above background art.
  • a big data analysis method includes the following steps:
  • when a data query request from the front end is obtained, the corresponding analysis result data is retrieved from the data cache server as the query result according to the parameters of the data query request, and the query result is output to the front end for visual display.
  • the distributed database is an HBase database.
  • the step of configuring a data cache server to cache the analysis result includes:
  • when the access count of the analysis result data in the second cache server is greater than a preset threshold, that analysis result data is transferred to the first cache server.
  • the step of configuring a data cache server to cache the analysis result further includes:
  • a big data analysis system, the system including:
  • an acquisition unit, configured to acquire the big data information accessed by the user within the target time period;
  • a storage unit, configured to store the big data information in a distributed database by time slices;
  • an execution unit, configured to perform a data analysis task on the big data information stored in the distributed database based on a predetermined rule to obtain an analysis result;
  • a cache unit, configured to configure a data cache server to cache the analysis result; and
  • an output unit, configured to, when a data query request from the front end is obtained, retrieve the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request, and output the query result to the front end for visual display.
  • the cache unit includes:
  • the configuration module is used to configure the first cache server and the second cache server;
  • the storage module is configured to store the analysis result data whose access times are not greater than a preset threshold in the second cache server;
  • a transfer module configured to transfer the analysis result data to the first cache server when the access times of the analysis result data in the second cache server is greater than a preset threshold.
  • a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method when executing the computer program.
  • a storage medium storing a computer program, the computer program implementing the steps of the method when executed by the processor.
  • the big data analysis method obtains the big data information accessed by the user within the target time period; stores the big data information in a distributed database by time slices; performs a data analysis task on the big data information stored in the distributed database based on a predetermined rule to obtain an analysis result; configures a data cache server to cache the analysis result; and, when a data query request from the front end is obtained, retrieves the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request and outputs the query result to the front end for visual display. This can reduce the load on the processor of the big data analysis system, improve the user's access speed, avoid access stalls, and ensure smooth user access.
  • FIG. 1 is an architectural diagram of a system to which the big data analysis method provided by an embodiment of the present invention is applicable.
  • FIG. 2 is an implementation flowchart of the big data analysis method provided in Embodiment 1 of the present invention.
  • FIG. 3 is an implementation flowchart of the big data analysis method provided in Embodiment 2 of the present invention.
  • FIG. 4 is an implementation flowchart of the big data analysis method provided in Embodiment 3 of the present invention.
  • FIG. 5 is an implementation flowchart of the big data analysis method provided in Embodiment 4 of the present invention.
  • FIG. 6 is a structural block diagram of a big data analysis system according to Embodiment 5 of the present invention.
  • FIG. 7 is a structural block diagram of a cache unit in a big data analysis system according to Embodiment 6 of the present invention.
  • the big data information accessed by the user within the target time period is obtained; the big data information is stored in the distributed database by time slices; a data analysis task is then performed on the big data information stored in the distributed database based on a predetermined rule to obtain an analysis result; a data cache server is configured to cache the analysis result; and, when a data query request from the front end is obtained, the corresponding analysis result data is retrieved from the data cache server as the query result according to the parameters of the data query request, and the query result is output to the front end for visual display. This can reduce the load on the processor of the big data analysis system, improve the user's access speed, avoid access stalls, and ensure smooth user access.
  • FIG. 1 shows an exemplary system architecture diagram to which an embodiment of the big data analysis method of the present disclosure can be applied.
  • the system architecture may include terminals, distributed databases and cache servers.
  • the user can use the terminal to interact with the cache server through the network to receive or send messages and so on.
  • Terminals can be hardware or software.
  • when the terminal is hardware, it can be any of various electronic devices with communication functions, including but not limited to smart phones, tablet computers, e-book readers, MP3 players, MP4 players, laptop computers and desktop computers.
  • when the terminal is software, it can be installed in the electronic devices listed above. It can be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. There is no specific limitation here.
  • the cache server may be hardware or software.
  • when the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • when the server is software, it may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. There is no specific limitation here.
  • the numbers of terminals, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • Embodiment 1: the embodiment of the present invention provides a big data analysis method.
  • FIG. 2 shows the flow of an embodiment of the big data analysis method. This embodiment is mainly illustrated by applying the method to an electronic device having a certain computing capability, and the electronic device may be the terminal shown in FIG. 1.
  • the big data analysis method includes the following steps:
  • Step S100: acquiring the big data information accessed by the user within the target time period;
  • in step S100, when a user uses a terminal to perform search access, the terminal acquires the data of the user's search access within the target time period, stores the time node and text information of that data, and sends the time node information and text information through the network.
  • the network may be the medium used to provide the communication link between the terminal and the server.
  • the network may include various connection types, such as wired, wireless communication links, or fiber optic cables, etc., without limitation.
  • Step S200: storing the big data information in a distributed database by time slices;
  • the time slice may be set to one week as required; the big data called by the server is overwritten by new big data after one week, so that the big data is kept up to date.
  • integrity verification and legality verification of the big data are also performed.
  • Step S300: performing a data analysis task on the big data information stored in the distributed database based on a predetermined rule to obtain an analysis result;
  • Step S400: configuring a data cache server to cache the analysis result;
  • Step S500: when a data query request from the front end is obtained, retrieving the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request, and outputting the query result to the front end for visual display.
  • in step S500, the visual display uses the display screen of the terminal to show the output query result, so that the user can obtain the query result.
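Step S500 can be illustrated with a small sketch of how a front-end query might be mapped to a cached analysis result. The source does not say how request parameters are turned into a cache lookup, so the `cache_key` scheme, the `ResultCache` class and the sample parameters below are all hypothetical, and an in-memory dictionary stands in for the data cache server.

```python
import json


def cache_key(params: dict) -> str:
    """Build a deterministic key from query parameters (hypothetical scheme).

    Sorting the keys makes the lookup independent of parameter order.
    """
    return json.dumps(params, sort_keys=True, separators=(",", ":"))


class ResultCache:
    """In-memory stand-in for the data cache server (assumption)."""

    def __init__(self):
        self._store = {}

    def put(self, params: dict, result) -> None:
        # Called after the analysis task (S300/S400) produces a result.
        self._store[cache_key(params)] = result

    def query(self, params: dict):
        # S500: return the cached analysis result, or None on a miss.
        return self._store.get(cache_key(params))


cache = ResultCache()
cache.put({"metric": "visits", "week": "2020-W45"}, {"total": 1200})
hit = cache.query({"week": "2020-W45", "metric": "visits"})   # same parameters, different order
miss = cache.query({"metric": "visits", "week": "2020-W46"})  # nothing cached for this week
```

Because the query is answered from precomputed results, the analysis task itself does not run on the request path, which is the load reduction the method aims at.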
  • the distributed database is an HBase database;
  • the big data is stored in the form of row keys (rowkeys) and column names.
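As a rough illustration of time-sliced storage keyed by row key and column name, the sketch below derives a one-week slice index from a timestamp and builds a row key from it. The key layout, the `info` column family and the function names are assumptions made for illustration; the source only states that data is stored by time slice with row keys and column names, and no HBase client is used here.

```python
from datetime import datetime, timedelta, timezone

SLICE_SECONDS = 7 * 24 * 3600  # one-week time slice, as suggested in the text


def slice_index(ts: datetime) -> int:
    """Number of whole one-week slices since the Unix epoch."""
    return int(ts.timestamp()) // SLICE_SECONDS


def row_key(user_id: str, ts: datetime) -> str:
    """Hypothetical row key: slice index first, so one slice is one contiguous key range."""
    return f"{slice_index(ts):08d}#{user_id}#{int(ts.timestamp())}"


def to_row(user_id: str, ts: datetime, text: str):
    """Return (row key, columns) using 'family:qualifier' column names."""
    columns = {
        "info:time": ts.isoformat(),  # the time node of the access
        "info:text": text,            # the text information of the access
    }
    return row_key(user_id, ts), columns


start = datetime(2020, 11, 10, tzinfo=timezone.utc)
key, columns = to_row("user-42", start, "search text")
later_key, _ = to_row("user-42", start + timedelta(days=8), "search text")
```

Putting the slice index at the front of the key means a whole week of data can be scanned or replaced as one key range, which fits the weekly overwrite described for step S200.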
  • integrity verification and legality verification of the big data are also performed, wherein the integrity verification is completed by Redis in the network system, and the big data is sent to the server so that the legality verification is completed locally.
  • Redis is an open-source, network-capable, log-type key-value database that can be memory-based or persistent.
  • FIG. 3 shows a schematic flowchart of step S400 of configuring a data cache server to cache the analysis result in the big data analysis method provided by the embodiment of the present invention.
  • in this embodiment, step S400 includes:
  • Step S401: configuring the first cache server and the second cache server;
  • Step S402: storing the analysis result data whose access count is not greater than a preset threshold in the second cache server;
  • Step S403: when the access count of the analysis result data in the second cache server is greater than the preset threshold, transferring that analysis result data to the first cache server.
  • in step S400 of configuring a data cache server to cache the analysis result, two cache servers are configured to store the cached data with more accesses and the cached data with fewer accesses respectively, and each cache server applies its own independent eviction policy. This avoids the inaccurate judgments of a single cache server, which may evict data that is expected to remain cached, and thereby effectively improves the accuracy of the cached data.
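Steps S401 to S403 can be sketched as a two-tier cache in which results start in the second tier and are promoted once their access count exceeds the threshold. This is a minimal in-process sketch, not the patented implementation: the class and method names are hypothetical, and plain dictionaries stand in for the two cache servers.

```python
class TwoTierCache:
    """Two dictionaries standing in for the first and second cache servers."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # the preset access-count threshold
        self.first = {}             # S401: tier for frequently accessed results
        self.second = {}            # S401: tier for rarely accessed results
        self.hits = {}              # access count per key

    def put(self, key, value) -> None:
        if key in self.first:
            self.first[key] = value
            return
        # S402: a result whose count does not exceed the threshold lives in the second tier.
        self.second[key] = value
        self.hits.setdefault(key, 0)

    def get(self, key):
        if key in self.first:
            self.hits[key] += 1
            return self.first[key]
        if key not in self.second:
            return None
        self.hits[key] += 1
        value = self.second[key]
        if self.hits[key] > self.threshold:
            # S403: the count now exceeds the threshold, so promote to the first tier.
            self.first[key] = self.second.pop(key)
        return value


cache = TwoTierCache(threshold=2)
cache.put("weekly-report", {"total": 1200})
for _ in range(3):
    cache.get("weekly-report")
promoted = "weekly-report" in cache.first
```

Keeping the counters outside the tiers lets an entry carry its access history with it when it moves between the two stores.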
  • FIG. 4 shows a schematic flowchart further describing step S400 of configuring a data cache server to cache the analysis result in the big data analysis method provided by the embodiment of the present invention.
  • step S400 of configuring a data cache server to cache the analysis result further includes:
  • Step S4401: evicting the analysis result data with the earliest access time from the first cache server;
  • Step S4501: transferring the analysis result data evicted from the first cache server to the second cache server;
  • Step S4601: clearing the analysis result data with the earliest storage time from the second cache server.
  • data eviction is performed after the data storage of the first cache server is full.
  • within a preset period of time, the data in the first cache server whose last access lies furthest from the current time is identified and evicted first, and the evicted data can be transferred to the second cache server.
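The eviction flow above resembles a least-recently-used policy on the first tier combined with first-in-first-out clearing on the second. The sketch below shows one way this could look, assuming fixed capacities for both tiers and ignoring the preset time window for brevity; the class name, method names and capacities are hypothetical.

```python
from collections import OrderedDict


class LruDemotingCache:
    """First tier evicts by oldest access; second tier clears by oldest storage time."""

    def __init__(self, first_capacity: int, second_capacity: int):
        self.first_capacity = first_capacity
        self.second_capacity = second_capacity
        self.first = OrderedDict()   # ordered by last access, oldest first
        self.second = OrderedDict()  # ordered by storage time, oldest first

    def get(self, key):
        if key in self.first:
            self.first.move_to_end(key)  # refresh the access time
            return self.first[key]
        return self.second.get(key)

    def add_to_first(self, key, value) -> None:
        if key in self.first:
            self.first[key] = value
            self.first.move_to_end(key)
            return
        if len(self.first) >= self.first_capacity:
            # S4401: evict the entry with the earliest access time.
            old_key, old_value = self.first.popitem(last=False)
            # S4501: move the evicted entry to the second tier.
            self._store_in_second(old_key, old_value)
        self.first[key] = value

    def _store_in_second(self, key, value) -> None:
        self.second.pop(key, None)
        if len(self.second) >= self.second_capacity:
            # S4601: clear the entry that has been stored the longest.
            self.second.popitem(last=False)
        self.second[key] = value


cache = LruDemotingCache(first_capacity=2, second_capacity=1)
cache.add_to_first("a", 1)
cache.add_to_first("b", 2)
cache.get("a")              # "a" becomes the most recently accessed entry
cache.add_to_first("c", 3)  # first tier is full, so "b" is demoted
```

Demoting rather than deleting gives an evicted result a second chance: it can still be served from the second tier until it becomes the oldest entry there.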
  • FIG. 5 shows a schematic flowchart further describing step S400 of configuring a data cache server to cache the analysis result in the big data analysis method provided by the embodiment of the present invention.
  • step S400 of configuring a data cache server to cache the analysis result further includes:
  • Step S4402: evicting the analysis result data with the fewest accesses within the preset time from the first cache server;
  • Step S4502: transferring the analysis result data evicted from the first cache server to the second cache server;
  • Step S4602: clearing the analysis result data with the earliest storage time from the second cache server.
  • data eviction is performed after the data storage of the first cache server is full.
  • the number of times each piece of data in the first cache server is accessed within a preset period of time is obtained, the data with the fewest accesses within that period is evicted first, and the evicted data can be transferred to the second cache server.
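The frequency-based variant can be sketched by counting, for each entry in the first tier, only the accesses that fall inside a recent time window, and evicting the entry with the smallest count. The window length, the capacities, the injectable `clock` parameter and all names below are assumptions for illustration only.

```python
import time
from collections import OrderedDict, deque


class WindowedLfuCache:
    """First tier evicts the entry with the fewest accesses inside a time window."""

    def __init__(self, first_capacity, second_capacity, window_seconds, clock=time.monotonic):
        self.first_capacity = first_capacity
        self.second_capacity = second_capacity
        self.window = window_seconds  # the preset period of time
        self.clock = clock            # injectable so the sketch can be tested
        self.first = {}
        self.second = OrderedDict()   # ordered by storage time, oldest first
        self.access_log = {}          # key -> deque of access timestamps

    def _recent_count(self, key, now) -> int:
        log = self.access_log[key]
        while log and now - log[0] > self.window:
            log.popleft()             # drop accesses older than the window
        return len(log)

    def get(self, key):
        if key in self.first:
            self.access_log[key].append(self.clock())
            return self.first[key]
        return self.second.get(key)

    def add_to_first(self, key, value) -> None:
        now = self.clock()
        if key not in self.first and len(self.first) >= self.first_capacity:
            # Evict the entry with the fewest accesses within the window.
            victim = min(self.first, key=lambda k: self._recent_count(k, now))
            self._store_in_second(victim, self.first.pop(victim))
            del self.access_log[victim]
        self.first[key] = value
        self.access_log.setdefault(key, deque())

    def _store_in_second(self, key, value) -> None:
        self.second.pop(key, None)
        if len(self.second) >= self.second_capacity:
            self.second.popitem(last=False)  # clear the longest-stored entry
        self.second[key] = value


cache = WindowedLfuCache(first_capacity=2, second_capacity=2, window_seconds=3600)
cache.add_to_first("a", 1)
cache.add_to_first("b", 2)
cache.get("b")
cache.add_to_first("c", 3)  # "a" has no recent accesses, so it is demoted
```

Restricting the count to a window keeps an entry that was popular long ago from occupying the first tier indefinitely.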
  • Embodiment 5: the embodiment of the present invention provides a big data analysis system 600.
  • FIG. 6 is a structural block diagram of a big data analysis system according to Embodiment 5 of the present invention.
  • the big data analysis system 600 includes:
  • the obtaining unit 601 is used to obtain the big data information accessed by the user within the target time period;
  • a storage unit 602 which is configured to store the big data information in a distributed database by time slices;
  • the time slicing can be set to one week as required, and the big data called by the server will be covered by the new big data after one week, so as to realize the updating of the big data.
  • the integrity verification and legality verification of the big data are also included.
  • the distributed database is an HBase database;
  • the big data is stored in the form of row keys (rowkeys) and column names.
  • integrity verification and legality verification of the big data are also performed, wherein the integrity verification is completed by Redis in the network system, and the big data is sent to the server so that the legality verification is completed locally.
  • an execution unit 603, configured to perform a data analysis task on the big data information stored in the distributed database based on a predetermined rule to obtain an analysis result;
  • a cache unit 604, configured to configure a data cache server to cache the analysis result;
  • an output unit 605, configured to, when a data query request from the front end is obtained, retrieve the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request, and output the query result to the front end for visual display.
  • FIG. 7 shows a structural block diagram of the cache unit 604 in the big data analysis system provided by Embodiment 6 of the present invention.
  • the cache unit 604 includes:
  • a configuration module 6041 the configuration module is used to configure the first cache server and the second cache server;
  • a storage module 6042 the storage module is configured to store the analysis result data whose access times are not greater than a preset threshold in the second cache server;
  • Transfer module 6043 the transfer module is configured to transfer the analysis result data to the first cache server when the access times of the analysis result data in the second cache server is greater than a preset threshold.
  • the cache unit 604 configures two cache servers to store the cached data with more accesses and the cached data with fewer accesses respectively, and each cache server applies its own independent eviction policy. This avoids the inaccurate judgments of a single cache server, which may evict data that is expected to remain cached, and thereby effectively improves the accuracy of the cached data.
  • Embodiment 7 of the present invention further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the big data analysis method when executing the computer program.
  • Embodiment 8 of the present invention further provides a storage medium, where a computer program is stored in the storage medium, and when the computer program is executed by a processor, the steps of the big data analysis method are implemented.
  • a computer program may be divided into one or more modules, and the one or more modules are stored in a memory and executed by a processor to accomplish the present invention.
  • One or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the terminal device.
  • the above-mentioned computer program can be divided into units or modules of the berth status display system provided by each of the above-mentioned system embodiments.
  • the above description of the terminal device is only an example and does not constitute a limitation on the terminal device, which may include more or fewer components than described above, combine some components, or use different components; for example, it may include input and output devices, network access devices, buses, etc.
  • the processor may be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor can be a microprocessor or any conventional processor.
  • the above-mentioned processor is the control center of the above-mentioned terminal equipment, and uses various interfaces and lines to connect various parts of the entire user terminal.
  • the above-mentioned memory can be used to store computer programs and/or modules, and the above-mentioned processor implements various functions of the above-mentioned terminal device by running or executing the computer programs and/or modules stored in the memory and calling the data stored in the memory.
  • the memory can mainly include a program storage area and a data storage area, wherein the program storage area can store the operating system, application programs required for at least one function (such as an information collection template display function, a product information release function, etc.), and the like; the data storage area can store data created according to the use of the berth status display system (such as product information collection templates corresponding to different product types, product information that different product providers need to publish, etc.), and the like.
  • the memory may include high-speed random access memory, and may also include non-volatile memory such as hard disks, internal memory, plug-in hard disks, smart memory cards, secure digital cards, flash memory cards, at least one magnetic disk storage device, flash memory devices, or other volatile solid-state storage devices.
  • if the modules/units integrated in the terminal device are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • all or part of the modules/units in the systems of the above embodiments may also be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the functions of the above system embodiments can be realized.
  • the computer program includes computer program code
  • the computer program code may be in the form of source code, object code, executable file or some intermediate forms, and the like.
  • Computer readable media may include: any entity or device capable of carrying computer program code, recording media, USB flash drives, removable hard disks, magnetic disks, optical discs, computer memory, read-only memory, random access memory, electrical carrier signals, telecommunication signals and software distribution media.
  • the big data analysis method and big data analysis system obtain the big data information accessed by users within the target time period; store the big data information in a distributed database by time slices; then perform a data analysis task on the big data information stored in the distributed database based on a predetermined rule to obtain an analysis result; configure a data cache server to cache the analysis result; and, when a data query request from the front end is obtained, retrieve the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request and output the query result to the front end for visual display. This can reduce the load on the processor of the big data analysis system, improve the user's access speed, avoid access stalls, and ensure smooth user access.
  • the computer program product includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated.

Abstract

A big data analysis method and system, and a computer device and a storage medium thereof, relating to the technical field of big data analysis. Said method comprises: acquiring big data information accessed by a user within a target time period (S100); segmenting the big data information according to time and storing same in a distributed database (S200); on the basis of a predetermined rule, executing a data analysis task on the big data information stored in the distributed database, so as to obtain analysis results (S300); configuring a data buffering server to buffer the analysis results (S400); and when a data query request of a front end is acquired, invoking, according to parameters of the data query request, the corresponding analysis result data in the data buffering server as a query result, and outputting the query result to the front end and performing visual display (S500). The present invention can reduce the burden of a processor of a big data analysis system, increase the access speed of a user, avoid the occurrence of stuck access, and ensure smooth user access.

Description

Big data analysis method, system, computer device and storage medium thereof
Technical field
The embodiments of the present invention relate to the technical field of big data analysis, and in particular to a big data analysis method, system, computer device and storage medium thereof.
Background art
With the advent of the big data era, the amount of information on the network has grown exponentially, bringing with it the problem of information overload. Recommendation systems are one of the most effective ways to address information overload, and big data recommendation systems have gradually become a research hotspot in the information field; the "big data era" has arrived. With the arrival of the "big data" era, people's mining and use of massive data heralds a new wave of productivity growth and consumer surplus. Big data is another major disruptive technological revolution in the IT industry after cloud computing and the Internet of Things. As big data arrives, demand for big data keeps increasing, which adds to the load on intelligent analysis systems and on their processors. When an intelligent analysis system is overloaded, user access slows down, stalls, or fails to load data, so that access is no longer smooth for users.
Technical solution
The purpose of the embodiments of the present invention is to provide a big data analysis method, system, computer device and storage medium thereof, so as to solve the problems raised in the background art above.
To achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
A big data analysis method, the method including the following steps:
acquiring the big data information accessed by users within a target time period;
storing the big data information in a distributed database by time slices;
performing a data analysis task on the big data information stored in the distributed database based on a predetermined rule to obtain an analysis result;
configuring a data cache server to cache the analysis result;
when a data query request from the front end is obtained, retrieving the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request, and outputting the query result to the front end for visual display.
As a further limitation of the technical solution of the embodiment of the present invention, the distributed database is an HBase database.
As a further limitation of the technical solution of the embodiment of the present invention, the step of configuring a data cache server to cache the analysis result includes:
configuring a first cache server and a second cache server;
storing the analysis result data whose access count is not greater than a preset threshold in the second cache server;
when the access count of the analysis result data in the second cache server is greater than the preset threshold, transferring that analysis result data to the first cache server.
As a further limitation of the technical solution of the embodiment of the present invention, the step of configuring a data cache server to cache the analysis result further includes:
evicting the analysis result data with the earliest access time from the first cache server;
transferring the analysis result data evicted from the first cache server to the second cache server;
clearing the analysis result data with the earliest storage time from the second cache server.
As a further limitation of the technical solution of the embodiment of the present invention, the step of configuring a data cache server to cache the analysis result further includes:
evicting the analysis result data with the fewest accesses within a preset time from the first cache server;
transferring the analysis result data evicted from the first cache server to the second cache server;
clearing the analysis result data with the earliest storage time from the second cache server.
A big data analysis system, the system including:
an acquisition unit configured to acquire the big data information accessed by a user within a target time period;
a storage unit configured to store the big data information in a distributed database in time slices;
an execution unit configured to perform, based on a predetermined rule, a data analysis task on the big data information stored in the distributed database to obtain an analysis result;
a cache unit configured to configure a data cache server to cache the analysis result; and
an output unit configured to, upon receiving a data query request from the front end, retrieve the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request, and output the query result to the front end for visual display.
As a further limitation of the technical solution of the embodiment of the present invention, the cache unit includes:
a configuration module configured to configure a first cache server and a second cache server;
a storage module configured to store, in the second cache server, the analysis result data whose access count is not greater than a preset threshold; and
a transfer module configured to transfer analysis result data from the second cache server to the first cache server when the access count of that analysis result data becomes greater than the preset threshold.
A computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method when executing the computer program.
A storage medium storing a computer program, wherein the computer program implements the steps of the method when executed by a processor.
Beneficial effects
Compared with the prior art, the big data analysis method provided by the embodiment of the present invention acquires the big data information accessed by users within a target time period; stores the big data information in a distributed database in time slices; performs, based on predetermined rules, a data analysis task on the big data information stored in the distributed database to obtain an analysis result; configures a data cache server to cache the analysis result; and, upon receiving a data query request from the front end, retrieves the corresponding analysis result data from the data cache server as the query result according to the parameters of the request and outputs the query result to the front end for visual display. This reduces the load on the processor of the big data analysis system, increases users' access speed, avoids access stalls, and keeps user access smooth.
Description of drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention.
FIG. 1 is a diagram of a system architecture to which the big data analysis method of the embodiments of the present invention is applicable.
FIG. 2 is an implementation flowchart of the big data analysis method provided in Embodiment 1 of the present invention.
FIG. 3 is an implementation flowchart of the big data analysis method provided in Embodiment 2 of the present invention.
FIG. 4 is an implementation flowchart of the big data analysis method provided in Embodiment 3 of the present invention.
FIG. 5 is an implementation flowchart of the big data analysis method provided in Embodiment 4 of the present invention.
FIG. 6 is a structural block diagram of the big data analysis system provided in Embodiment 5 of the present invention.
FIG. 7 is a structural block diagram of the cache unit in the big data analysis system provided in Embodiment 6 of the present invention.
Embodiments of the present invention
To make the technical problems to be solved, the technical solutions, and the beneficial effects of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.
In the embodiment of the present invention, the big data information accessed by users within a target time period is acquired; the big data information is stored in a distributed database in time slices; a data analysis task is performed, based on predetermined rules, on the big data information stored in the distributed database to obtain an analysis result; a data cache server is configured to cache the analysis result; and, upon receiving a data query request from the front end, the corresponding analysis result data is retrieved from the data cache server as the query result according to the parameters of the request, and the query result is output to the front end for visual display. This reduces the load on the processor of the big data analysis system, increases users' access speed, avoids access stalls, and keeps user access smooth.
FIG. 1 shows an exemplary system architecture to which an embodiment of the big data analysis method of the present disclosure can be applied.
As shown in FIG. 1, the system architecture may include a terminal, a distributed database, and a cache server.
A user can use the terminal to interact with the cache server over a network, for example to receive or send messages.
The terminal may be hardware or software. When the terminal is hardware, it may be any of various electronic devices with a communication function, including but not limited to smartphones, tablet computers, e-book readers, MP3 players, MP4 players, laptop computers, and desktop computers. When the terminal is software, it may be installed in any of the electronic devices listed above, and may be implemented as multiple pieces of software or software modules or as a single piece of software or software module. No specific limitation is imposed here.
It should be noted that the cache server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules or as a single piece of software or software module. No specific limitation is imposed here.
It should be understood that the numbers of terminals and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.
Embodiment 1: This embodiment of the present invention provides a big data analysis method.
Refer to FIG. 2, which shows the flow of one embodiment of the big data analysis method. This embodiment is described mainly with the method applied to an electronic device with a certain computing capability, which may be the terminal shown in FIG. 1. The big data analysis method includes the following steps:
Step S100: acquiring the big data information accessed by the user within the target time period;
In step S100 provided by the embodiment of the present invention, when a user performs a search or access using a terminal, the terminal acquires the data of the user's searches and accesses within the target time period and stores the time node and text information of that data; the stored time node information and text information are then sent over the network.
The network may be the medium that provides the communication link between the terminal and the server. The network may include various connection types, such as wired links, wireless communication links, or fiber optic cables, without limitation.
Step S200: storing the big data information in a distributed database in time slices;
In step S200 provided by the embodiment of the present invention, the time slice may be set to one week as required; after one week, the big data used by the server is overwritten by new big data, so that the big data is kept up to date.
In this big data analysis method, integrity verification and legality verification of the big data are also performed before the data is stored in the distributed database.
Step S300: performing, based on a predetermined rule, a data analysis task on the big data information stored in the distributed database to obtain an analysis result;
Step S400: configuring a data cache server to cache the analysis result;
Step S500: upon receiving a data query request from the front end, retrieving the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request, and outputting the query result to the front end for visual display.
In step S500 provided by the embodiment of the present invention, the visual display uses the display screen of the terminal to show the output query result so that the user can read it.
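The lookup in step S500 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the patent does not specify how query parameters map to cached results, so the JSON-based key, the function names `cache_key` and `handle_query`, and the plain dictionary standing in for the data cache server are all hypothetical.

```python
import json


def cache_key(params: dict) -> str:
    """Build a deterministic key so that the same parameters,
    supplied in any order, map to the same cached entry (assumed scheme)."""
    return json.dumps(params, sort_keys=True, separators=(",", ":"))


def handle_query(params: dict, cache: dict):
    """Return the cached analysis result for these parameters,
    or None when no result has been cached for them."""
    return cache.get(cache_key(params))
```

Because the result is read from the cache rather than recomputed, the query path does not touch the distributed database, which is the source of the reduced processor load described above.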
Further, in a preferred embodiment provided by the present invention, the distributed database is an HBase database, which stores the big data by row key (rowkey) and column name.
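One way to combine the weekly time slice with row-key storage is sketched below. The key layout, the column names, and the helper names `slice_id`, `make_rowkey`, and `make_row` are assumptions made for illustration; the patent states only that a row key and column names are used. The sketch builds plain Python values and does not call any HBase client. Timezone-aware UTC datetimes are assumed.

```python
from datetime import datetime, timezone

SLICE_SECONDS = 7 * 24 * 3600  # one-week time slice (assumed from step S200)


def slice_id(ts: datetime) -> int:
    """Index of the weekly slice containing ts, counted from the Unix epoch."""
    return int(ts.timestamp()) // SLICE_SECONDS


def make_rowkey(user_id: str, ts: datetime) -> bytes:
    """Hypothetical layout: <slice>#<user>#<timestamp>.
    Fixed-width fields keep the keys in chronological order."""
    return f"{slice_id(ts):06d}#{user_id}#{ts:%Y%m%d%H%M%S}".encode("utf-8")


def make_row(user_id: str, ts: datetime, text: str):
    """Return (rowkey, {column name: value}) for one access record."""
    columns = {
        b"d:ts": ts.isoformat().encode("utf-8"),   # time node
        b"d:text": text.encode("utf-8"),           # text information
    }
    return make_rowkey(user_id, ts), columns
```

Placing the slice index first means all records of one week share a key prefix, so a slice can be scanned or replaced as a unit. A production key design would also need to consider write hot-spotting, which this sketch ignores.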
In this big data analysis method, integrity verification and legality verification of the big data are also performed before the data is stored in the distributed database. The integrity verification is performed by Redis in the network system; once it has passed, the big data is sent to the server, which performs the legality verification locally.
Redis is an open-source, network-capable, log-structured key-value database that can run in memory and can also persist its data.
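The two-stage check before storage can be sketched as follows. The patent does not say what the integrity or legality checks consist of, so everything here is an assumption: integrity is modelled as comparing a SHA-256 digest against one held in a key-value store (a dictionary stands in for Redis), and legality is modelled as a local schema check for `ts` and `text` fields. The function names are hypothetical.

```python
import hashlib
import json


def integrity_ok(payload: bytes, record_id: str, kv: dict) -> bool:
    """Compare the payload digest with the digest held in the key-value store."""
    expected = kv.get(f"digest:{record_id}")
    return expected == hashlib.sha256(payload).hexdigest()


def legality_ok(payload: bytes) -> bool:
    """Local check: the payload must be a JSON object with 'ts' and 'text'."""
    try:
        record = json.loads(payload)
    except ValueError:
        return False
    return isinstance(record, dict) and set(record) >= {"ts", "text"}


def accept(payload: bytes, record_id: str, kv: dict) -> bool:
    """Integrity is checked first; only records that pass it
    go on to the local legality check."""
    return integrity_ok(payload, record_id, kv) and legality_ok(payload)
```

Only records for which `accept` returns true would be written to the distributed database.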
Embodiment 2: FIG. 3 is a schematic flowchart of step S400, configuring a data cache server to cache the analysis result, in the big data analysis method provided by the embodiment of the present invention. Step S400 includes:
Step S401: configuring a first cache server and a second cache server;
Step S402: storing, in the second cache server, the analysis result data whose access count is not greater than a preset threshold;
Step S403: when the access count of analysis result data in the second cache server becomes greater than the preset threshold, transferring that analysis result data to the first cache server.
In step S400 above, two cache servers are configured to store frequently accessed and infrequently accessed cached data respectively, and each cache server applies its own independent eviction policy. This avoids the situation in which a single cache server judges inaccurately and evicts data that should remain cached, and thereby effectively improves the accuracy of the cached data.
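Steps S401 to S403 can be sketched as a small two-tier structure. This is a minimal illustration, not the patented implementation: the class name `TwoTierCache`, the per-key access counter, and the use of two dictionaries in place of two cache servers are assumptions.

```python
class TwoTierCache:
    """Two dictionaries stand in for the first and second cache servers."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.first = {}   # results accessed more than `threshold` times
        self.second = {}  # results accessed at most `threshold` times
        self.hits = {}    # access count per key

    def put(self, key, value) -> None:
        # S402: a new result has no accesses yet, so it starts in the second cache.
        self.hits.setdefault(key, 0)
        if key in self.first:
            self.first[key] = value
        else:
            self.second[key] = value

    def get(self, key):
        if key in self.first:
            self.hits[key] += 1
            return self.first[key]
        if key not in self.second:
            return None
        self.hits[key] += 1
        if self.hits[key] > self.threshold:
            # S403: the access count exceeded the threshold, so promote the entry.
            self.first[key] = self.second.pop(key)
            return self.first[key]
        return self.second[key]
```

With a threshold of 2, an entry stays in the second cache for its first two reads and moves to the first cache on the third.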
Embodiment 3: FIG. 4 is a schematic flowchart that further illustrates step S400, configuring a data cache server to cache the analysis result, in the big data analysis method provided by the embodiment of the present invention. Step S400 further includes:
Step S4401: evicting the analysis result data with the earliest access time from the first cache server;
Step S4501: transferring the analysis result data evicted from the first cache server to the second cache server;
Step S4601: clearing the analysis result data with the earliest storage time from the second cache server.
Specifically, in a preferred embodiment, data eviction is performed once the data storage of the first cache server is full.
Within a preset period, the data in the first cache server whose most recent access lies furthest in the past is identified; the data with the earliest access time within the preset period is evicted first, and the evicted data can be transferred to the second cache server.
As a result, data that was last accessed some time ago and has been accessed infrequently of late is not discarded outright. This avoids misjudgment, prevents a single cache server from judging inaccurately and evicting data that should remain cached, and thereby effectively improves the accuracy of the cached data.
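The eviction chain of steps S4401 to S4601 can be sketched with two ordered maps. This is a sketch under assumptions: the capacities, the class name `DemotingCache`, and its method names are hypothetical, and both capacities are assumed to be at least 1. The first tier is ordered by last access and the second tier by time of storage.

```python
from collections import OrderedDict


class DemotingCache:
    def __init__(self, first_capacity: int, second_capacity: int) -> None:
        self.first_capacity = first_capacity
        self.second_capacity = second_capacity
        self.first = OrderedDict()   # oldest access first
        self.second = OrderedDict()  # oldest storage first

    def touch(self, key):
        """Read from the first cache and mark the entry as most recently accessed."""
        self.first.move_to_end(key)
        return self.first[key]

    def put_first(self, key, value) -> None:
        if key not in self.first and len(self.first) >= self.first_capacity:
            # S4401: evict the entry with the earliest access time.
            old_key, old_value = self.first.popitem(last=False)
            # S4501: the evicted entry is demoted rather than discarded.
            self._put_second(old_key, old_value)
        self.first[key] = value
        self.first.move_to_end(key)

    def _put_second(self, key, value) -> None:
        self.second.pop(key, None)  # re-storing resets the storage time
        if len(self.second) >= self.second_capacity:
            # S4601: clear the entry that has been stored the longest.
            self.second.popitem(last=False)
        self.second[key] = value
```

An entry therefore leaves the system only after it has been evicted from the first tier and has then aged out of the second tier.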
Embodiment 4: FIG. 5 is a schematic flowchart that illustrates in still further detail step S400, configuring a data cache server to cache the analysis result, in the big data analysis method provided by the embodiment of the present invention. Step S400 further includes:
Step S4402: evicting, from the first cache server, the analysis result data that was accessed the fewest times within a preset time;
Step S4502: transferring the analysis result data evicted from the first cache server to the second cache server;
Step S4602: clearing the analysis result data with the earliest storage time from the second cache server.
Specifically, in a preferred embodiment, data eviction is performed once the data storage of the first cache server is full.
The number of times each item of data in the first cache server was accessed within a preset period is obtained; the data with the fewest accesses within that period is evicted first, and the evicted data can be transferred to the second cache server.
As a result, data that has been accessed many times overall but infrequently of late is not discarded outright. This avoids misjudgment, prevents a single cache server from judging inaccurately and evicting data that should remain cached, and thereby effectively improves the accuracy of the cached data.
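Choosing the eviction candidate by access count within a preset period can be sketched as follows. The class name `WindowedCounter`, its methods, and the use of explicit numeric timestamps are assumptions made so that the example is deterministic; the patent does not specify how accesses are counted.

```python
from collections import deque


class WindowedCounter:
    """Counts accesses per key within the last `window` time units."""

    def __init__(self, window: float) -> None:
        self.window = window
        self.accesses = {}  # key -> deque of access timestamps, oldest first

    def record(self, key, now: float) -> None:
        self.accesses.setdefault(key, deque()).append(now)

    def count(self, key, now: float) -> int:
        q = self.accesses.get(key, deque())
        # Drop accesses that have fallen out of the preset period.
        while q and q[0] <= now - self.window:
            q.popleft()
        return len(q)

    def least_accessed(self, keys, now: float):
        """Return the key with the fewest accesses inside the window."""
        return min(keys, key=lambda k: self.count(k, now))
```

The key returned by `least_accessed` is the one the first cache server would evict and hand to the second cache server. An entry with many old accesses and none inside the window is chosen ahead of an entry with fewer but recent accesses.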
Embodiment 5: This embodiment of the present invention provides a big data analysis system 600.
FIG. 6 is a structural block diagram of the big data analysis system provided in Embodiment 5 of the present invention.
Specifically, the big data analysis system 600 includes:
an acquisition unit 601 configured to acquire the big data information accessed by a user within a target time period;
a storage unit 602 configured to store the big data information in a distributed database in time slices;
In the embodiment of the present invention, the time slice may be set to one week as required; after one week, the big data used by the server is overwritten by new big data, so that the big data is kept up to date. In this big data analysis method, integrity verification and legality verification of the big data are also performed before the data is stored in the distributed database.
Further, in a preferred embodiment provided by the present invention, the distributed database is an HBase database, which stores the big data by row key (rowkey) and column name.
In this big data analysis method, integrity verification and legality verification of the big data are also performed before the data is stored in the distributed database. The integrity verification is performed by Redis in the network system; once it has passed, the big data is sent to the server, which performs the legality verification locally.
an execution unit 603 configured to perform, based on a predetermined rule, a data analysis task on the big data information stored in the distributed database to obtain an analysis result;
a cache unit 604 configured to configure a data cache server to cache the analysis result; and
an output unit 605 configured to, upon receiving a data query request from the front end, retrieve the corresponding analysis result data from the data cache server as the query result according to the parameters of the data query request, and output the query result to the front end for visual display.
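The relationship between units 601 to 605 can be sketched as a composition of five callables. This is only an illustrative arrangement: the class name, the field names, and the `refresh` and `query` methods are hypothetical, and the patent does not prescribe how the units are invoked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BigDataAnalysisSystem:
    acquire: Callable[[], list]       # acquisition unit 601
    store: Callable[[list], None]     # storage unit 602
    analyze: Callable[[], dict]       # execution unit 603
    cache: Callable[[dict], None]     # cache unit 604
    output: Callable[[dict], object]  # output unit 605

    def refresh(self) -> None:
        """Acquire and store new data, then analyze it and cache the result."""
        self.store(self.acquire())
        self.cache(self.analyze())

    def query(self, params: dict):
        """Serve a front-end query from the cached analysis results."""
        return self.output(params)
```

For example, with in-memory stand-ins for the database and the cache:

```python
db, cached = [], {}
system = BigDataAnalysisSystem(
    acquire=lambda: ["a", "b", "a"],
    store=db.extend,
    analyze=lambda: {"count": len(db)},
    cache=cached.update,
    output=lambda p: cached.get(p["metric"]),
)
system.refresh()
result = system.query({"metric": "count"})
```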
Embodiment 6: FIG. 7 shows a structural block diagram of the cache unit 604 in the big data analysis system provided by Embodiment 6 of the present invention. The cache unit 604 includes:
a configuration module 6041 configured to configure a first cache server and a second cache server;
a storage module 6042 configured to store, in the second cache server, the analysis result data whose access count is not greater than a preset threshold; and
a transfer module 6043 configured to transfer analysis result data from the second cache server to the first cache server when the access count of that analysis result data becomes greater than the preset threshold.
By configuring two cache servers, the cache unit 604 stores frequently accessed and infrequently accessed cached data separately, and each cache server applies its own independent eviction policy. This avoids the situation in which a single cache server judges inaccurately and evicts data that should remain cached, and thereby effectively improves the accuracy of the cached data.
Embodiment 7: Embodiment 7 of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the big data analysis method when executing the computer program.
Embodiment 8: Embodiment 8 of the present invention further provides a storage medium storing a computer program, wherein the computer program implements the steps of the big data analysis method when executed by a processor.
By way of example, the computer program may be divided into one or more modules, which are stored in the memory and executed by the processor to carry out the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program in the terminal device. For example, the computer program may be divided into the units or modules of the big data analysis system provided by the system embodiments above.
Those skilled in the art will understand that the above description of the terminal device is only an example and does not limit the terminal device, which may include more or fewer components than described, a combination of certain components, or different components; for example, it may include input/output devices, network access devices, buses, and the like.
The processor may be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the terminal device and connects the various parts of the entire user terminal through various interfaces and lines.
The memory may be used to store computer programs and/or modules; the processor implements the various functions of the terminal device by running or executing the computer programs and/or modules stored in the memory and by calling the data stored in the memory. The memory may mainly include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required for at least one function (such as an information collection template display function, a product information release function, etc.); the data storage area may store data created through use of the big data analysis system (such as product information collection templates corresponding to different product types, product information that different product providers need to publish, etc.). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a smart memory card, a secure digital card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the modules/units integrated in the terminal device are implemented as software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. On this understanding, all or some of the modules/units in the systems of the above embodiments may also be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the functions of the system embodiments above.
The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory, random access memory, an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.
In summary, the big data analysis method and big data analysis system provided by the embodiments of the present invention acquire the big data information accessed by users within a target time period; store the big data information in a distributed database in time slices; perform, based on predetermined rules, a data analysis task on the big data information stored in the distributed database to obtain an analysis result; configure a data cache server to cache the analysis result; and, upon receiving a data query request from the front end, retrieve the corresponding analysis result data from the data cache server as the query result according to the parameters of the request and output the query result to the front end for visual display. This reduces the load on the processor of the big data analysis system, increases users' access speed, avoids access stalls, and keeps user access smooth.
Alternatively, the embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part.
The above are only preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (9)

  1. 大数据分析方法,其特征在于,所述方法包括以下步骤:获取目标时间段内用户访问的大数据信息;将所述大数据信息按时间分片存储在分布式数据库中;基于预定规则对存储在所述分布式数据库中的大数据信息执行数据分析任务,得到分析结果;对所述分析结果配置数据缓存服务器进行缓存;A big data analysis method, characterized in that the method comprises the following steps: acquiring big data information accessed by users within a target time period; storing the big data information in a distributed database by time slices; Perform data analysis tasks on the big data information in the distributed database to obtain analysis results; configure a data cache server to cache the analysis results;
    在获取到前端的数据查询请求时,依据数据查询请求的参数在所述数据缓存服务器中调用对应的分析结果数据作为查询结果,输出查询结果至前端并进行可视化展示。When the front-end data query request is obtained, the corresponding analysis result data is called in the data cache server as the query result according to the parameters of the data query request, and the query result is output to the front-end for visual display.
  2. 根据权利要求1所述的大数据分析方法,其特征在于,所述分布式数据库为Hbase数据库。The big data analysis method according to claim 1, wherein the distributed database is an HBase database.
  3. 根据权利要求1或2所述的大数据分析方法,其特征在于,对所述分析结果配置数据缓存服务器进行缓存的步骤包括:配置第一缓存服务器与第二缓存服务器;将访问次数不大于预设阈值的分析结果数据存储在所述第二缓存服务器中;当所述第二缓存服务器中的分析结果数据的访问次数大于预设阈值时,将该分析结果数据转移至所述第一缓存服务器。The big data analysis method according to claim 1 or 2, wherein the step of configuring a data cache server to cache the analysis result comprises: configuring a first cache server and a second cache server; The analysis result data with a threshold is stored in the second cache server; when the number of accesses to the analysis result data in the second cache server is greater than a preset threshold, the analysis result data is transferred to the first cache server .
  4. 根据权利要求3所述的大数据分析方法,其特征在于,对所述分析结果配置数据缓存服务器进行缓存的步骤还包括:淘汰所述第一缓存服务器中访问时间最早的分析结果数据;将所述第一缓存服务器中淘汰的分析结果数据转移至所述第二缓存服务器;清除所述第二缓存服务器中存储时间最早的分析结果数据。The big data analysis method according to claim 3, wherein the step of configuring a data cache server for caching the analysis results further comprises: eliminating the analysis result data with the earliest access time in the first cache server; The analysis result data eliminated in the first cache server is transferred to the second cache server; the analysis result data with the earliest storage time in the second cache server is cleared.
  5. 根据权利要求3所述的大数据分析方法,其特征在于,对所述分析结果配置数据缓存服务器进行缓存的步骤还包括:淘汰所述第一缓存服务器中在预设时间内访问次数最少的分析结果数据;将所述第一缓存服务器中淘汰的分析结果数据转移至所述第二缓存服务器;清除所述第二缓存服务器中存储时间最早的分析结果数据。The big data analysis method according to claim 3, wherein the step of configuring a data cache server to cache the analysis results further comprises: eliminating the analysis with the least number of visits within a preset time in the first cache server result data; transfer the analysis result data eliminated in the first cache server to the second cache server; clear the analysis result data with the earliest storage time in the second cache server.
  6. A big data analysis system, characterized in that the system comprises: an acquisition unit configured to acquire big data information accessed by users within a target time period; a storage unit configured to store the big data information in a distributed database in time slices; an execution unit configured to perform a data analysis task, based on predetermined rules, on the big data information stored in the distributed database to obtain an analysis result; a cache unit configured to configure a data cache server to cache the analysis result; and an output unit configured to, when a data query request is received from the front end, retrieve the corresponding analysis result data from the data cache server according to the parameters of the data query request as the query result, and output the query result to the front end for visual display.
  7. The big data analysis system according to claim 6, characterized in that the cache unit comprises: a configuration module configured to configure a first cache server and a second cache server; a storage module configured to store analysis result data whose number of accesses is not greater than a preset threshold in the second cache server; and a transfer module configured to transfer analysis result data to the first cache server when the number of accesses to that analysis result data in the second cache server is greater than the preset threshold.
  8. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1-5.
  9. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-5.
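
The two-tier caching recited in claims 3 and 4 can be illustrated with a minimal in-process sketch. It is not the applicant's implementation: the class name `TwoTierCache`, the capacity parameters, the use of in-memory dictionaries in place of real cache servers, and the reset of the access count on demotion are all assumptions made for illustration.

```python
from collections import OrderedDict


class TwoTierCache:
    """Illustrative stand-in for the first (hot) and second (cold) cache servers."""

    def __init__(self, threshold, first_capacity, second_capacity):
        self.threshold = threshold              # preset access-count threshold
        self.first_capacity = first_capacity    # hypothetical size limit, first cache
        self.second_capacity = second_capacity  # hypothetical size limit, second cache
        self.first = OrderedDict()   # ordered by last access, oldest first
        self.second = OrderedDict()  # ordered by storage time, oldest first
        self.hits = {}               # access count per key

    def put(self, key, value):
        """Store an analysis result; new results start in the second cache."""
        if key in self.first:
            self.first[key] = value
            return
        self.hits.setdefault(key, 0)
        self._store_second(key, value)

    def get(self, key):
        """Return a cached result, promoting it once its count exceeds the threshold."""
        if key in self.first:
            self.hits[key] += 1
            self.first.move_to_end(key)  # mark as most recently accessed
            return self.first[key]
        if key in self.second:
            self.hits[key] += 1
            value = self.second[key]
            if self.hits[key] > self.threshold:
                del self.second[key]
                self._store_first(key, value)
            return value
        return None

    def _store_first(self, key, value):
        self.first[key] = value
        while len(self.first) > self.first_capacity:
            # Evict the entry with the earliest access time and demote it.
            old_key, old_value = self.first.popitem(last=False)
            self.hits[old_key] = 0  # assumption: the count restarts after demotion
            self._store_second(old_key, old_value)

    def _store_second(self, key, value):
        self.second.pop(key, None)
        self.second[key] = value
        while len(self.second) > self.second_capacity:
            # Clear the entry with the earliest storage time.
            old_key, _ = self.second.popitem(last=False)
            self.hits.pop(old_key, None)
```

The variant in claim 5 would differ only in the eviction choice for the first cache: instead of the entry with the earliest access time, it would pick the entry with the fewest accesses within a preset time window.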
PCT/CN2020/127948 2020-10-20 2020-11-11 Big data analysis method and system, and computer device and storage medium thereof WO2022082892A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011127437.3 2020-10-20
CN202011127437.3A CN112269830A (en) 2020-10-20 2020-10-20 Big data analysis method, system, computer equipment and storage medium thereof

Publications (1)

Publication Number Publication Date
WO2022082892A1 (en) 2022-04-28

Family

ID=74341583

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/127948 WO2022082892A1 (en) 2020-10-20 2020-11-11 Big data analysis method and system, and computer device and storage medium thereof

Country Status (2)

Country Link
CN (1) CN112269830A (en)
WO (1) WO2022082892A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114860774A (en) * 2022-05-19 2022-08-05 宁波奥克斯电气股份有限公司 Big data real-time analysis method and system of air conditioner, storage medium and air conditioner
CN114880362A (en) * 2022-07-08 2022-08-09 中化现代农业有限公司 Data analysis system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140067920A1 (en) * 2012-08-31 2014-03-06 International Business Machines Corporation Data analysis system
CN108268468A (en) * 2016-12-30 2018-07-10 北京京东尚科信息技术有限公司 The analysis method and system of a kind of big data
CN109359095A (en) * 2018-09-11 2019-02-19 东华大学 A kind of DLK method that big data is quickly read
CN110489427A (en) * 2019-08-26 2019-11-22 杭州城市大数据运营有限公司 A kind of data query method, apparatus, computer equipment and storage medium
CN111046083A (en) * 2019-12-13 2020-04-21 北京中电普华信息技术有限公司 Data analysis method and system and big data platform
CN111737325A (en) * 2020-05-25 2020-10-02 南京华盾电力信息安全测评有限公司 Power data analysis method and device based on big data technology

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115145712A (en) * 2022-09-06 2022-10-04 云账户技术(天津)有限公司 Interface calling method, device, equipment and medium for external service platform
CN116489178A (en) * 2023-04-25 2023-07-25 安徽亚高科技有限公司 Method and device for distributed storage of communication information
CN116489178B (en) * 2023-04-25 2023-09-22 安徽亚高科技有限公司 Method and device for distributed storage of communication information

Also Published As

Publication number Publication date
CN112269830A (en) 2021-01-26

Similar Documents

Publication Publication Date Title
WO2022082892A1 (en) Big data analysis method and system, and computer device and storage medium thereof
US9817879B2 (en) Asynchronous data replication using an external buffer table
CN109947668B (en) Method and device for storing data
CN109379395B (en) Interface data cache setting method and terminal equipment
CN108629029B (en) Data processing method and device applied to data warehouse
CN110119304B (en) Interrupt processing method and device and server
JP2020531949A (en) Lazy update of database hash code in blockchain
CN110019496B (en) Data reading and writing method and system
CN108536617B (en) Cache management method, medium, system and electronic device
CN113391765A (en) Data storage method, device, equipment and medium based on distributed storage system
CN111125107A (en) Data processing method, device, electronic equipment and medium
CN108363741B (en) Big data unified interface method, device, equipment and storage medium
US20200349081A1 (en) Method, apparatus and computer program product for managing metadata
CN107169115A (en) Add the method and device of self-defined participle
CN115269063A (en) Process creation method, system, device and medium
CN111984723A (en) Data synchronization method and device and terminal equipment
CN112100211B (en) Data storage method, apparatus, electronic device, and computer readable medium
CN114780361A (en) Log generation method, device, computer system and readable storage medium
US20210149874A1 (en) Selectively processing an event published responsive to an operation on a database record that relates to consent
CN112182111A (en) Block chain based distributed system layered processing method and electronic equipment
US9852142B2 (en) Method for EN passant workload shift detection
CN113362097B (en) User determination method and device
US20190034249A1 (en) Message processing
WO2019126720A1 (en) A system and method for optimization and load balancing of computer clusters
US11941074B2 (en) Fetching a query result using a query filter

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20958470

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20958470

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 241023)