WO2021082858A1 - Data collection method and device - Google Patents

Data collection method and device

Info

Publication number
WO2021082858A1
Authority
WO
WIPO (PCT)
Prior art keywords
collection
preset
queue
event
reported
Prior art date
Application number
PCT/CN2020/119039
Other languages
English (en)
French (fr)
Inventor
谢雪彦
林挺
卢道和
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2021082858A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/51 Discovery or management thereof, e.g. service location protocol [SLP] or web services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Definitions

  • the present invention relates to the technical field of financial technology (Fintech), in particular to a data collection method and device.
  • the user churn model corresponding to a product can be used to analyze the churn of that product's users over a future period of time, so that the marketing department can adjust the marketing strategy of the product and reduce user churn.
  • the data collection process is generally performed by using a receive-and-send strategy.
  • when the business system detects that a service triggers a buried point event, it generates a collection event corresponding to the buried point event and immediately reports the collection event to the collection server.
  • this approach causes problems in business systems that support parallel processing. For example, when multiple buried point events are triggered at the same time, the business system reports the multiple collection events corresponding to those buried point events to the collection server in parallel. In this case, the collection server needs to receive multiple collection events at the same time, so the collection server faces greater pressure and its normal operation may be affected.
  • the invention provides a data collection method and device, which are used to reduce the pressure of the collection server.
  • the present invention provides a data collection method.
  • the method includes: after detecting that a user triggers a preset operation, generating multiple corresponding collection events when performing the task corresponding to the preset operation, and reporting the multiple collection events through multiple reporting processes, where in each reporting process one or more collection events that have not been reported among the multiple collection events are reported to the collection server. Each of the multiple collection events is used to record the event of executing one subtask in the task.
  • the collection events are reported through multiple reporting processes, and there is no need to report all the collection events to the collection server at once.
  • the above design can reduce the pressure on the collection server and improve the efficiency of data collection.
  • reporting multiple collection events through multiple reporting processes includes: storing the multiple collection events in a preset queue in sequence, and, in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, reporting one collection event that has not been reported in the preset queue to the collection server.
  • the generated collection events can be stored with less space, and multiple collection events can be accurately stored in the time sequence of generation, thereby improving the accuracy of data collection.
  • this solution starts the next reporting process only after the collection event reported in the current reporting process has been successfully reported. This reduces the pressure on the collection server while also reducing the loss rate of collection events and improving the success rate with which the collection server receives collection events.
  • reporting an unreported collection event in the preset queue to the collection server includes: acquiring the collection event stored in the position immediately after the position indicated by the preset cursor in the preset queue, reporting it to the collection server, and then updating the preset cursor to that next position.
  • the preset cursor is used to indicate the position of the reported collection event in the preset queue during each reporting process.
  • the method further includes: after determining that the collection event reported in the previous reporting process was not successfully reported to the collection server, repeating the reporting of that collection event; when the number of repeated reports exceeds a preset number of times, caching the collection events between the position indicated by the preset cursor in the preset queue and the position at the end of the preset queue to a preset location; and, after the fault is eliminated, reporting the collection events cached in the preset location to the collection server.
  • repeated failures of the reporting process are generally caused by a network failure. Therefore, by caching the collection events that have not been successfully reported and re-reporting them after the network recovers, the collection server can collect the most complete data, which improves the success rate of data collection.
  • if the position indicated by the preset cursor is the position at the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the position at the end of the preset queue have been cached to the preset location, the preset queue is deleted.
  • by deleting the preset queue after the caching succeeds, or after all the collection events in the preset queue have been reported, meaningless occupation of space is avoided and the performance loss of the system is reduced.
  • reporting an unreported collection event in the preset queue to the collection server includes: obtaining the collection event stored in the starting position of the preset queue and reporting it to the collection server; further, if it is determined that the collection event reported in this reporting process was successfully reported to the collection server, deleting the collection event stored in the starting position of the preset queue and moving the collection events stored in the other positions of the preset queue forward by one position in turn, so that the collection event stored in the position after the starting position is moved to the starting position.
  • the storage space occupied by the preset queue can be reduced, so that the storage space of the preset queue does not need to be expanded again when collection events increase, which improves the availability and utilization of the storage space of the preset queue.
  • the present invention provides a data collection device, which includes:
  • the processing module is used to generate multiple corresponding collection events when the task corresponding to the preset operation is executed after detecting that the user triggers the preset operation; each collection event is used to record the event of each subtask in the execution of the task;
  • the transceiver module is used to report multiple collection events through multiple reporting processes; in each reporting process, one or more collection events that are not reported among the multiple collection events are reported to the collection server.
  • the transceiver module is specifically used to: store the multiple collection events in a preset queue in sequence, and, in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, report one collection event that has not been reported in the preset queue to the collection server.
  • the transceiver module is specifically used to: obtain the collection event stored in the position immediately after the position indicated by the preset cursor in the preset queue, report it to the collection server, and then update the preset cursor to that next position.
  • the preset cursor is used to indicate the position of the reported collection event in the preset queue during each reporting process.
  • the device also includes a cache module.
  • the transceiver module is also used to: after determining that the collection event reported in the previous reporting process has not been successfully reported to the collection server, repeat the reporting of the collection event reported in the previous reporting process.
  • the cache module is used to cache the collection events between the position indicated by the preset cursor in the preset queue and the position at the end of the preset queue to the preset location.
  • the transceiver module is also used to report the collection event buffered in the preset location to the collection server after troubleshooting.
  • the processing module is also used to: delete the preset queue if the position indicated by the preset cursor is the position at the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the position at the end of the preset queue have been cached to the preset location.
  • before the multiple collection events are stored in the preset queue in sequence, the cache module can also query whether collection events exist in the preset location. If so, the collection events cached in the preset location are stored starting from the initial position of the preset queue, and the multiple collection events are then stored in sequence after them. If not, the multiple collection events are stored in sequence starting from the initial position of the preset queue.
  • the transceiver module is specifically used to: obtain the collection event stored in the starting position of the preset queue and report it to the collection server; and, if it is determined that the collection event reported in this reporting process was successfully reported to the collection server, delete the collection event stored in the starting position of the preset queue and move the collection events stored in the other positions of the preset queue forward by one position in turn, so that the collection event stored in the position after the starting position is moved to the starting position.
  • the present invention provides a computing device, including at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program, and when the program is executed by the processing unit, the processing unit executes any data collection method of the first aspect above.
  • a computer-readable storage medium provided by an embodiment of the present invention stores a computer program executable by a computing device, and when the program runs on the computing device, the computing device executes any data collection method of the first aspect above.
  • FIG. 1 is a schematic diagram of a possible system architecture provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a process corresponding to a data collection method provided by an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a method for reporting collection events based on a preset cursor according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram of a method for reporting collection events based on a first-in first-out queue algorithm according to an embodiment of the present invention
  • FIG. 5 is a schematic diagram of the implementation process of a data collection method provided by an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a data collection device provided by an embodiment of the present invention.
  • bank transactions can include card selling transactions, deposit transactions, loan transactions, insurance transactions, wealth management transactions, and so on.
  • the daily transaction volume of banks can reach thousands or even tens of thousands.
  • banks can usually collect events of users performing various transactions, and then perform big data analysis on the collected events to obtain analysis models.
  • analysis models such as user churn models, user portraits, and so on. In this way, the bank can use the analysis model to predict the user's behavior, so that it can provide users with corresponding services based on the predicted user's behavior.
  • FIG. 1 is a schematic diagram of a possible system architecture provided by an embodiment of the present invention.
  • the system architecture may include a collection server 110 and at least one client device, such as a client device 121, a client device 122, and a client device 123.
  • each client device may be communicatively connected to the collection server 110; for example, the connection may be wired or wireless, which is not specifically limited.
  • At least one client device may be respectively set in at least one business system.
  • the client device 121 is a client device set in a loan business system, the client device 122 is a client device set in an insurance business system, and the client device 123 is a client device set in a wealth management business system.
  • the client device may refer to a server that provides an application (Application, APP), and the APP may refer to an APP with an embedded browser.
  • the client device can detect the operation triggered by the user on the APP browser, and can provide the user with the service corresponding to the operation.
  • Figure 1 is only an exemplary simple description.
  • the number of client devices listed is only for the convenience of illustrating the solution and does not constitute a limitation on the solution. In a specific implementation, the number of client devices can be greater than three, for example, four or more.
  • FIG. 2 is a schematic flowchart of a data collection method according to an embodiment of the present invention.
  • the method is suitable for client devices, such as the client device 121, the client device 122, and the client device 123 shown in FIG. 1.
  • the method includes:
  • step 201 after detecting that the user triggers a preset operation, the client device generates multiple corresponding collection events when performing tasks corresponding to the preset operation.
  • the preset operation may be an operation indicating that the buried point event is triggered.
  • the client device can execute the service corresponding to the preset operation, and the service execution process triggers the buried point event. In this way, the client device can generate the collection event corresponding to the buried point event.
  • taking a loan event as the buried point event, there can be multiple preset operations corresponding to the loan event, such as clicking the loan application icon on the loan interface, pressing a preset key on the keyboard (a single key or a combination of keys), entering preset text or voice input in a text box, detecting a preset brain wave, capturing a specified action, or scanning a preset barcode, which is not specifically limited.
  • the business execution process and the data collection process can be executed in separate processes, and the data collection process is determined by the business execution process and does not affect the business execution process. For example, after the loan event is triggered, the loan operation is performed in the business process, and the buried point event triggered by the loan operation is collected in the data collection process, such as collecting the event of a user applying for a loan.
  • the following takes the preset operation of clicking the loan application icon on the loan interface as an example to describe the specific implementation process of generating multiple collection events.
  • any operation of the user on the APP can be collected by the client device, and the client device can also respond to the user's operation.
  • the client device can sequentially query the user's related information from multiple third-party platforms to determine whether it can provide the user with a loan. In this case, the client device 121 can first query the user’s personal information from the personal information maintenance platform.
  • after the personal information query is passed, it can query the user’s account opening information from the account opening information maintenance platform; after the account opening information query is passed, it can query the user’s loan eligibility from the loan information maintenance platform. If the personal information, account opening information, and loan eligibility queries are all passed, the client device can provide the user with a loan. If any of the personal information, account opening information, and loan eligibility queries fails, the user’s loan application can be rejected.
  • the client device when performing the loan service corresponding to the "one-click loan", the client device separately performs three subtasks of querying the user's personal information, querying the user's account opening information, and querying the user's loan qualification.
  • the first buried point event indicates that the client device (or user) has executed the subtask of querying personal information in the personal information maintenance platform.
  • the second buried point event indicates that the client device (or user) has executed the subtask of querying account opening information in the account opening information maintenance platform.
  • the third buried point event indicates that the client device (or user) has executed the subtask of querying loan eligibility in the loan information maintenance platform.
  • the first, second, and third buried point events are triggered in sequence, so the client device can sequentially generate the first collection event corresponding to the first buried point event, the second collection event corresponding to the second buried point event, and the third collection event corresponding to the third buried point event.
  • the execution interval of the three subtasks of querying the user’s personal information, querying the user’s account opening information, and querying the user’s loan eligibility is very short, perhaps only a few milliseconds. Therefore, while performing the loan business process, the client device is effectively generating the first collection event, the second collection event, and the third collection event in parallel.
  • the above is only an exemplary simple description.
  • the number of collection events listed is only for the convenience of explaining the solution and does not constitute a limitation on the solution. In a specific implementation, the number of collection events can be 2, or more than 3 (such as 4 or more), which is not specifically limited.
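  • as an illustration of step 201, the following is a minimal sketch of generating one collection event per executed subtask, in order. The names CollectionEvent and run_loan_task, the subtask labels, and the choice to record an event even for a subtask whose query fails are assumptions made for this example; they are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class CollectionEvent:
    seq: int      # order in which the event was generated
    subtask: str  # subtask whose execution triggered the buried point event


def run_loan_task(
    checks: List[Tuple[str, Callable[[], bool]]],
) -> Tuple[bool, List[CollectionEvent]]:
    """Run the subtasks in order and emit one collection event per executed subtask."""
    events: List[CollectionEvent] = []
    for seq, (name, check) in enumerate(checks, start=1):
        passed = check()
        # Assumption: the event records that the subtask ran, whether or not it passed.
        events.append(CollectionEvent(seq, name))
        if not passed:
            return False, events  # any failed query rejects the loan application
    return True, events


# Hypothetical stand-ins for the three third-party platform queries.
approved, events = run_loan_task([
    ("query_personal_info", lambda: True),
    ("query_account_opening_info", lambda: True),
    ("query_loan_eligibility", lambda: True),
])
```

  • in this sketch the three events are produced within one call, which mirrors the observation that they are generated almost simultaneously and would otherwise be reported in parallel.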
  • step 202 the client device reports multiple collection events through multiple reporting processes, and in each reporting process, reports one or more collection events that are not reported among the multiple collection events to the collection server.
  • the client device may first store multiple collection events in the first preset location, and then, each time it reports, acquire one or more collection events that have not been reported from the first preset location and report them to the collection server 110. In this way, because the multiple collection events are reported through multiple reporting processes, there is no need to report them all to the collection server at one time. Compared with the prior-art data collection strategy of receive-and-send, this reduces the pressure on the collection server and can improve the efficiency of data collection.
  • the first preset location can be set by those skilled in the art based on experience, or can be set according to business needs; for example, it can be the storage of the client device (disk, external storage, external device, etc.), a database server, or cloud storage (such as cloud space, cloud disk, etc.).
  • the collection event may be stored in the first preset location in the form of a data table, a linked list, machine language, and so on, which is not specifically limited.
  • a queue can be used to store collection events.
  • the queue is a special linear table that can be composed of multiple elements, and each element can store an address pointer.
  • the queue can be set in the running memory (Random Access Memory, RAM) of the client device.
  • the client device can first store the multiple collection events in a preset queue in sequence. In this way, each time it reports, one or more unreported collection events can be obtained from the preset queue and reported to the collection server 110. By setting a preset queue, the generated collection events can be stored with less space and accurately kept in the time sequence in which they were generated, thereby improving the accuracy of data collection.
  • the client device may obtain an unreported collection event from the preset queue and report it to the collection server 110 every time it reports, so that the collection events are reported to the collection server 110 one by one.
  • the client device starts the next reporting process only after the collection event reported in the current reporting process has been successfully reported.
  • the pressure on the collection server can be reduced while the loss rate of collection events is also reduced, thereby increasing the success rate with which the collection server receives collection events.
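  • the one-by-one reporting described above can be sketched as follows. This is only an illustrative sketch: report_one_by_one and fake_send are hypothetical names, and the send callable stands in for whatever transport the client device actually uses.

```python
from typing import Callable, List


def report_one_by_one(events: List[str], send: Callable[[str], bool]) -> int:
    """Report events in generation order, one per reporting process.

    The next reporting process starts only after the previous event was
    confirmed as successfully reported. Returns the number of events reported.
    """
    reported = 0
    for event in events:
        if not send(event):
            break  # previous report not confirmed: do not start the next one
        reported += 1
    return reported


received: List[str] = []  # what the (simulated) collection server has seen


def fake_send(event: str) -> bool:
    # Hypothetical transport that always succeeds.
    received.append(event)
    return True


count = report_one_by_one(["event-1", "event-2", "event-3"], fake_send)
```

  • at most one collection event is in flight at any time, so the collection server never has to receive several events from the same client device simultaneously.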
  • the collection server 110 after receiving the collection event reported by the client device, the collection server 110 sends a response message to the client device.
  • the response message includes a first type response message and a second type response message.
  • the first type of response message is used to identify that the collection server 110 has successfully received the collection event reported by the client device
  • the second type of response message is used to identify that the collection server 110 did not receive the collection event reported by the client device within a set time period, or that the collection server 110 received a collection event in an illegal format.
  • in this way, during each report, if the client device receives the first type response message sent by the collection server 110, it determines that the collection event reported this time was reported successfully, so that the next reporting process can be started. If the client device receives the second type response message sent by the collection server 110, it determines that the collection event reported this time failed to be reported, so that the next reporting process is not started.
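  • the two response message types can be modelled as a small enumeration, as in the sketch below. The names Response, collection_server_reply, and may_start_next_report are hypothetical, and the format rule (an event must be a dict with a "name" key) is an assumption chosen only to make the example concrete.

```python
from enum import Enum


class Response(Enum):
    SUCCESS = 1  # first type: the collection event was received successfully
    FAILURE = 2  # second type: not received in time, or illegal format


def collection_server_reply(event: object, received_in_time: bool) -> Response:
    """Simulate how a collection server might classify an incoming report."""
    # Assumed format rule for illustration only.
    legal_format = isinstance(event, dict) and "name" in event
    if received_in_time and legal_format:
        return Response.SUCCESS
    return Response.FAILURE


def may_start_next_report(response: Response) -> bool:
    """The client starts the next reporting process only on a first type response."""
    return response is Response.SUCCESS
```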
  • the client device can repeatedly report the collection event reported this time. If the number of repeated reports exceeds the preset number of times, the current network is faulty. Therefore, the client device can cache all unreported collection events stored in the preset queue (including the collection event reported this time) to the second preset location. The preset number of times can be set by those skilled in the art based on experience, for example, 3 times, or 4 times or more; the second preset location may be the same as or different from the first preset location, which is not specifically limited.
  • the first preset location may be the RAM of the client device, and the second preset location may be the read-only memory (ROM) of the device.
  • the multiple collection events can be stored in the RAM of the client device first; then, each time it reports, the client device obtains one unreported collection event from the RAM and places this collection event in the browser request queue, which is used to store the request data to be processed by the browser.
  • each reporting process of the client device occupies only one position in the browser request queue. Compared with the prior-art approach of reporting multiple collection events at one time, which occupies multiple positions in the browser request queue, this reduces the working pressure of the browser and avoids affecting its normal working efficiency.
  • the client device can obtain all unreported collection events (including this collection event) from the RAM, and cache these unreported collection events to the ROM of the device through the browser. In this way, when it is determined that the network failure is restored, if a new collection event is generated, the collection events cached in the ROM of the device can be reported to the collection server 110 first, and then the new collection event is reported to the collection server 110.
  • repeated failures of the reporting process are generally caused by a network failure. Therefore, by caching the collection events that were not successfully reported and re-reporting them after the network recovers, the collection server can collect the most complete data, which improves the success rate of data collection.
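  • a minimal sketch of the retry-then-cache behaviour is given below. MAX_REPEATS uses the example value of 3 from the text; report_with_fallback, flaky_send, and the in-memory list used as the second preset location are hypothetical stand-ins for illustration.

```python
from typing import Callable, Dict, List

MAX_REPEATS = 3  # preset number of repeated reports (example value)


def report_with_fallback(
    pending: List[str], send: Callable[[str], bool], cache: List[str]
) -> List[str]:
    """Report events in order; on persistent failure, cache everything unreported.

    Returns the events that were reported successfully.
    """
    reported: List[str] = []
    for i, event in enumerate(pending):
        ok = send(event)
        repeats = 0
        while not ok and repeats < MAX_REPEATS:
            repeats += 1
            ok = send(event)  # repeat the report of the same event
        if not ok:
            # Treated as a network fault: cache the failing event and all later ones.
            cache.extend(pending[i:])
            break
        reported.append(event)
    return reported


network: Dict[str, bool] = {"up": False}
attempts: List[str] = []


def flaky_send(event: str) -> bool:
    # Hypothetical transport: only event-1 gets through while the network is down.
    attempts.append(event)
    return event == "event-1" or network["up"]


second_location: List[str] = []  # stands in for the second preset location
reported = report_with_fallback(["event-1", "event-2", "event-3"], flaky_send, second_location)
tries_for_event_2 = attempts.count("event-2")

network["up"] = True  # fault eliminated: cached events go first, then new ones
recovered = report_with_fallback(second_location + ["event-4"], flaky_send, [])
```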
  • the client device may preset a preset cursor, and the preset cursor is used to indicate the position of the collection event to be reported in the preset queue.
  • FIG. 3 is a schematic diagram of a method for reporting collection events based on a preset cursor according to an embodiment of the present invention.
  • a first queue for reporting collection events is already set in the client device. The first queue contains a head and a tail, and collection events can be inserted into the first queue from its tail.
  • the first queue stores the collection event 1, the collection event 2, and the collection event 3 in sequence.
  • the collection event 1 occupies the initial position 0 of the first queue, the collection event 2 occupies position 1 of the first queue, and the collection event 3 occupies position 2 of the first queue.
  • new collection events can be inserted into the first queue from its tail in sequence. In this way, collection event 4 can occupy position 3 of the first queue, collection event 5 can occupy position 4 of the first queue, and collection event 6 can occupy position 5 of the first queue.
  • the preset cursor can indicate the initial position of the first queue in the initial state, that is, position 0. Therefore, when the client device reports a collection event for the first time, it can first obtain the collection event 1 stored in position 0 indicated by the preset cursor, place the collection event 1 in the browser request queue, and report the collection event 1 to the collection server 110 via the browser. Further, if the client device receives the first type response message sent by the collection server 110, the collection event 1 was successfully reported to the collection server 110. Therefore, the client device can move the preset cursor backward by one position, in the direction from the head of the first queue to its tail. In this way, the preset cursor indicates position 1 in the first queue, so that the client device can report the collection event 2 stored in position 1 of the first queue to the collection server 110 in the next report.
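  • the cursor movement walked through above can be sketched as follows. CursorQueue and report_next are hypothetical names, and the cursor here follows the FIG. 3 description, in which it indicates the position of the collection event to be reported next.

```python
from typing import Callable, Iterable, List


class CursorQueue:
    """Queue whose events stay in place; only the cursor moves."""

    def __init__(self, events: Iterable[str]) -> None:
        self.events: List[str] = list(events)
        self.cursor = 0  # position of the collection event to report next

    def report_next(self, send: Callable[[str], bool]) -> bool:
        """Report the event under the cursor; advance the cursor only on success."""
        if self.cursor >= len(self.events):
            return False  # nothing left to report
        if send(self.events[self.cursor]):
            self.cursor += 1  # shift one position toward the tail
            return True
        return False  # cursor stays, so the same event is reported again


queue = CursorQueue(["event-1", "event-2", "event-3"])
first_ok = queue.report_next(lambda event: True)    # event-1 succeeds
second_ok = queue.report_next(lambda event: False)  # event-2 fails
```

  • after these two calls the cursor indicates position 1, so the next report will again pick up the event stored there, and no element of the queue has been moved.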
  • when reporting the collection event 2, if the client device receives the second type response message sent by the collection server 110, the collection event 2 has not been successfully reported to the collection server 110, so the client device can report the collection event 2 repeatedly. If the second type response message is still received after three repeated reports, the collection event 2 was not successfully reported in any of the three repeated reports, and the current network is faulty. Therefore, the client device can cache the collection events between position 1 indicated by the preset cursor in the first queue and position 2 at the end of the first queue (that is, collection event 2 and collection event 3) to the second preset location.
  • after the caching succeeds, the client device may delete the first queue, which prevents the first queue from occupying space meaninglessly and reduces the performance loss of the system.
  • the client device can create a second queue and then query the second preset location for collection events that were not reported before. If collection event 2 and collection event 3 exist in the second preset location, the client device can first store collection event 2 and collection event 3 in the second queue in sequence, and then store collection event 7 and collection event 8 in the second queue in sequence.
  • collection event 2 can occupy position 0 of the second queue
  • collection event 3 can occupy position 1 of the second queue
  • collection event 7 can occupy position 2 of the second queue
  • collection event 8 can occupy position 3 of the second queue.
  • if no collection event exists in the second preset location, collection event 7 and collection event 8 can be stored directly in the second queue in sequence; in this way, collection event 7 can occupy position 0 of the second queue, and collection event 8 can occupy position 1 of the second queue.
  • the cached collection events are first obtained from the preset location and placed at the initial position of the preset queue, and the new collection events are placed after the cached collection events. This ensures that the collection events are reported to the collection server in the time sequence in which they were generated, which improves the accuracy and success rate of data collection.
  • the preset cursor since the preset cursor is used to record the position of the collection event to be reported currently in the preset queue, the position of the collection event in the preset queue will not change. There is no need to frequently move collection events, reducing system performance loss, and the location of the next collection event to be reported can be easily obtained according to the preset cursor, thereby improving the flexibility of data collection.
  • the client device can use the first-in first-out queue algorithm to report the collected data. Specifically, each time the client device successfully reports a collection event, the collection event can be deleted from the preset queue, and Other collection events can be moved forward one by one, so as to keep the elements in the preset queue continuously updated.
  • Fig. 4 is a schematic diagram of a method for reporting collected data based on a first-in first-out queue algorithm according to an embodiment of the present invention.
  • the first queue sequentially stores collection event 1, collection event 2, and collection event 3.
  • the collection event 1 occupies the initial position 0 of the first queue
  • the collection event 2 occupies the position 1 of the first queue
  • the collection event 3 occupies the position 2 of the first queue.
  • the client device may first obtain collection event 1 stored at position 0 of the first queue, place it in the browser request queue, and report it to the collection server 110 via the browser. Further, if the client device receives the first-type response message sent by the collection server 110, collection event 1 has been successfully reported, so the client device can move collection event 1 out of the first queue from the head (or delete it directly) and move collection event 2 and collection event 3 forward one position each, so that collection event 2 occupies position 0 of the first queue and collection event 3 occupies position 1. In this way, the collection event reported each time is the one at the initial position of the first queue.
  • when reporting collection event 2, if the client device receives the second-type response message sent by the collection server 110, collection event 2 has not been successfully reported, so the client device can report it again. If the second-type response message is still received after three repeated reports, collection event 2 has failed in all three repeated reports.
  • the current network is then considered faulty, so the client device can cache the collection events in the first queue that have not been reported to the collection server 110 to the second preset location.
  • the collection events stored in the first queue are all collection events that have not been reported to the collection server 110; that is, on determining a network fault, the client device may cache all the collection events stored in the first queue to the second preset location.
  • the client device can delete the first queue, which prevents the queue from occupying space needlessly and reduces the performance loss of the system.
  • the storage space occupied by the preset queue can be reduced, so that when collection events increase there is no need to expand the storage space of the preset queue, which improves the availability of the preset queue and the utilization of storage space.
  • FIG. 5 is a schematic diagram of the implementation process of a data collection method provided by an embodiment of the present invention.
  • the client device can sequentially generate collection event 1 corresponding to buried-point event 1, collection event 2 corresponding to buried-point event 2, and collection event 3 corresponding to buried-point event 3.
  • the client device can insert collection event 1, collection event 2 and collection event 3 into queue a1 in sequence from the tail of queue a1
  • assuming the collection events already stored in queue a1 occupy position 0 to position i-1 of queue a1
  • collection event 1 can occupy position i of queue a1
  • collection event 2 can occupy position i+1 of queue a1
  • collection event 3 can occupy position i+2 of queue a1.
  • i can be an integer greater than or equal to zero.
  • when a collection event is stored in queue a1, the client device can assign the collection event a unique identifier.
  • the unique identifier of the collection event can be set by those skilled in the art based on experience. In one example, the unique identifier can be set as a combination of a timestamp and a random number, with a length of 16 digits.
  • the collection event may include, but is not limited to: APP ID, user ID, collection event ID, device information, user information, and service information.
  • the device information can refer to the type of processor, the number of CPU cores, the model of the device, etc.
  • the user information can refer to the user’s name, gender, mobile phone number, and habitual residence, etc.
  • the business information can refer to the type of the buried-point event, the serial number of the business, etc.; this is not specifically limited.
  • the client device can perform the reporting process at a preset period. For example, if the preset period is 2 minutes, the client device can take one unreported collection event (such as collection event k) from queue a1 every 2 minutes and report it to the collection server. Before reporting collection event k, the client device can first check whether a collection event is currently being reported in the reporting process. If not, it can report collection event k directly; if so, it can first wait until the collection event being reported (such as collection event h) has finished reporting, that is, until a response message sent by the collection server is received.
  • once collection event h has finished reporting, if the report succeeded, collection event k can be reported to the collection server; if the report failed, the number of times collection event h has already been reported can be checked. If that number is greater than or equal to the preset number, the current network is considered faulty, so all collection events in queue a1 from collection event h to the end of the queue can be cached. If it is less than the preset number, collection event h can be reported again, its success checked once the report completes, and the above steps repeated.
  • the client device can delete queue a1.
  • the client device can generate, in sequence, collection event 3 corresponding to buried-point event 3, collection event 4 corresponding to buried-point event 4, and collection event 5 corresponding to buried-point event 5. Further, since the client device no longer has a queue for reporting the collected data, it can create queue a2 and store collection event 3, collection event 4 and collection event 5 in queue a2.
  • before storing collection event 3, collection event 4 and collection event 5 in queue a2, the client device can first check whether any cached collection events exist. If so, it can first insert the cached collection events into queue a2 from the tail, then insert collection event 3, collection event 4 and collection event 5 from the tail, and finally run the reporting process on queue a2. In this way, the client device first reports the collection events that could not be reported last time and then the newly generated ones, which increases the success rate of collecting events.
  • the client device can compress the collection event before buffering the collection event, and by caching the compressed collection event, the space occupied by the collection event can be reduced.
  • if there are multiple collection events to cache, the client device can compress them all at once to obtain a single compressed file, or compress each collection event separately to obtain multiple compressed files, or compress each run of a preset number of consecutive collection events to obtain multiple compressed files. The number of collection events compressed together can be set according to business needs, which is not limited in the embodiment of the present invention.
  • after detecting that the user triggers a preset operation, multiple corresponding collection events are generated while the task corresponding to the preset operation is executed; each collection event records the event of executing a subtask of the task. Further, the multiple collection events are reported through multiple reporting processes, and in each reporting process one or more of the collection events that have not yet been reported are reported to the collection server. Because the collection events are reported through multiple reporting processes, they need not all be reported to the collection server at once. Compared with the send-on-receipt data collection strategy of the prior art, this can reduce the pressure on the collection server and improve the efficiency of data collection.
  • an embodiment of the present invention also provides a data collection device, and the specific content of the device can be implemented with reference to the foregoing method.
  • Fig. 6 is a schematic structural diagram of a data collection device provided by an embodiment of the present invention, including:
  • the processing module 601 is configured to generate multiple collection events when the task corresponding to the preset operation is executed after detecting that the user triggers a preset operation; each collection event is used to record the execution of each subtask in the task event;
  • the transceiver module 602 is configured to report the multiple collection events through multiple reporting processes; in each reporting process, report one or more collection events that are not reported among the multiple collection events to the collection server.
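  • the transceiver module 602 is specifically configured to: store the multiple collection events in a preset queue in sequence; and, in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, report one unreported collection event in the preset queue to the collection server.
  • the transceiver module 602 is specifically configured to: obtain the collection event stored at the position immediately after the position indicated by the preset cursor in the preset queue, and report it to the collection server;
  • the preset cursor is used to indicate the position in the preset queue of the collection event reported in each reporting process;
  • the preset cursor is updated using the position adjacent to the position indicated by the preset cursor.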
  • the device further includes a cache module 603;
  • the transceiver module 602 is further configured to: after determining that the collection event reported in the previous reporting process has not been successfully reported to the collection server, repeat reporting the collection event reported in the previous reporting process;
  • when the number of repeated reports by the transceiver module 602 exceeds the preset number, the cache module 603 is configured to: cache the collection events between the position indicated by the preset cursor in the preset queue and the position at the end of the preset queue to the preset location;
  • the transceiver module 602 is further configured to report the collection event buffered at the preset location to the collection server after the fault is eliminated.
  • processing module 601 is further configured to:
  • if the position indicated by the preset cursor is the position at the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the position at the end of the preset queue have been cached to the preset location, delete the preset queue.
  • before the multiple collection events are stored in the preset queue in sequence, the cache module 603 is further configured to: query whether collection events are cached at the preset location; if so, store the cached collection events starting at the initial position of the preset queue and then store the multiple collection events in sequence in the positions after them; if not, store the multiple collection events in sequence starting from the initial position of the preset queue.
  • the transceiver module 602 is specifically configured to: obtain the collection event stored at the starting position of the preset queue and report it to the collection server; and, if it is determined that the collection event was successfully reported, delete the collection event stored at the starting position and move the collection event stored at each position of the preset queue forward by one position.
  • as can be seen from the above, after detecting that a user triggers a preset operation, multiple corresponding collection events are generated while the task corresponding to the preset operation is executed; each collection event records the event of executing a subtask of the task. Further, the multiple collection events are reported through multiple reporting processes, and in each reporting process one or more of the collection events that have not yet been reported are reported to the collection server.
  • because the collection events are reported through multiple reporting processes, they need not all be reported to the collection server at once. Compared with the send-on-receipt data collection strategy of the prior art, this can reduce the pressure on the collection server and improve the efficiency of data collection.
  • embodiments of the present invention also provide a computing device, including at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program, and when the program is executed by the processing unit, the processing unit is caused to execute the data collection method described in any of the embodiments of FIG. 2 above.
  • embodiments of the present invention also provide a computer-readable storage medium that stores a computer program executable by a computing device; when the program runs on the computing device, the computing device is caused to execute the data collection method described in any of the embodiments of FIG. 2 above.
  • the embodiments of the present invention can be provided as methods or computer program products. Therefore, the present invention may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
  • These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.

Abstract

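A data collection method and apparatus. The method includes: after detecting that a user triggers a preset operation, generating multiple corresponding collection events while executing the task corresponding to the preset operation, and reporting the multiple collection events through multiple reporting processes, where in each reporting process one or more of the collection events that have not yet been reported are reported to a collection server. Because the collection events are reported through multiple reporting processes, they need not all be reported to the collection server at once. Compared with the send-on-receipt data collection strategy of the prior art, this can reduce the pressure on the collection server and improve the efficiency of data collection.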

Description

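A data collection method and apparatus
Cross-reference to related applications
This application claims priority to the Chinese patent application No. 201911032119.6, entitled "A data collection method and apparatus" and filed with the China Patent Office on October 28, 2019, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of financial technology (Fintech), and in particular to a data collection method and apparatus.
Background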
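With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). However, because the financial industry demands security and real-time performance, Fintech also places higher requirements on technology. Data collection is a data processing method commonly used in the financial industry: by collecting user behavior data within a preset period, a business prediction model can be built from that data and used to analyze business performance over a future period, which helps establish better decision-making and response mechanisms. For example, when analyzing the sales of a product, user behavior data for the product over a certain period can be collected and a user churn model built for it. The churn model can then be used to analyze the product's user churn over a future period, so that the marketing department can adjust the product's marketing strategy and reduce churn.
The prior art generally performs data collection with a send-on-receipt strategy. Specifically, as soon as the business system detects that the business has triggered a buried-point (event-tracking) event, it generates the collection event corresponding to the buried-point event and reports that collection event to the collection server at the same time. However, this approach causes problems in business systems that support parallel processing. For example, when multiple buried-point events are triggered at the same moment, the business system reports the corresponding collection events to the collection server in parallel. The collection server then has to receive multiple collection events simultaneously, so it comes under considerable pressure and its normal operation may be affected.
In summary, a data collection method is urgently needed to reduce the pressure on the collection server.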
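Summary of the invention
The present invention provides a data collection method and apparatus for reducing the pressure on the collection server.
In a first aspect, the present invention provides a data collection method. The method includes: after detecting that a user triggers a preset operation, generating multiple corresponding collection events while executing the task corresponding to the preset operation, and reporting the multiple collection events through multiple reporting processes, where in each reporting process one or more of the collection events that have not yet been reported are reported to a collection server. Each of the multiple collection events records the event of executing a subtask of the task.
In the above design, because the collection events are reported through multiple reporting processes, they need not all be reported to the collection server at once. Compared with the send-on-receipt data collection strategy of the prior art, this design can reduce the pressure on the collection server and improve the efficiency of data collection.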
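In one possible design, reporting the multiple collection events through multiple reporting processes includes: storing the multiple collection events in a preset queue in sequence and, in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, reporting one unreported collection event in the preset queue to the collection server. In this design, the preset queue stores the generated collection events in little space and preserves them accurately in the order in which they were generated, which can improve the accuracy of data collection. Moreover, the next reporting process starts only after the single collection event reported in the current one has been reported successfully, which reduces the pressure on the collection server while also lowering the loss rate of collection events and raising the success rate with which the collection server receives them.
In one possible design, reporting one unreported collection event in the preset queue to the collection server includes: obtaining the collection event stored at the position immediately after the position indicated by a preset cursor in the preset queue, reporting it to the collection server, and then updating the preset cursor with the position adjacent to the position it indicates. The preset cursor indicates the position in the preset queue of the collection event reported in each reporting process. In this design, because the preset cursor records the position of the collection event reported each time, the position of the next collection event to be reported can be obtained easily from the cursor, which can improve the flexibility and accuracy of data collection.
In one possible design, the method further includes: after determining that the collection event reported in the previous reporting process was not successfully reported to the collection server, repeatedly reporting that collection event; when the number of repeated reports exceeds a preset number, caching the collection events between the position indicated by the preset cursor and the end of the preset queue to a preset location; and, after the fault is resolved, reporting the collection events cached at the preset location to the collection server. In this design, since repeated reporting failures are generally caused by a network fault, caching the collection events that could not be reported and reporting them again once the network recovers allows the collection server to gather the most complete collection data, which can improve the success rate of data collection.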
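In one possible design, if the position indicated by the preset cursor is the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the end of the preset queue have been cached to the preset location, the preset queue is deleted. In this design, deleting the preset queue after caching succeeds, or after all collection events in it have been reported, avoids occupying space needlessly and can reduce the performance loss of the system.
In one possible design, before the multiple collection events are stored in the preset queue in sequence, the preset location can be queried for cached collection events. If there are any, the collection events cached at the preset location are stored starting at the initial position of the preset queue, and the multiple collection events are then stored in sequence in the positions after them; if not, the multiple collection events are stored in sequence starting from the initial position of the preset queue. In this design, after new collection events are generated, the cached collection events are first obtained from the preset location and placed at the initial position of the preset queue, and the new collection events are placed after them. This ensures that collection events are reported to the collection server in the order in which they were generated, which can improve the accuracy and success rate of data collection.
In one possible design, reporting one unreported collection event in the preset queue to the collection server includes: obtaining the collection event stored at the starting position of the preset queue and reporting it to the collection server; further, if it is determined that the collection event reported in this reporting process was successfully reported to the collection server, deleting the collection event stored at the starting position and moving the collection event stored at each position of the preset queue forward by one position, so that the collection event stored at the position after the starting position moves to the starting position. In this design, deleting reported collection events from the preset queue reduces the storage space the queue occupies, so that the storage space need not be expanded when collection events increase, which improves the availability of the preset queue and the utilization of storage space.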
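In a second aspect, the present invention provides a data collection apparatus. The apparatus includes:
a processing module, configured to generate, after detecting that a user triggers a preset operation, multiple corresponding collection events while executing the task corresponding to the preset operation, where each collection event records the event of executing a subtask of the task;
a transceiver module, configured to report the multiple collection events through multiple reporting processes, where in each reporting process one or more of the collection events that have not yet been reported are reported to a collection server.
In one possible design, the transceiver module is specifically configured to: store the multiple collection events in a preset queue in sequence and, in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, report one unreported collection event in the preset queue to the collection server.
In one possible design, the transceiver module is specifically configured to: obtain the collection event stored at the position immediately after the position indicated by a preset cursor in the preset queue, report it to the collection server, and then update the preset cursor with the position adjacent to the position it indicates. The preset cursor indicates the position in the preset queue of the collection event reported in each reporting process.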
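In one possible design, the apparatus further includes a cache module. In this case, the transceiver module is further configured to: after determining that the collection event reported in the previous reporting process was not successfully reported to the collection server, repeatedly report that collection event. When the number of repeated reports by the transceiver module exceeds a preset number, the cache module is configured to: cache the collection events between the position indicated by the preset cursor and the end of the preset queue to a preset location. Correspondingly, the transceiver module is further configured to: after the fault is resolved, report the collection events cached at the preset location to the collection server.
In one possible design, the processing module is further configured to: delete the preset queue if the position indicated by the preset cursor is the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the end of the preset queue have been cached to the preset location.
In one possible design, before the multiple collection events are stored in the preset queue in sequence, the cache module can also query whether collection events are cached at the preset location. If so, the collection events cached at the preset location are stored starting at the initial position of the preset queue, and the multiple collection events are then stored in sequence in the positions after them; if not, the multiple collection events are stored in sequence starting from the initial position of the preset queue.
In one possible design, the transceiver module is specifically configured to: obtain the collection event stored at the starting position of the preset queue and report it to the collection server; and, if it is determined that the collection event reported in this reporting process was successfully reported to the collection server, delete the collection event stored at the starting position and move the collection event stored at each position of the preset queue forward by one position, so that the collection event stored at the position after the starting position moves to the starting position.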
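In a third aspect, the present invention provides a computing device including at least one processing unit and at least one storage unit, where the storage unit stores a computer program that, when executed by the processing unit, causes the processing unit to perform any of the data collection methods of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program executable by a computing device; when the program runs on the computing device, it causes the computing device to perform any of the data collection methods of the first aspect.
These and other aspects of the present invention will be easier to understand from the description of the following embodiments.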
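Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a possible system architecture provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a data collection method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of a method for reporting collection events based on a preset cursor provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of a method for reporting collection events based on a first-in first-out queue algorithm provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the implementation flow of a data collection method provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a data collection apparatus provided by an embodiment of the present invention.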
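Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The Fintech field usually involves many kinds of transactions. A bank's transactions, for example, can include card sales, deposits, loans, insurance and wealth management, and a bank's daily transaction volume can reach thousands or even tens of thousands. To keep these transactions running smoothly, a bank can collect the events of users performing transactions and run big-data analysis on the collected events to obtain analysis models. There can be many kinds of analysis model, such as user churn models and user profiles. The bank can then use an analysis model to predict user behavior and provide users with corresponding services based on the prediction.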
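Fig. 1 is a schematic diagram of a possible system architecture provided by an embodiment of the present invention. As shown in Fig. 1, the system architecture may include a collection server 110 and at least one client device, such as client device 121, client device 122 and client device 123. Each client device can be communicatively connected to the collection server 110, for example by a wired connection or by a wireless connection; this is not specifically limited.
In the embodiments of the present invention, the client devices can each be deployed in a business system. For example, client device 121 is deployed in a loan business system, client device 122 in an insurance business system, and client device 123 in a wealth management business system. In one example, a client device may be a server that provides an application (APP), and the APP may be one with an embedded browser. If the APP provided by the client device is installed on a user terminal, the client device can detect the operations the user triggers in the APP's browser and provide the user with the corresponding services.
It should be noted that Fig. 1 is only a simple illustration. The number of client devices it shows is chosen for ease of explanation and does not limit the solution; in a specific implementation the number of client devices may be far greater than three, for example four or more.
Based on the system architecture shown in Fig. 1, Fig. 2 is a schematic flowchart of a data collection method provided by an embodiment of the present invention. The method applies to a client device, such as client device 121, client device 122 or client device 123 shown in Fig. 1. The method includes: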
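Step 201: after detecting that a user triggers a preset operation, the client device generates multiple corresponding collection events while executing the task corresponding to the preset operation.
In one possible scenario, the preset operation may be an operation indicating that a buried-point event is triggered. When the user triggers the preset operation, the client device can execute the business corresponding to it, and the business execution triggers buried-point events; the client device can then generate the collection events corresponding to those buried-point events. Taking a loan event as the buried-point event, the corresponding preset operation can take many forms: clicking the loan application icon on the loan interface, pressing a preset key on the keyboard (a single key or a key combination), typing preset text into a text box or entering it by voice, detecting a preset brain-wave pattern, capturing a specified gesture on camera or scanning a preset barcode, and so on; this is not specifically limited.
It should be noted that, in the embodiments of the present invention, business execution and data collection can run in separate processes; data collection is determined by business execution and does not affect it. For example, after a loan event is triggered, the loan operation is executed in the business process, while the buried-point events triggered by the loan operation, such as the event of the user applying for a loan, are collected in the data collection process.
The following takes clicking the loan application icon on the loan interface as the preset operation and describes how multiple collection events are generated.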
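In a specific implementation, if the client device provides the user with an APP, any operation the user performs in the APP can be collected by the client device, and the client device can also respond to the operation. For example, if the Weilidai (微粒贷) APP is provided by the client device, then after the user opens the Weilidai product page in the APP on the user terminal and triggers the "one-click borrowing" button on that page, the client device can query the user's information from several third-party platforms in turn to decide whether to lend to the user. In this case, client device 121 can first query the user's personal information from a personal information maintenance platform; once that check passes, query the user's account-opening information from an account-opening information maintenance platform; and once that passes, query the user's loan eligibility from a loan information maintenance platform. If the personal information, account-opening information and loan eligibility checks all pass, the client device can lend to the user; if any one of them fails, the user's loan application can be rejected.
In the above process, while executing the loan business corresponding to "one-click borrowing", the client device performs three subtasks: querying the user's personal information, querying the user's account-opening information, and querying the user's loan eligibility. Suppose the client device has set a first, a second and a third buried-point event in the Weilidai APP in advance, where the first indicates that the client device (or the user) executed the subtask of querying personal information on the personal information maintenance platform, the second indicates that it executed the subtask of querying account-opening information on the account-opening information maintenance platform, and the third indicates that it executed the subtask of querying loan eligibility on the loan information maintenance platform. Then, after the user triggers the "one-click borrowing" button, the three buried-point events are triggered in turn, so the client device can generate in turn the first collection event corresponding to the first buried-point event, the second collection event corresponding to the second buried-point event, and the third collection event corresponding to the third buried-point event.
In the embodiments of the present invention, the interval between the three subtasks is very short, possibly only a few milliseconds, so while executing the loan business the client device in effect generates the first, second and third collection events in parallel.
It should be noted that the above is only a simple illustration. The number of collection events it lists is chosen for ease of explanation and does not limit the solution; in a specific implementation there may be two collection events, or more than three, for example four or more; this is not specifically limited.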
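Step 202: the client device reports the multiple collection events through multiple reporting processes; in each reporting process, one or more of the collection events that have not yet been reported are reported to the collection server.
In a specific implementation, the client device can first store the multiple collection events at a first preset location, and then each time fetch one or more unreported collection events from the first preset location and report them to the collection server 110. Because the multiple collection events are reported through multiple reporting processes, they need not all be reported to the collection server at once. Compared with the send-on-receipt data collection strategy of the prior art, this can reduce the pressure on the collection server and improve the efficiency of data collection.
In the embodiments of the present invention, the first preset location can be set by a person skilled in the art based on experience, or according to business needs. For example, it may be the client device's storage (disk, external memory, plug-in device, etc.), a database server, or cloud storage (such as cloud space or a cloud disk). The collection events may be stored at the first preset location as a data table, as a linked list, in machine-language form, and so on; this is not specifically limited.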
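In one possible implementation, a queue can be used to store the collection events. A queue is a special kind of linear list made up of multiple elements, each of which can store an address pointer, and the queue can be placed in the client device's random access memory (RAM). In a specific implementation, after generating multiple collection events, the client device can first store them in a preset queue in sequence, so that at each report it can fetch one or more unreported collection events from the preset queue and report them to the collection server 110. With this approach, the preset queue stores the generated collection events in little space and preserves them accurately in the order in which they were generated, which can improve the accuracy of data collection.
Optionally, at each report the client device can fetch a single unreported collection event from the preset queue and report it to the collection server 110, so that collection events are reported one by one. Correspondingly, before each report the client device needs to confirm that the previously reported collection event was reported successfully; that is, it starts the next reporting process only after the previous collection event has been successfully reported to the collection server 110. Starting the next reporting process only after the single collection event of the current one has been reported successfully reduces the pressure on the collection server while lowering the loss rate of collection events, which can raise the success rate with which the collection server receives them.
In the embodiments of the present invention, after receiving a collection event reported by the client device, the collection server 110 sends the client device a response message, which is either a first-type response message or a second-type response message. The first type indicates that the collection server 110 successfully received the reported collection event; the second type indicates that the collection server 110 did not receive the reported collection event within a set duration, or received a collection event in an illegal format. Thus, at each report, if the client device receives a first-type response message from the collection server 110, it determines that the report succeeded and can start the next reporting process. If it receives a second-type response message, it determines that the report failed and does not start the next reporting process.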
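The acknowledgement rule above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the patent: the names `Response`, `report_sequentially` and `fake_send` are hypothetical, and the network transport is replaced by a stub.

```python
from enum import Enum
from typing import Callable, List


class Response(Enum):
    """The two response types described in the text (names are illustrative)."""
    RECEIVED = 1  # first type: the server received the event
    REJECTED = 2  # second type: timeout or illegally formatted event


def report_sequentially(events: List[str],
                        send: Callable[[str], Response]) -> int:
    """Report events one at a time and stop at the first failed report.

    Returns the number of events that were reported successfully.
    """
    reported = 0
    for event in events:
        if send(event) is not Response.RECEIVED:
            break  # the next reporting process is not started
        reported += 1
    return reported


def fake_send(event: str) -> Response:
    """Hypothetical transport: the server rejects only "event-2"."""
    return Response.REJECTED if event == "event-2" else Response.RECEIVED
```

With this stub, reporting `["event-1", "event-2", "event-3"]` stops after the first event, which mirrors the rule that a failed report blocks the reports behind it.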
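In one example, after determining that a report failed, the client device can report the same collection event again. If the number of repeated reports exceeds a preset number, the current network is considered faulty, so the client device can cache all unreported collection events stored in the preset queue (including the one just attempted) to a second preset location. The preset number can be set by a person skilled in the art based on experience, for example three, or four or more. The second preset location may be the same as the first preset location or different; this is not specifically limited.
In the embodiments of the present invention, if the preset operation is an operation in an APP with an embedded browser, the first preset location may be the client device's RAM and the second preset location may be the device's storage (read-only memory, ROM). In a specific implementation, after generating multiple collection events, the client device can first store them in its RAM, and then at each report fetch one unreported collection event from RAM and place it in the browser request queue, which stores the request data the browser has yet to process. With this approach, each reporting process occupies only one position in the browser request queue. Compared with the prior art, in which reporting multiple collection events at once occupies multiple positions in the browser request queue, this reduces the browser's workload and does not affect its normal working efficiency.
Further, when repeated reports of a collection event all fail, the client device can fetch all unreported collection events (including that one) from RAM and cache them to the device's ROM through the browser. Then, once the network fault is determined to have recovered, if new collection events are generated, the collection events cached in ROM can first be reported to the collection server 110, followed by the new collection events. In the embodiments of the present invention, since repeated reporting failures are generally caused by a network fault, caching the collection events that could not be reported and reporting them again once the network recovers allows the collection server to gather the most complete collection data, which can improve the success rate of data collection.
In the embodiments of the present invention, unreported collection events can be fetched from the preset queue in several ways. Two possible implementations are described below; it is understood that other ways are also possible, and the embodiments of the present invention impose no limitation: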
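Implementation one
In implementation one, the client device can set a preset cursor in advance; the preset cursor indicates the position in the preset queue of the collection event to be reported.
Fig. 3 is a schematic diagram of a method for reporting collection events based on a preset cursor provided by an embodiment of the present invention. As shown in Fig. 3, at the current moment the client device already has a first queue for reporting collection events. The first queue has a head and a tail, and collection events are inserted from the tail in sequence. The first queue stores collection event 1, collection event 2 and collection event 3 in sequence: collection event 1 occupies initial position 0, collection event 2 occupies position 1, and collection event 3 occupies position 2. While the collection events in the first queue are being reported to the collection server 110, if the client device generates new collection events 4, 5 and 6 in response to a newly triggered preset operation, these can be inserted from the tail of the first queue, so that collection event 4 occupies position 3, collection event 5 occupies position 4 and collection event 6 occupies position 5.
In a specific implementation, the preset cursor can initially indicate the initial position of the first queue, that is, position 0. When reporting for the first time, the client device can therefore fetch collection event 1 stored at position 0, place it in the browser request queue, and report it to the collection server 110 via the browser. Further, if the client device receives the first-type response message sent by the collection server 110, collection event 1 has been successfully reported, so the client device can move the preset cursor back one position, from the head of the first queue toward the tail. The preset cursor then indicates position 1 of the first queue, so that at the next report the client device reports collection event 2, stored at position 1, to the collection server 110.
In one possible situation, when reporting collection event 2, if the client device receives the second-type response message sent by the collection server 110, collection event 2 has not been successfully reported, so the client device can report it again. If the second-type response message is still received after three repeated reports, collection event 2 has failed in all three and the current network is considered faulty. The client device can therefore cache the collection events between position 1, indicated by the preset cursor in the first queue, and position 2, the end of the first queue (that is, collection event 2 and collection event 3), to the second preset location.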
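In the embodiments of the present invention, after the unreported collection events in the first queue have been cached to the second preset location, and/or when the position indicated by the preset cursor is the end of the first queue, the client device can delete the first queue, which prevents the queue from occupying space needlessly and reduces the performance loss of the system.
Further, after the first queue has been deleted, if the client device detects that the user has triggered a new preset operation and generates new collection events 7 and 8, it can create a second queue. Before storing collection event 7 and collection event 8 in the second queue, it can first query whether collection events that were not reported earlier are cached at the second preset location. If collection event 2 and collection event 3 are cached there, they can first be stored in the second queue in sequence, followed by collection event 7 and collection event 8. Collection event 2 then occupies position 0 of the second queue, collection event 3 position 1, collection event 7 position 2 and collection event 8 position 3. Correspondingly, if no collection events are cached at the second preset location, collection event 7 and collection event 8 can be stored directly in the second queue in sequence, so that collection event 7 occupies position 0 and collection event 8 occupies position 1.
In the embodiments of the present invention, after new collection events are generated, the cached collection events are first obtained from the preset location and placed at the initial position of the preset queue, and the new collection events are placed after them. This ensures that collection events are reported to the collection server in the order in which they were generated, which can improve the accuracy and success rate of data collection.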
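The ordering rule can be illustrated with a short Python sketch. The function name `build_queue` and the use of plain lists for both the cache and the queue are assumptions made for illustration only.

```python
from typing import List


def build_queue(cached: List[str], new_events: List[str]) -> List[str]:
    """Place previously cached events ahead of newly generated ones."""
    queue = list(cached)      # cached events take the initial positions
    queue.extend(new_events)  # new events follow in generation order
    return queue


# Worked example from the text: events 2 and 3 were cached, 7 and 8 are new.
second_queue = build_queue(["event-2", "event-3"], ["event-7", "event-8"])
```

The index of each element in `second_queue` corresponds to the queue position in the example: event 2 at position 0, event 3 at position 1, event 7 at position 2 and event 8 at position 3.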
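In implementation one, since the preset cursor records the position in the preset queue of the collection event currently to be reported, the positions of the collection events in the preset queue do not change. Collection events therefore need not be moved frequently, which reduces system performance loss, and the position of the next collection event to be reported can be obtained easily from the preset cursor, which can improve the flexibility of data collection.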
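A minimal Python sketch of the cursor-based approach follows. It is an illustration under assumptions rather than the patented implementation: the class name `CursorQueue`, the boolean `send` callback and the reading of "three repeated reports" as one initial attempt plus up to three repeats are all hypothetical choices.

```python
from typing import Callable, List


class CursorQueue:
    """Preset queue in which events never move; a cursor marks the next one."""

    def __init__(self, events: List[str]) -> None:
        self.events = list(events)
        self.cursor = 0  # position of the event to report next

    def append(self, event: str) -> None:
        self.events.append(event)  # new events are inserted at the tail

    def has_pending(self) -> bool:
        return self.cursor < len(self.events)

    def report_next(self, send: Callable[[str], bool],
                    max_repeats: int = 3) -> bool:
        """Report the event at the cursor: one attempt plus up to max_repeats repeats.

        On success the cursor moves one position toward the tail.
        On failure the cursor stays put and the caller can cache unreported().
        """
        if not self.has_pending():
            raise IndexError("no unreported event in the queue")
        event = self.events[self.cursor]
        for _ in range(1 + max_repeats):
            if send(event):
                self.cursor += 1
                return True
        return False

    def unreported(self) -> List[str]:
        """Events from the cursor position to the tail of the queue."""
        return self.events[self.cursor:]
```

In the scenario of Fig. 3, a successful report of event 1 moves the cursor to position 1; if event 2 then fails every attempt, `unreported()` returns events 2 and 3, which are the ones to cache.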
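Implementation two
In implementation two, the client device can use a first-in first-out queue algorithm to report the collected data. Specifically, each time the client device successfully reports a collection event, it can delete that event from the preset queue and move the other collection events forward one by one, so that the elements in the preset queue are continuously updated.
Fig. 4 is a schematic diagram of a method for reporting collected data based on a first-in first-out queue algorithm provided by an embodiment of the present invention. As shown in Fig. 4, the first queue stores collection event 1, collection event 2 and collection event 3 in sequence: collection event 1 occupies initial position 0, collection event 2 occupies position 1, and collection event 3 occupies position 2.
In a specific implementation, the client device can first fetch collection event 1 stored at position 0 of the first queue, place it in the browser request queue, and report it to the collection server 110 via the browser. Further, if the client device receives the first-type response message sent by the collection server 110, collection event 1 has been successfully reported, so the client device can move collection event 1 out of the first queue from the head (or delete it directly) and move collection event 2 and collection event 3 forward one position each, so that collection event 2 occupies position 0 of the first queue and collection event 3 occupies position 1. In this way, the collection event reported each time is the one at the initial position of the first queue.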
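Further, when reporting collection event 2, if the client device receives the second-type response message sent by the collection server 110, collection event 2 has not been successfully reported, so the client device can report it again. If the second-type response message is still received after three repeated reports, collection event 2 has failed in all three and the current network is considered faulty, so the client device can cache the collection events in the first queue that have not been reported to the collection server 110 to the second preset location. Specifically, because a collection event is deleted from the first queue as soon as it has been successfully reported, every collection event stored in the first queue is one that has not been reported; that is, on determining a network fault, the client device can cache all the collection events stored in the first queue to the second preset location.
Correspondingly, after a collection event cannot be reported and all the collection events in the first queue have therefore been cached to the second preset location, and/or after all the collection events in the first queue have been reported to the collection server 110 and the first queue is therefore empty, the client device can delete the first queue, which prevents the queue from occupying space needlessly and reduces the performance loss of the system.
In implementation two, deleting reported collection events from the preset queue reduces the storage space the queue occupies, so that the storage space need not be expanded when collection events increase, which improves the availability of the preset queue and the utilization of storage space.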
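The first-in first-out variant can be sketched in Python as follows. The function name `drain_fifo` and the boolean `send` callback are hypothetical. Note one deliberate substitution: the sketch uses `collections.deque`, whose `popleft` removes the head without explicitly shifting the remaining elements, whereas the text describes moving each event forward by one position.

```python
from collections import deque
from typing import Callable, Deque, List


def drain_fifo(queue: Deque[str],
               send: Callable[[str], bool],
               max_repeats: int = 3) -> List[str]:
    """Report from the head of the queue; return the events left to cache.

    Each event gets one attempt plus up to max_repeats repeats (an assumed
    reading of the preset number of repeated reports).
    """
    while queue:
        head = queue[0]
        # any() stops at the first successful attempt.
        delivered = any(send(head) for _ in range(1 + max_repeats))
        if not delivered:
            return list(queue)  # everything still queued is unreported
        queue.popleft()         # drop the reported event; the next becomes the head
    return []
```

Because reported events are removed immediately, the whole remaining queue is what gets cached on failure, with no cursor needed.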
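For ease of understanding, the specific implementation of the data collection method in the embodiments of the present invention is described below from another angle.
Fig. 5 is a schematic diagram of the implementation flow of a data collection method provided by an embodiment of the present invention. As shown in Fig. 5, after the user triggers a first preset operation, if buried-point event 1, buried-point event 2 and buried-point event 3 are triggered while the business corresponding to the first preset operation is executed, the client device can generate in turn collection event 1 corresponding to buried-point event 1, collection event 2 corresponding to buried-point event 2, and collection event 3 corresponding to buried-point event 3. Further, if the client device already has a queue a1 for reporting collected data, it can insert collection event 1, collection event 2 and collection event 3 into queue a1 in sequence from the tail. Assuming the collection events already stored in queue a1 occupy position 0 to position i-1, collection event 1 can occupy position i, collection event 2 position i+1 and collection event 3 position i+2 of queue a1, where i can be an integer greater than or equal to 0.
In the embodiments of the present invention, when a collection event is stored in queue a1, the client device can assign it a unique identifier. The unique identifier can be set by a person skilled in the art based on experience; in one example, it can be a combination of a timestamp and a random number with a length of 16 digits.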
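One way such an identifier could be built is sketched below in Python. The split into a 13-digit millisecond timestamp and a 3-digit random suffix is an assumption, as is reading the stated length of 16 as decimal digits; the text fixes neither, and `make_event_id` is a hypothetical name.

```python
import random
import time


def make_event_id() -> str:
    """Return a 16-digit identifier: millisecond timestamp plus random suffix."""
    millis = int(time.time() * 1000)  # 13 digits for present-day dates
    suffix = random.randint(0, 999)   # 3-digit random number
    return f"{millis:013d}{suffix:03d}"
```

A three-digit suffix leaves a small chance of collision between events generated in the same millisecond, so a real deployment might widen the random part or add a per-device counter.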
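It should be noted that, in the embodiments of the present invention, a collection event may include, but is not limited to: the APP identifier, the user identifier, the collection event identifier, device information, user information and business information. Device information can refer to the processor type, the number of CPU cores, the device model and so on; user information can refer to the user's name, gender, mobile phone number, habitual residence and so on; business information can refer to the type of the buried-point event, the serial number of the business and so on; this is not specifically limited.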
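The field list can be pictured as a small record type. The Python sketch below is illustrative only: the class name `CollectionEvent`, the field names and the JSON serialization are assumptions, since the text does not define a concrete schema or wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class CollectionEvent:
    app_id: str
    user_id: str
    event_id: str
    device_info: Dict[str, str] = field(default_factory=dict)    # e.g. processor type, model
    user_info: Dict[str, str] = field(default_factory=dict)      # e.g. name, phone number
    business_info: Dict[str, str] = field(default_factory=dict)  # e.g. event type, serial number

    def to_json(self) -> str:
        """Serialize the event for reporting (JSON is an assumed format)."""
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)
```

For example, `CollectionEvent("loan-app", "u-1", "1700000000000123", business_info={"type": "loan-eligibility-query"}).to_json()` yields one JSON object per event, which is convenient to queue, cache and report as a unit.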
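In the embodiments of the present invention, the client device can perform the reporting process at a preset period. For example, if the preset period is 2 minutes, the client device can take one unreported collection event (such as collection event k) from queue a1 every 2 minutes and report it to the collection server. Before reporting collection event k, the client device can first check whether a collection event is currently being reported in the reporting process. If not, it can report collection event k directly; if so, it can first wait until the collection event being reported (such as collection event h) has finished reporting, that is, until a response message sent by the collection server is received. Once collection event h has finished reporting, if the report succeeded, collection event k can be reported to the collection server. If the report failed, the number of times collection event h has already been reported can be checked: if that number is greater than or equal to the preset number, the current network is considered faulty, so all collection events in queue a1 from collection event h to the end of the queue can be cached; if it is less than the preset number, collection event h can be reported again, its success checked once the report completes, and the above steps repeated.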
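The per-period decision logic can be sketched in Python as below. This is a simplified, synchronous illustration: the class name `PeriodicReporter`, the string return values and the in-memory `cached` list are hypothetical, and the waiting for an in-flight report described in the text is collapsed into a blocking `send` call.

```python
from typing import Callable, Dict, List


class PeriodicReporter:
    """One call to tick() stands for one preset period (for example 2 minutes)."""

    def __init__(self, queue: List[str], send: Callable[[str], bool],
                 preset_count: int = 3) -> None:
        self.queue = list(queue)
        self.send = send
        self.preset_count = preset_count
        self.position = 0                    # next unreported event
        self.attempts: Dict[str, int] = {}   # event -> times reported so far
        self.cached: List[str] = []          # stands in for the cache location

    def tick(self) -> str:
        if self.position >= len(self.queue):
            return "idle"
        event = self.queue[self.position]
        self.attempts[event] = self.attempts.get(event, 0) + 1
        if self.send(event):
            self.position += 1
            return "reported"
        if self.attempts[event] >= self.preset_count:
            # Treat as a network fault: cache this event and everything after it.
            self.cached = self.queue[self.position:]
            self.queue, self.position = [], 0
            return "cached"
        return "retry"
```

With a preset count of 3 and a transport that always fails, three consecutive periods produce "retry", "retry" and "cached", after which the queue is empty and further periods are idle.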
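Further, if the unreported collection events in queue a1 have been cached, or all the collection events in queue a1 have been reported successfully, the client device can delete queue a1. Then, after the user triggers a second preset operation, if buried-point event 3, buried-point event 4 and buried-point event 5 are triggered while the business corresponding to the second preset operation is executed, the client device can generate in turn collection event 3 corresponding to buried-point event 3, collection event 4 corresponding to buried-point event 4, and collection event 5 corresponding to buried-point event 5. Further, since the client device no longer has a queue for reporting collected data, it can create queue a2 and store collection event 3, collection event 4 and collection event 5 in it. Before doing so, the client device can first check whether any cached collection events exist. If so, it can first insert the cached collection events into queue a2 from the tail, then insert collection event 3, collection event 4 and collection event 5 from the tail, and finally run the reporting process on queue a2. In this way, the client device first reports the collection events that could not be reported last time and then the newly generated ones, which increases the success rate of collecting events.
In one example, the client device can compress collection events before caching them; caching compressed collection events reduces the space they occupy. If there are multiple collection events to cache, the client device can compress them all at once to obtain a single compressed file, or compress each collection event separately to obtain multiple compressed files, or compress each run of a preset number of consecutive collection events to obtain multiple compressed files. The number of collection events compressed together can be set according to business needs, which is not limited in the embodiments of the present invention.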
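The three grouping options can be expressed with a single group-size parameter, as in the Python sketch below. The choice of JSON plus `zlib` and the function names `compress_events` and `decompress_events` are assumptions; the text does not name a compression format.

```python
import json
import zlib
from typing import List


def compress_events(events: List[dict], group_size: int) -> List[bytes]:
    """Compress events in consecutive groups of group_size.

    group_size == len(events) gives one compressed blob for everything,
    group_size == 1 gives one blob per event.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    blobs = []
    for start in range(0, len(events), group_size):
        group = events[start:start + group_size]
        blobs.append(zlib.compress(json.dumps(group).encode("utf-8")))
    return blobs


def decompress_events(blobs: List[bytes]) -> List[dict]:
    """Restore the original event list, preserving order."""
    events: List[dict] = []
    for blob in blobs:
        events.extend(json.loads(zlib.decompress(blob).decode("utf-8")))
    return events
```

Larger groups usually compress better because repeated field names are shared, while smaller groups let a single event be restored and reported without decompressing the rest.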
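In the above embodiments of the present invention, after detecting that a user triggers a preset operation, multiple corresponding collection events are generated while the task corresponding to the preset operation is executed; each collection event records the event of executing a subtask of the task. Further, the multiple collection events are reported through multiple reporting processes, and in each reporting process one or more of the collection events that have not yet been reported are reported to the collection server. Because the collection events are reported through multiple reporting processes, they need not all be reported to the collection server at once. Compared with the send-on-receipt data collection strategy of the prior art, this can reduce the pressure on the collection server and improve the efficiency of data collection.
Corresponding to the above method flow, an embodiment of the present invention further provides a data collection apparatus; for the specific details of the apparatus, refer to the implementation of the above method.
Fig. 6 is a schematic structural diagram of a data collection apparatus provided by an embodiment of the present invention, including:
a processing module 601, configured to generate, after detecting that a user triggers a preset operation, multiple corresponding collection events while executing the task corresponding to the preset operation; each collection event records the event of executing a subtask of the task;
a transceiver module 602, configured to report the multiple collection events through multiple reporting processes; in each reporting process, one or more of the collection events that have not yet been reported are reported to the collection server.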
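Optionally, the transceiver module 602 is specifically configured to:
store the multiple collection events in a preset queue in sequence;
in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, report one unreported collection event in the preset queue to the collection server.
Optionally, the transceiver module 602 is specifically configured to:
obtain the collection event stored at the position immediately after the position indicated by a preset cursor in the preset queue, and report it to the collection server; the preset cursor indicates the position in the preset queue of the collection event reported in each reporting process;
update the preset cursor with the position adjacent to the position indicated by the preset cursor.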
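Optionally, the apparatus further includes a cache module 603;
the transceiver module 602 is further configured to: after determining that the collection event reported in the previous reporting process was not successfully reported to the collection server, repeatedly report that collection event;
when the number of repeated reports by the transceiver module 602 exceeds a preset number, the cache module 603 is configured to: cache the collection events between the position indicated by the preset cursor in the preset queue and the end of the preset queue to a preset location;
the transceiver module 602 is further configured to: after the fault is resolved, report the collection events cached at the preset location to the collection server.
Optionally, the processing module 601 is further configured to:
delete the preset queue if the position indicated by the preset cursor is the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the end of the preset queue have been cached to the preset location.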
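Optionally, before the multiple collection events are stored in the preset queue in sequence, the cache module 603 is further configured to:
query whether collection events are cached at the preset location; if so, store the collection events cached at the preset location starting at the initial position of the preset queue, and then store the multiple collection events in sequence in the positions after them; if not, store the multiple collection events in sequence starting from the initial position of the preset queue.
Optionally, the transceiver module 602 is specifically configured to:
obtain the collection event stored at the starting position of the preset queue, and report it to the collection server;
if it is determined that the collection event reported in this reporting process was successfully reported to the collection server, delete the collection event stored at the starting position of the preset queue and move the collection event stored at each position of the preset queue forward by one position, so that the collection event stored at the position after the starting position moves to the starting position.
As can be seen from the above: in the above embodiments of the present invention, after detecting that a user triggers a preset operation, multiple corresponding collection events are generated while the task corresponding to the preset operation is executed; each collection event records the event of executing a subtask of the task. Further, the multiple collection events are reported through multiple reporting processes, and in each reporting process one or more of the collection events that have not yet been reported are reported to the collection server. Because the collection events are reported through multiple reporting processes, they need not all be reported to the collection server at once. Compared with the send-on-receipt data collection strategy of the prior art, this can reduce the pressure on the collection server and improve the efficiency of data collection.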
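Based on the same inventive concept, an embodiment of the present invention further provides a computing device including at least one processing unit and at least one storage unit, where the storage unit stores a computer program that, when executed by the processing unit, causes the processing unit to perform the data collection method described in any of the embodiments of Fig. 2 above.
Based on the same inventive concept, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program executable by a computing device; when the program runs on the computing device, it causes the computing device to perform the data collection method described in any of the embodiments of Fig. 2 above.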
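Those skilled in the art should understand that embodiments of the present invention may be provided as a method or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.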
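Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.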

Claims (16)

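  1. A data collection method, characterized in that the method comprises:
    after detecting that a user has triggered a preset operation, generating a plurality of corresponding collection events while executing the task corresponding to the preset operation; each collection event is used to record the event of executing a respective subtask of the task;
    reporting the plurality of collection events through multiple reporting processes; in each reporting process, one or more not-yet-reported collection events among the plurality of collection events are reported to a collection server.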
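  2. The method according to claim 1, characterized in that reporting the plurality of collection events through multiple reporting processes comprises:
    storing the plurality of collection events in sequence in a preset queue;
    in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, reporting one unreported collection event in the preset queue to the collection server.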
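  3. The method according to claim 2, characterized in that reporting one unreported collection event in the preset queue to the collection server comprises:
    obtaining the collection event stored at the position immediately following the position indicated by a preset cursor in the preset queue, and reporting it to the collection server; the preset cursor is used to indicate the position, in the preset queue, of the collection event reported in each reporting process;
    updating the preset cursor with the position adjacent to the position indicated by the preset cursor.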
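  4. The method according to claim 3, characterized in that the method further comprises:
    after determining that the collection event reported in the previous reporting process was not successfully reported to the collection server, repeatedly reporting the collection event reported in the previous reporting process, and, when the number of repeated reports exceeds a preset number, caching, to a preset location, the collection events between the position indicated by the preset cursor and the position at the end of the preset queue;
    after the fault is resolved, reporting the collection events cached at the preset location to the collection server.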
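  5. The method according to claim 4, characterized in that the method further comprises:
    deleting the preset queue if the position indicated by the preset cursor is the position at the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the position at the end of the preset queue have been cached to the preset location.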
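  6. The method according to claim 4 or 5, characterized in that, before storing the plurality of collection events in sequence in the preset queue, the method further comprises:
    querying whether any collection events are cached at the preset location; if so, storing the collection events cached at the preset location at the initial position of the preset queue, and then storing the plurality of collection events in sequence starting from the position after the initial position of the preset queue; if not, storing the plurality of collection events in sequence starting from the initial position of the preset queue.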
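  7. The method according to claim 2, characterized in that reporting one unreported collection event in the preset queue to the collection server comprises:
    obtaining the collection event stored at the starting position of the preset queue, and reporting it to the collection server;
    if it is determined that the collection event reported in the current reporting process was successfully reported to the collection server, deleting the collection event stored at the starting position of the preset queue, and moving the collection events stored at each position of the preset queue forward by one position in sequence; wherein the collection event stored at the position following the starting position of the preset queue is moved to the starting position.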
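  8. A data collection apparatus, characterized in that the apparatus comprises:
    a processing module, configured to, after detecting that a user has triggered a preset operation, generate a plurality of corresponding collection events while executing the task corresponding to the preset operation; each collection event is used to record the event of executing a respective subtask of the task;
    a transceiver module, configured to report the plurality of collection events through multiple reporting processes; in each reporting process, one or more not-yet-reported collection events among the plurality of collection events are reported to a collection server.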
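  9. The apparatus according to claim 8, characterized in that the transceiver module is specifically configured to:
    store the plurality of collection events in sequence in a preset queue;
    in each reporting process, after determining that the collection event reported in the previous reporting process was successfully reported to the collection server, report one unreported collection event in the preset queue to the collection server.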
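  10. The apparatus according to claim 9, characterized in that the transceiver module is specifically configured to:
    obtain the collection event stored at the position immediately following the position indicated by a preset cursor in the preset queue, and report it to the collection server; the preset cursor is used to indicate the position, in the preset queue, of the collection event reported in each reporting process;
    update the preset cursor with the position adjacent to the position indicated by the preset cursor.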
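  11. The apparatus according to claim 10, characterized in that the apparatus further comprises a cache module;
    the transceiver module is further configured to: after determining that the collection event reported in the previous reporting process was not successfully reported to the collection server, repeatedly report the collection event reported in the previous reporting process;
    when the number of repeated reports by the transceiver module exceeds a preset number, the cache module is configured to: cache, to a preset location, the collection events between the position indicated by the preset cursor and the position at the end of the preset queue;
    the transceiver module is further configured to: after the fault is resolved, report the collection events cached at the preset location to the collection server.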
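  12. The apparatus according to claim 11, characterized in that the processing module is further configured to:
    delete the preset queue if the position indicated by the preset cursor is the position at the end of the preset queue, or if the collection events between the position indicated by the preset cursor and the position at the end of the preset queue have been cached to the preset location.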
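  13. The apparatus according to claim 11 or 12, characterized in that, before storing the plurality of collection events in sequence in the preset queue, the cache module is further configured to:
    query whether any collection events are cached at the preset location; if so, store the collection events cached at the preset location at the initial position of the preset queue, and then store the plurality of collection events in sequence starting from the position after the initial position of the preset queue; if not, store the plurality of collection events in sequence starting from the initial position of the preset queue.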
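  14. The apparatus according to claim 9, characterized in that the transceiver module is specifically configured to:
    obtain the collection event stored at the starting position of the preset queue, and report it to the collection server;
    if it is determined that the collection event reported in the current reporting process was successfully reported to the collection server, delete the collection event stored at the starting position of the preset queue, and move the collection events stored at each position of the preset queue forward by one position in sequence; wherein the collection event stored at the position following the starting position of the preset queue is moved to the starting position.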
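  15. A computing device, characterized by comprising at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program which, when executed by the processing unit, causes the processing unit to perform the method according to any one of claims 1 to 7.
  16. A computer-readable storage medium, characterized in that it stores a computer program executable by a computing device, wherein the program, when run on the computing device, causes the computing device to perform the method according to any one of claims 1 to 7.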
PCT/CN2020/119039 2019-10-28 2020-09-29 Data collection method and apparatus WO2021082858A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911032119.6A CN110764936A (zh) 2019-10-28 2019-10-28 Data collection method and apparatus
CN201911032119.6 2019-10-28

Publications (1)

Publication Number Publication Date
WO2021082858A1 (zh) 2021-05-06

Family

ID=69334231

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/119039 WO2021082858A1 (zh) 2019-10-28 2020-09-29 Data collection method and apparatus

Country Status (2)

Country Link
CN (1) CN110764936A (zh)
WO (1) WO2021082858A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
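CN113190458A (zh) * 2021-05-24 2021-07-30 北京映客芝士网络科技有限公司 Method, apparatus, computer device and storage medium for automatic tracking-point data analysis
CN113900901A (zh) * 2021-10-21 2022-01-07 北京达佳互联信息技术有限公司 Data reporting method, data monitoring method, apparatus, device and storage medium
WO2023169251A1 (zh) * 2022-03-09 2023-09-14 北京字节跳动网络技术有限公司 Indicator determination method, apparatus, server and medium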

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
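CN110764936A (zh) * 2019-10-28 2020-02-07 深圳前海微众银行股份有限公司 Data collection method and apparatus
CN113750517B (zh) * 2020-11-30 2024-04-30 上海达龙信息科技有限公司 Keyboard operation data transmission method and apparatus, and keyboard operation execution method and apparatus
CN112764837B (zh) * 2021-01-29 2022-03-08 腾讯科技(深圳)有限公司 Data reporting method, apparatus, storage medium and terminal
CN114567674B (zh) * 2022-02-25 2024-03-15 腾讯科技(深圳)有限公司 Data processing method, apparatus, computer device and readable storage medium
CN115393974A (zh) * 2022-08-01 2022-11-25 北京主线科技有限公司 Method, apparatus, device and storage medium for recording fault events of an autonomous vehicle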

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
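CN104348650A (zh) * 2013-08-05 2015-02-11 腾讯科技(深圳)有限公司 Website monitoring method, service apparatus and system
CN104865953A (zh) * 2015-03-20 2015-08-26 北京远特科技有限公司 Vehicle data processing method and apparatus
CN107239389A (zh) * 2017-06-07 2017-10-10 网易(杭州)网络有限公司 Method and apparatus for determining user operation records in a hybrid app
CN107885590A (zh) * 2017-11-30 2018-04-06 百度在线网络技术(北京)有限公司 Task processing method and apparatus for smart devices
CN108199902A (zh) * 2018-01-24 2018-06-22 精硕科技(北京)股份有限公司 Data transmission processing method and apparatus
CN110764936A (zh) * 2019-10-28 2020-02-07 深圳前海微众银行股份有限公司 Data collection method and apparatus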

Also Published As

Publication number Publication date
CN110764936A (zh) 2020-02-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20881997

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20881997

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.09.2022)