WO2018028326A1

WO2018028326A1 - Model updating method and apparatus

Info

Publication number: WO2018028326A1
Application number: PCT/CN2017/090609
Authority: WO
Inventors: 谭银燕; 周鹏飞; 汪芳山
Original assignee: 华为技术有限公司
Priority date: 2016-08-08
Filing date: 2017-06-28
Publication date: 2018-02-15
Also published as: CN107704929A; CN107704929B

Abstract

A model updating method and apparatus, relating to the technical field of computers and used to at least solve the problem of a waste of resources caused by the poor significance or even insignificance of a later model update in two model updates triggered by two adjacent updating triggering points, due to the fact that changes between data features of newly-added data of two adjacent updating triggering points and data features of previous data are non-obvious. The method comprises: acquiring first online service data received within a window where a triggering point to be detected is located; according to the data features of the first online service data, constructing a first feature sequence; determining an association relationship between the first feature sequence and at least one representative slice, wherein the representative slice is a slice of a feature sequence constructed according to the data feature of historical service data; if the association relationship between the first feature sequence and the at least one representative slice satisfies a pre-set condition, updating the current model.

Description

Model updating method and device

The present application claims priority to Chinese Patent Application No. 201610645496.7, entitled "A Model Updating Method and Apparatus" and filed on August 8, 2016, the entire contents of which are incorporated herein by reference.

Technical field

The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for updating a model.

Background art

A machine learning algorithm is an algorithm that obtains a data model (hereinafter referred to as a model) from known data and uses the model to predict unknown data; for example, the model and the data to be received are used for content recommendation services. Traditional machine learning algorithms require all known data to be prepared before learning, and once the model is obtained, it is not changed.

With the development of online services (such as online recommendation services and online marketing services), the data scale keeps increasing and the data changes faster and faster. A model obtained by a traditional machine learning algorithm cannot adapt well to the change patterns of newly added data, which lowers the accuracy with which the model predicts unknown data. Incremental modeling technology emerged to address this. It supports incrementally updating an already obtained model with new data, so that the updated model better adapts to the change patterns of the newly added data, thereby improving the accuracy of predicting unknown data.

At present, the model update method provided by incremental modeling technology is as follows: obtain new data, a historical model, and an update trigger point; at the time of the update trigger point, update the historical model with the new data, thereby training a new model. In incremental modeling, when to trigger a model update is a key issue, which affects both the update frequency of the model and the accuracy with which the model predicts unknown data. At present, a fixed duration or a fixed amount of data is generally used to determine the update trigger point: if the time elapsed from the last update trigger point to the current time reaches the fixed duration, a model update is triggered; or, if the amount of data added since the last update trigger point reaches the fixed amount of data, a model update is triggered.

When the model is updated using the above method for determining the update trigger point, if the data features of the data newly added between two adjacent update trigger points do not change significantly from the data features of the previous data, the later of the two model updates triggered by the two adjacent update trigger points has little or even no significance, resulting in a waste of resources.

Summary of the invention

Embodiments of the present invention provide a model updating method and apparatus, which are used to at least solve the following problem: because the data features of the data newly added between two adjacent update trigger points do not change significantly from the data features of the previous data, the later of the two model updates triggered by the two adjacent update trigger points has little or even no significance, resulting in a waste of resources.

In order to achieve the above object, embodiments of the present invention adopt the following technical solutions:

In one aspect, a model updating method is provided, including: acquiring first online service data received in the window where a trigger point to be tested is located, where the trigger point to be tested may be any one of the trigger points to be tested; constructing a first feature sequence according to the data features of the first online service data; determining an association relationship between the first feature sequence and at least one representative slice, where a representative slice is a slice of a feature sequence constructed according to the data features of historical service data; and updating the current model if the association relationship between the first feature sequence and the at least one representative slice satisfies a preset condition. It can be seen that the technical solution provided by the embodiment of the present invention combines the data features of the online service data, the data features of the historical service data, the association relationship between the feature sequences constructed from the two, and the preset condition to determine whether the trigger point to be tested is an update trigger point. Compared with the prior-art technical solution in which a fixed duration or a fixed data amount is used as the update trigger point, this reduces the occurrence of the problem that, because the data features of the data newly added between two adjacent update trigger points do not change significantly from the data features of the previous data, the later of the two triggered model updates has little or even no significance; resources are thereby saved.

The association relationship refers to the relationship between a feature sequence and a representative slice. In a specific implementation, if vectors are used to represent the first feature sequence and the representative slice, the association relationship may be represented by the distance or the similarity between the feature sequence and the representative slice. If the at least one representative slice includes a plurality of representative slices, the association relationship between the first feature sequence and the at least one representative slice satisfying the preset condition may include: the association relationship between the first feature sequence and any one or more of the plurality of representative slices satisfies the preset condition.

In a possible design, after determining the association relationship between the first feature sequence and the at least one representative slice, the method may further include: if the association relationship between the first feature sequence and the at least one representative slice does not satisfy the preset condition, acquiring second online service data received in the window where a subsequent trigger point to be tested, following the trigger point to be tested, is located; then constructing a second feature sequence, in order of receiving time, according to the data features of the first online service data and the data features of the second online service data; determining an association relationship between the second feature sequence and the at least one representative slice; and updating the current model if the association relationship between the second feature sequence and the at least one representative slice satisfies the preset condition. In an actual implementation, if the association relationship between the first feature sequence and the at least one representative slice does not satisfy the preset condition, the online service data received in the window where the next trigger point to be tested is located is acquired; a new feature sequence is then constructed, in order of receiving time, according to the data features of the first online service data and of the online service data received in the window of the next trigger point to be tested, and the association relationship between the new feature sequence and the at least one representative slice is determined. If that association relationship satisfies the preset condition, the current model is updated. If it does not, the online service data received in the window of the trigger point to be tested after that is acquired, and so on, until the association relationship between the newly constructed feature sequence and the at least one representative slice satisfies the preset condition, at which point the current model is updated.
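The iterative procedure of this design can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the function name `find_update_trigger` and the callables `build_feature_sequence` and `meets_condition` are hypothetical, and each window is assumed to be already reduced to a list of numeric data features.

```python
from typing import Callable, Iterable, List, Optional, Sequence


def find_update_trigger(
    windows: Iterable[Sequence[float]],
    build_feature_sequence: Callable[[List[float]], List[float]],
    meets_condition: Callable[[List[float]], bool],
) -> Optional[int]:
    """Return the 0-based index of the first trigger point to be tested whose
    accumulated feature sequence satisfies the preset condition, else None."""
    accumulated: List[float] = []
    for index, window_features in enumerate(windows):
        # Append this window's data features in order of receiving time.
        accumulated.extend(window_features)
        sequence = build_feature_sequence(accumulated)
        if meets_condition(sequence):
            # This trigger point is treated as an update trigger point.
            return index
    # No tested trigger point satisfied the condition; the model is not updated.
    return None
```

In this sketch the features of earlier windows are kept when a trigger point fails the test, so each later test sees a longer feature sequence, as the paragraph above describes.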

In a possible design, vectors are used to represent the first feature sequence and the representative slice; determining the association relationship between the first feature sequence and the at least one representative slice may include: determining the distance between the first feature sequence and the at least one representative slice. In this case, updating the current model if the association relationship between the first feature sequence and the at least one representative slice satisfies the preset condition may include: updating the current model if the distance is less than or equal to a first preset threshold.
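A minimal sketch of the distance-based condition follows. The text does not name a distance metric, so Euclidean distance is an assumption here, as is the requirement that the feature sequence and every representative slice are vectors of equal length; `should_update_by_distance` is a hypothetical name.

```python
import math
from typing import Iterable, Sequence


def should_update_by_distance(
    feature_sequence: Sequence[float],
    representative_slices: Iterable[Sequence[float]],
    first_threshold: float,
) -> bool:
    """True if the distance to any representative slice is within the threshold.

    Euclidean distance is an assumed metric; math.dist raises ValueError
    if the two vectors have different lengths.
    """
    return any(
        math.dist(feature_sequence, representative) <= first_threshold
        for representative in representative_slices
    )
```

Using `any` reflects the statement that the condition may be met by any one or more of the representative slices.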

In a possible design, vectors are used to represent the first feature sequence and the representative slice; determining the association relationship between the first feature sequence and the at least one representative slice may include: determining the similarity between the first feature sequence and the at least one representative slice. In this case, updating the current model if the association relationship between the first feature sequence and the at least one representative slice satisfies the preset condition may include: updating the current model if the similarity is greater than or equal to a second preset threshold.
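The similarity-based condition can be sketched in the same way. Cosine similarity is an assumption, since the text does not specify a similarity measure; the names `cosine_similarity` and `should_update_by_similarity` are hypothetical, and vectors are assumed to have equal length.

```python
import math
from typing import Iterable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_update_by_similarity(
    feature_sequence: Sequence[float],
    representative_slices: Iterable[Sequence[float]],
    second_threshold: float,
) -> bool:
    """True if the similarity to any representative slice reaches the threshold."""
    return any(
        cosine_similarity(feature_sequence, representative) >= second_threshold
        for representative in representative_slices
    )
```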

In a possible design, constructing the first feature sequence according to the data features of the first online service data may include the following. First, a first data sequence is constructed according to the data features of the first online service data, where an element of the first data sequence is a data point, and a data point includes at least the following features: the time at which the data point is located and the data feature of the service data corresponding to the data point. For example, the data point corresponding to the first online service data includes at least the time at which the data point is located (that is, the end time of the receiving window of the first online service data) and the data feature of the first online service data. A data point can be expressed as (t, v), where t represents the time at which the data point is located and v represents the data feature of the service data corresponding to the data point. Second, the first feature sequence is generated from the first data sequence, where an element of the first feature sequence includes at least the following features: the time at which the data point is located and the rate of change between the data point and the previous data point (that is, the data point immediately preceding it); optionally, it may also include the time period between the time at which the data point is located and the time at which the previous data point is located. For example, an element of the first feature sequence can be represented as (t, Δ, d), where t represents the time at which the data point is located, Δ represents the rate of change between the data point and the previous data point, and d represents the time period between the time at which the data point is located and the time at which the previous data point is located.
This possible design provides one specific implementation of constructing a feature sequence according to the data features of online service data, but the specific implementation is not limited thereto. For example, the number and meaning of the features included in each element of the feature sequence can be changed according to actual needs; even so, the overall concept remains that of this possible design.
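The conversion from a data sequence of (t, v) points to a feature sequence of (t, Δ, d) elements can be sketched as below. The text does not give a formula for the rate of change, so computing Δ as the change in v divided by the elapsed time d is an assumption; the function name is hypothetical, and the first data point yields no element because it has no predecessor.

```python
from typing import List, Sequence, Tuple

DataPoint = Tuple[float, float]              # (t, v)
FeatureElement = Tuple[float, float, float]  # (t, delta, d)


def build_feature_sequence(data_sequence: Sequence[DataPoint]) -> List[FeatureElement]:
    """Turn consecutive (t, v) data points into (t, delta, d) feature elements."""
    features: List[FeatureElement] = []
    for (t_prev, v_prev), (t, v) in zip(data_sequence, data_sequence[1:]):
        d = t - t_prev  # time period between the two data points
        if d <= 0:
            raise ValueError("data points must be in strictly increasing time order")
        delta = (v - v_prev) / d  # assumed definition of the rate of change
        features.append((t, delta, d))
    return features
```

For instance, the data sequence (0, 1), (2, 5), (3, 4) gives the elements (2, 2.0, 2) and (3, -1.0, 1) under this definition.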

In a possible design, after constructing the first data sequence according to the data features of the first online service data, the method may further include: extracting feature points in the first data sequence, and constructing a second data sequence based on the feature points in the first data sequence. A feature point is a special, or representative, data point; in a specific implementation, it can be determined according to actual needs. In a physical sense, feature points are local extremum points, inflection points, and the like on a curve, where local extremum points may include peak points, valley points, and so on. In this case, generating the first feature sequence from the first data sequence may include: generating the first feature sequence from the second data sequence, where an element of the first feature sequence includes the time at which the feature point is located, the rate of change between the feature point and the previous feature point, and the time period between the time at which the feature point is located and the time at which the previous feature point is located.

In practice, the first data sequence contains a large number of data points, so if the first feature sequence were generated directly from the first data sequence, it could contain many elements, and determining the association relationship between the first feature sequence and the at least one representative slice would require a large amount of computation. In this possible design, the second data sequence is obtained by extracting the feature points in the first data sequence, and the first feature sequence is generated from the second data sequence. The first feature sequence generated in this way has fewer elements than one generated directly from the first data sequence, which reduces the amount of computation in determining the association relationship between the first feature sequence and the at least one representative slice and thereby speeds up processing. In addition, since the feature points are special data points in the first data sequence, the association relationship between the first feature sequence generated from the second data sequence and the at least one representative slice does not differ much from the association relationship between a first feature sequence generated from the first data sequence and the at least one representative slice.
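The feature-point extraction described above can be sketched as follows for the simplest case, in which only strict local peaks and valleys are kept. Inflection points and endpoint handling are left out, which is a simplifying assumption; the name `extract_feature_points` is hypothetical.

```python
from typing import List, Sequence, Tuple

DataPoint = Tuple[float, float]  # (t, v)


def extract_feature_points(data_sequence: Sequence[DataPoint]) -> List[DataPoint]:
    """Keep interior data points whose value is a strict local peak or valley."""
    feature_points: List[DataPoint] = []
    triples = zip(data_sequence, data_sequence[1:], data_sequence[2:])
    for previous, current, following in triples:
        is_peak = current[1] > previous[1] and current[1] > following[1]
        is_valley = current[1] < previous[1] and current[1] < following[1]
        if is_peak or is_valley:
            feature_points.append(current)
    return feature_points
```

The returned list plays the role of the second data sequence; it is never longer than the first data sequence, which is the source of the reduced computation.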

In a possible design, the trigger point to be tested is the i-th trigger point to be tested, where i≥1 and i is an integer. If i=1, the window of the trigger point to be tested is the window between the time at which online service data starts to be received and the trigger point to be tested; if i≥2, the window of the trigger point to be tested is the window between the (i-1)-th trigger point to be tested and the trigger point to be tested. With this possible design, if the server continuously receives online service data, the server is guaranteed to acquire one set of data features at any trigger point to be tested in the subsequent steps, which ensures that each trigger point to be tested can be evaluated as to whether it is an update trigger point.

In a possible design, the trigger point to be tested is the i-th trigger point to be tested, where i≥1 and i is an integer. If i=1, the window of the trigger point to be tested is 1/N of the window between the time at which online service data starts to be received and the trigger point to be tested; if i≥2, the window of the trigger point to be tested is 1/N of the window between the (i-1)-th trigger point to be tested and the trigger point to be tested. Here N>2, N is an integer, and 1/N denotes one N-th. With this possible design, if the server continuously receives online service data, the server is guaranteed to acquire multiple sets of data features at any trigger point to be tested in the subsequent steps, which ensures that each trigger point to be tested can be evaluated as to whether it is an update trigger point. Moreover, compared with the previous possible design, the granularity (that is, the window) at which data features are acquired is smaller, so more data features are obtained, which, from a statistical point of view, can improve the accuracy of the calculation.
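The two window designs above can be illustrated with one helper. This sketch reads "1/N of the window" as splitting each interval between consecutive trigger points into N equal sub-windows, which is an interpretation rather than something the text states explicitly; `windows_for_trigger_points` and its parameters are hypothetical names, and `n=1` reproduces the undivided windows of the first design.

```python
from typing import List, Sequence, Tuple

Window = Tuple[float, float]  # (start, end)


def windows_for_trigger_points(
    start_time: float, trigger_points: Sequence[float], n: int = 1
) -> List[List[Window]]:
    """For each trigger point to be tested, list the n sub-windows preceding it."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result: List[List[Window]] = []
    previous = start_time  # for i = 1 the window starts when reception begins
    for point in trigger_points:
        step = (point - previous) / n
        result.append(
            [(previous + k * step, previous + (k + 1) * step) for k in range(n)]
        )
        previous = point  # for i >= 2 the window starts at trigger point i-1
    return result
```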

In a possible design, the method may further include: determining, as a trigger point to be tested, each time that is an integer multiple of a preset duration after the time at which online service data starts to be received.

In a possible design, the method may further include: determining, as a trigger point to be tested, each time at which the amount of online service data received since reception started reaches an integer multiple of a preset data amount.

It should be noted that, in actual implementation, the rule by which the trigger point to be tested is determined does not affect the basic concept of the technical solution provided by the embodiment of the present invention. Therefore, the specific way of determining the trigger point to be tested is not limited to the foregoing two possible designs.
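The two rules for choosing trigger points to be tested can be sketched as below. The function names are hypothetical, and measuring the "data amount" as a count of received records (rather than, say, bytes) is an assumption made for the illustration.

```python
from typing import List, Sequence


def trigger_points_by_duration(
    start_time: float, preset_duration: float, end_time: float
) -> List[float]:
    """Times at integer multiples of preset_duration after start_time, up to end_time."""
    if preset_duration <= 0:
        raise ValueError("preset_duration must be positive")
    points: List[float] = []
    k = 1
    while start_time + k * preset_duration <= end_time:
        points.append(start_time + k * preset_duration)
        k += 1
    return points


def trigger_points_by_amount(
    arrival_times: Sequence[float], preset_amount: int
) -> List[float]:
    """Arrival times at which the received record count hits a multiple of preset_amount."""
    if preset_amount < 1:
        raise ValueError("preset_amount must be a positive integer")
    return [
        time
        for count, time in enumerate(arrival_times, start=1)
        if count % preset_amount == 0
    ]
```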

In a possible design, before determining the association relationship between the first feature sequence and the at least one representative slice, the method may further include: acquiring historical service data and constructing a historical feature sequence according to the historical service data; then determining the model change points in the historical feature sequence, where a model change point is an update trigger point for which the magnitude of the change between the models before and after the triggered model update is greater than or equal to a preset threshold; and then cutting the historical feature sequence based on the model change points in the historical feature sequence to obtain representative slices. For the specific implementation of determining the model change points and cutting the historical feature sequence, reference may be made to FIG. 11. With the method provided in this possible design, the representative slices can be obtained in an offline state or in an online state; moreover, a representative slice may remain unchanged once generated, may be updated when it needs to be updated, or may be updated as the historical feature sequence is updated. In a specific implementation, representative slices can also be determined empirically and then stored in advance.
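A minimal sketch of the cutting step follows. It assumes that the model change points are already known and given as times, that feature elements have the form (t, Δ, d), and that a slice consists of the elements leading up to and including a change point; the slicing convention and the name `cut_at_change_points` are assumptions, not details taken from the text.

```python
from typing import Iterable, List, Sequence, Tuple

FeatureElement = Tuple[float, float, float]  # (t, delta, d)


def cut_at_change_points(
    history: Sequence[FeatureElement], change_times: Iterable[float]
) -> List[List[FeatureElement]]:
    """Cut a historical feature sequence into slices that each end at a change point."""
    change_set = set(change_times)
    slices: List[List[FeatureElement]] = []
    current: List[FeatureElement] = []
    for element in history:
        current.append(element)
        if element[0] in change_set:
            # The change point closes the current slice.
            slices.append(current)
            current = []
    # Elements after the last change point do not form a slice in this sketch.
    return slices
```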

In a possible design, cutting the historical feature sequence based on the model change points to obtain representative slices may include: cutting the historical feature sequence based on the model change points, and clustering the slices obtained after the cutting to obtain the representative slices. Compared with the previous possible design, this possible design can reduce the number of representative slices, thereby saving the storage space occupied by the representative slice library; further, it can also reduce the amount of computation in determining the association relationship between an online feature sequence (for example, the first feature sequence or the second feature sequence) and these representative slices, thereby increasing the speed of model updating.
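The text does not say which clustering algorithm is used. As an illustration only, the sketch below uses a simple greedy scheme: a slice becomes a new representative unless it lies within an assumed `radius` of an existing one. Slices are assumed to be equal-length numeric vectors, and `cluster_slices` is a hypothetical name.

```python
import math
from typing import List, Sequence


def cluster_slices(slices: Sequence[Sequence[float]], radius: float) -> List[List[float]]:
    """Greedy clustering: keep one representative per group of nearby slices."""
    representatives: List[List[float]] = []
    for candidate in slices:
        # math.dist is the Euclidean distance (Python 3.8+).
        is_covered = any(
            math.dist(candidate, representative) <= radius
            for representative in representatives
        )
        if not is_covered:
            representatives.append(list(candidate))
    return representatives
```

Four slices that form two tight groups reduce to two representatives here, which shows how clustering shrinks the representative slice library.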

In another aspect, a model updating apparatus is provided, which can implement the functions performed in the above method examples. For example, the apparatus may include an acquiring module, a building module, a determining module, and an updating module. The acquiring module is configured to acquire the first online service data received in the window where the trigger point to be tested is located. The building module is configured to construct a first feature sequence according to the data features of the first online service data. The determining module is configured to determine an association relationship between the first feature sequence and at least one representative slice, where a representative slice is a slice of a feature sequence constructed according to the data features of historical service data. The updating module is configured to update the current model if the association relationship between the first feature sequence and the at least one representative slice satisfies a preset condition.

In a possible design, the acquiring module may be further configured to: if the association relationship between the first feature sequence and the at least one representative slice does not satisfy the preset condition, acquire the second online service data received in the window where a subsequent trigger point to be tested, following the trigger point to be tested, is located. The building module may be further configured to construct a second feature sequence, in order of receiving time, according to the data features of the first online service data and the data features of the second online service data. The determining module may be further configured to determine an association relationship between the second feature sequence and the at least one representative slice. The updating module may be further configured to update the current model if the association relationship between the second feature sequence and the at least one representative slice satisfies the preset condition.

In a possible design, vectors are used to represent the first feature sequence and the representative slice; the determining module may be specifically configured to determine the distance between the first feature sequence and the at least one representative slice; the updating module may be specifically configured to update the current model if the distance is less than or equal to the first preset threshold.

In a possible design, vectors are used to represent the first feature sequence and the representative slice; the determining module may be specifically configured to determine the similarity between the first feature sequence and the at least one representative slice; the updating module may be specifically configured to update the current model if the similarity is greater than or equal to the second preset threshold.

In a possible design, the building module may be specifically configured to: construct a first data sequence according to the data features of the first online service data, where an element of the first data sequence is a data point, and a data point includes at least the following features: the time at which the data point is located and the data feature of the service data corresponding to the data point; and generate the first feature sequence from the first data sequence, where an element of the first feature sequence includes at least the following features: the time at which the data point is located, the rate of change between the data point and the previous data point, and the time period between the time at which the data point is located and the time at which the previous data point is located.

In a possible design, the building module may be further configured to extract feature points in the first data sequence and construct a second data sequence according to the feature points in the first data sequence. In this case, when generating the first feature sequence from the first data sequence, the building module may be specifically configured to generate the first feature sequence from the second data sequence, where an element of the first feature sequence includes the time at which the feature point is located, the rate of change between the feature point and the previous feature point, and the time period between the time at which the feature point is located and the time at which the previous feature point is located.

In a possible design, the trigger point to be tested is the i-th trigger point to be tested, where i≥1 and i is an integer. If i=1, the window of the trigger point to be tested is the window between the time at which online service data starts to be received and the trigger point to be tested; if i≥2, the window of the trigger point to be tested is the window between the (i-1)-th trigger point to be tested and the trigger point to be tested.

In a possible design, the trigger point to be tested is the i-th trigger point to be tested, where i≥1 and i is an integer. If i=1, the window of the trigger point to be tested is 1/N of the window between the time at which online service data starts to be received and the trigger point to be tested; if i≥2, the window of the trigger point to be tested is 1/N of the window between the (i-1)-th trigger point to be tested and the trigger point to be tested. Here N ≥ 2 and N is an integer.

In a possible design, the determining module may be further configured to determine, as a trigger point to be tested, each time that is an integer multiple of a preset duration after the time at which online service data starts to be received.

In a possible design, the determining module may be further configured to determine, as a trigger point to be tested, each time at which the amount of online service data received since reception started reaches an integer multiple of a preset data amount.

In a possible design, the acquiring module may be further configured to acquire historical service data; the building module may be further configured to construct a historical feature sequence according to the historical service data; the determining module may be further configured to determine the model change points in the historical feature sequence; and the apparatus may further include a generating module, configured to cut the historical feature sequence based on the model change points in the historical feature sequence to obtain representative slices.

In a possible design, the generating module may be specifically configured to: cut the historical feature sequence based on the model change point, and cluster the slice obtained after the cutting to obtain a representative slice.

In still another aspect, a model updating apparatus is provided, which can implement the functions performed in the above method examples; the functions can be implemented by hardware or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.

In a possible design, the structure of the apparatus includes a processor, a memory, a system bus, and a communication interface. The processor is configured to support the apparatus in performing the corresponding functions of the above methods. The communication interface is used to support communication between the apparatus and other network elements. The memory is coupled with the processor and retains the program instructions and data necessary for the apparatus. The communication interface may specifically be a transceiver.

In still another aspect, an embodiment of the present invention provides a computer storage medium for storing computer software instructions corresponding to the foregoing method, which includes a program designed to execute the above aspects.

It can be understood that any of the model updating apparatuses or computer storage media provided above is used to perform the model updating method provided above. Therefore, for the beneficial effects that can be achieved, reference may be made to the beneficial effects of the corresponding model updating method provided above, and details are not described here again.

Brief description of the drawings

FIG. 1 is a schematic structural diagram of a system to which the technical solution provided by the embodiment of the present invention is applicable;

FIG. 2 is a schematic structural diagram of a model updating apparatus according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart diagram of a method for updating a model according to an embodiment of the present disclosure;

FIG. 3a is a schematic flowchart diagram of another method for updating a model according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of a relationship between a window and a trigger point to be tested;

FIG. 5 is a schematic diagram of another relationship between a window and a trigger point to be tested;

FIG. 6 is a schematic flowchart diagram of another method for updating a model according to an embodiment of the present disclosure;

FIG. 6a is a schematic flowchart diagram of another method for updating a model according to an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of determining feature points according to an embodiment of the present invention;

FIG. 8 is a schematic flowchart diagram of a method for acquiring a representative slice according to an embodiment of the present invention;

FIG. 8 is a schematic flowchart of a method for acquiring a representative slice according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a curve drawn according to a first data sequence according to an embodiment of the present invention;

FIG. 10 is a schematic diagram of feature points determined by the curve shown in FIG. 9 according to an embodiment of the present invention;

FIG. 11 is a schematic diagram of a model change before and after an update trigger point according to an embodiment of the present invention;

FIG. 12 is a schematic diagram of a model change point determined based on the curve shown in FIG. 9 according to an embodiment of the present invention;

FIG. 13 is a schematic diagram of a curve drawn according to data points according to an embodiment of the present invention;

FIG. 14 is a schematic structural diagram of a model updating apparatus according to an embodiment of the present invention;

FIG. 15 is a schematic structural diagram of another model updating apparatus according to an embodiment of the present invention.

detailed description

The basic principle of the technical solution provided by the embodiment of the present invention is that the relationship between the feature sequence constructed according to the data feature of the online service data and the representative slice of the feature sequence constructed according to the data feature of the historical service data satisfies a preset condition. When the model is updated. The technical solution provided by the embodiment of the present invention combines the data characteristics of the online service data, the data characteristics of the historical service, the association relationship between the feature sequences constructed by the two, and the preset conditions to determine the trigger to be tested. Whether the point is an update trigger point; compared with the technical solution provided by the prior art that the fixed duration or the fixed data amount is used as the update trigger point, the data characteristics of the newly added data between the adjacent two update trigger points can be reduced. The data characteristics of the previous data change are not obvious, and the subsequent model update in the two model updates triggered by the adjacent two update trigger points has little meaning and even no doubt, thereby saving resources.

FIG. 1 is a schematic structural diagram of a system to which the technical solution provided by the embodiments of the present invention is applicable. The system may include a server and one or more service clients connected to the server. FIG. 1 takes as an example a system that includes two service clients, service client 1 and service client 2. A service client is a device used by a user of the online service, for example, a set-top box of an internet protocol television (IPTV), a smart phone, a computer, and the like.

The service client can obtain and record service data, and send the service data to the server according to a preset rule. For example, taking a video player as the service client, the video player can obtain and record service data while playing a video, and send the service data to the server either one item at a time or in a batch when the video ends. The server is configured to receive the service data sent by the service client and to maintain (or update) the model according to the service data, where the updated model is used by the server to perform prediction on service data that is subsequently received.

FIG. 2 is a schematic structural diagram of a model updating apparatus 20 according to an embodiment of the present invention. The model updating apparatus 20 may be a server, and may include a processor 201, a memory 202, a system bus 203, and a communication interface 204. The memory 202 is used to store computer-executable instructions, and the processor 201 is connected to the memory 202 via the system bus 203. When the model updating apparatus 20 is in operation, the processor 201 executes the computer-executable instructions stored in the memory 202 to cause the model updating apparatus 20 to perform any one of the model updating methods provided by the embodiments of the present invention. For the specific model updating methods, refer to the related descriptions below and the drawings; details are not repeated here.

The embodiment of the invention further provides a storage medium, which may include a memory 202.

The processor 201 may be a single processor or a collective term for multiple processing elements. For example, the processor 201 may be a central processing unit (CPU). The processor 201 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor or any conventional processor. The processor 201 may also be a dedicated processor, which may include at least one of a baseband processing chip, a radio frequency processing chip, and the like. Further, the dedicated processor may also include a chip having other dedicated processing functions of the model updating apparatus 20.

The memory 202 may include a volatile memory, such as a random-access memory (RAM); the memory 202 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); the memory 202 may also include a combination of the above types of memory.

The system bus 203 may include a data bus, a power bus, a control bus, and a signal status bus. For clarity of description in this embodiment, the various buses are all illustrated as the system bus 203 in FIG. 2.

Communication interface 204 may specifically be a transceiver on model update device 20. The transceiver can be a wireless transceiver. For example, the wireless transceiver may be an antenna of the model updating device 20 or the like. The processor 201 transmits and receives data to and from other devices, such as a service client, via the communication interface 204.

In a specific implementation, each step in the flow of any one of the model updating methods provided below may be implemented by the processor 201, in hardware form, executing computer-executable instructions, in software form, stored in the memory 202. To avoid repetition, details are not described again here.

Some of the terms in the embodiments of the present invention are explained below to facilitate the reader's understanding:

1) Service data, online service data, and historical service data

Service data refers to the data generated by a service client in the process of using a service. The service data may include data of the service itself, and may also include the user's feedback data on the service. The service data is represented as a time series. Taking an IPTV online video playback client as an example of the service client, the service data of the IPTV online video service may include, but is not limited to, any of the following information: session ID, user account, start play time of the video, end play time of the video, play type, video type, video ID, the user's operation record for the video, etc., where ID is an abbreviation of identity. The user's operation record for the video may include, but is not limited to: whether the user adds the video to favorites, browses or views the video, recommends the video, etc.

The online service data and the historical service data in the embodiments of the present invention are both defined from the perspective of the server. Specifically, online service data refers to the service data received by the server within a preset time period that ends at the current time. Historical service data refers to the service data received by the server before that preset time period.

2) Trigger point to be tested, update trigger point, and model change point

The trigger point to be tested, the update trigger point, and the model change point are concepts in the time domain, that is, one-dimensional concepts. For example, each of them can be represented by t. The trigger point to be tested t1 indicates that the moment t1 is used as a trigger point to be tested; similarly, the update trigger point t2 indicates that the moment t2 is used as an update trigger point.

A trigger point to be tested refers to a trigger point (a point in the time domain, that is, a moment) that is set according to a certain rule and is used to cause the server to determine whether the model needs to be updated. It should be noted that the server may periodically or continuously receive online service data sent by one or more service clients connected to the server, and the server may determine whether to update the model at specific moments; these specific moments are the trigger points to be tested. The embodiments of the present invention do not limit how the trigger points to be tested are determined. In theory, the server can use any moment as a trigger point to be tested. In actual implementation, the server may determine the trigger points to be tested in, but not limited to, the following two manners:

Manner 1: The server may use, as trigger points to be tested, the moments that are an integer multiple of a preset duration after the moment at which it starts receiving online service data. For example, if the preset duration is T and the moment at which the server starts receiving online service data is t0, the server may use the moments t0+nT as trigger points to be tested, where T is greater than 0 and n may be any integer greater than or equal to 0. The specific value of T is not limited in the embodiments of the present invention.

Manner 2: The server may use, as trigger points to be tested, the moments at which the amount of online service data received since it started receiving online service data reaches an integer multiple of a preset data amount. For example, if the preset data amount is R and the moment at which the server starts receiving online service data is t0, then, counting from t0, the server may use each moment at which a further R items of online service data have been received as a trigger point to be tested.
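As a non-limiting illustration, the two manners above can be sketched in Python as follows. The function names, the `count` parameter and the representation of arrivals as a list of timestamps are assumptions made for this sketch only and are not part of the embodiments.

```python
from typing import Iterable, Iterator, List


def time_trigger_points(t0: float, T: float, count: int) -> List[float]:
    """Manner 1: the moments t0 + n*T for n = 0, 1, ..., count - 1."""
    if T <= 0:
        raise ValueError("the preset duration T must be greater than 0")
    return [t0 + n * T for n in range(count)]


def volume_trigger_points(arrival_times: Iterable[float], R: int) -> Iterator[float]:
    """Manner 2: the arrival moment of every R-th received item of data.

    arrival_times is a hypothetical stream of receive timestamps, in order.
    """
    if R <= 0:
        raise ValueError("the preset data amount R must be greater than 0")
    for received, t in enumerate(arrival_times, start=1):
        if received % R == 0:
            yield t
```

In this sketch, Manner 1 depends only on elapsed time, whereas Manner 2 depends on how quickly data arrives, so its trigger points are unevenly spaced when traffic fluctuates.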

An update trigger point can be understood as an actual trigger point or an effective trigger point, that is, a trigger point at which a model update is performed. A trigger point to be tested may or may not be an update trigger point. In the prior art, each trigger point to be tested determined according to Manner 1 or Manner 2 above is used as an update trigger point. In the embodiments of the present invention, whether a trigger point to be tested is an update trigger point is determined according to a certain rule. For specific examples, see below.

A model change point is used in the process of determining representative slices. It refers to an update trigger point for which the magnitude of change between the model before and the model after the triggered model update is greater than or equal to a preset threshold. The update trigger point here may be an update trigger point in the prior art, or may be an update trigger point provided by the embodiments of the present invention. For a specific description, see below.

3) Data points and feature points

Both data points and feature points are concepts in the time domain and data feature domains, that is, two-dimensional concepts. For example, the data point can be expressed as (t, v), where t represents the time at which the data point is located, and v represents the data characteristic of the service data corresponding to the data point. Feature points are special data points. Specific instructions can be found below.

It should be noted that, to describe the technical solutions of the embodiments of the present invention clearly, the words "first" and "second" are used in the embodiments of the present invention to distinguish between identical or similar items whose functions and effects are substantially the same. Those skilled in the art will understand that the words "first", "second" and the like do not limit the quantity or the order of execution. "Multiple" means two or more.

The technical solutions in the embodiments of the present invention are exemplarily described below with reference to the accompanying drawings in the embodiments of the present invention.

FIG. 3 is a schematic flowchart diagram of a method for updating a model according to an embodiment of the present invention. The execution body of the method shown in FIG. 3 may be a server, and the method may include the following steps:

S301. Acquire first online service data received in a window where the trigger point to be tested is located.

As can be understood, the server can periodically or continuously receive the online service data sent by the one or more service clients connected to the server. In the subsequent steps of the embodiments of the present invention, the server updates the model based on the data features of the online service data received in a window. Specifically, S301 may include: the server obtains the online service data that is received, within the window where the trigger point to be tested is located, from the one or more service clients connected to the server, and uses that online service data as the first online service data.

The trigger point to be tested in S301 can be any trigger point to be tested. The window can be a time window or a data volume window. A time window refers to a time period; a time period whose length approaches zero is a moment. A data volume window refers to a fixed amount of data. The size of the window where the trigger point to be tested is located is not limited in the embodiments of the present invention.

It should be noted that, in actual implementation, the server may receive online service data in each window, and may not receive online service data in some windows. For example, during peak business hours, the server may receive online business data in each window for a period of time; during low peak periods, the server may not receive online business data in certain windows.

S302. Construct a first feature sequence according to data characteristics of the first online service data.

Before S302, the method may further include: acquiring the data features of the first online service data. The embodiments of the present invention do not limit the specific content, the quantity, or the manner of acquisition of the data features of the online service data, which may be determined according to factors such as the service data itself and actual requirements. For example, taking IPTV online video whose video type is animation as an example, the data features of the first online service data may include, but are not limited to: the number of people watching animation within the receiving window of the first online service data (that is, the window where the trigger point to be tested is located), the average playing duration of animation within that receiving window, and the like. Specifically, if the service data of the IPTV online video service received by the server includes session ID, user account, start play time of the video, end play time of the video, play type, video type, video ID, and the user's operation record for the video, then the server can obtain the number of people watching animation within the receiving window of the first online service data by counting the distinct user accounts whose video type is animation within that window. The server can obtain the average playing duration of animation within the receiving window by averaging, over the distinct users watching animation in that window, the difference between the end play time and the start play time of the video.
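As a non-limiting illustration, the two example data features above could be computed for one window as sketched below. The `PlayRecord` type, its field names and the literal value "animation" are hypothetical, and the sketch averages the playing duration over play records rather than per user, which is only one possible reading of the description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PlayRecord:
    """Hypothetical subset of the fields of one item of service data."""
    user_account: str
    video_type: str
    start_time: float  # start play time, in seconds
    end_time: float    # end play time, in seconds


def animation_features(records: List[PlayRecord]) -> Tuple[int, float]:
    """Return (number of viewers, average playing duration) for one window.

    Only records whose video type is "animation" are considered.
    """
    animation = [r for r in records if r.video_type == "animation"]
    if not animation:
        return 0, 0.0
    # Count each user account once, however many videos it played.
    viewers = {r.user_account for r in animation}
    # Assumption: the average is taken over play records, not per user.
    average = sum(r.end_time - r.start_time for r in animation) / len(animation)
    return len(viewers), average
```

The pair returned for a window corresponds to the data features (v1, v2) of that window.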

It should be noted that, for brevity of description, both a data feature and the feature value of a data feature are referred to herein as a data feature. Those of ordinary skill in the art will appreciate that the data features described herein should be understood in some scenarios as feature values of data features. For example, the above "acquiring the data features of the first online service data" should be understood as: acquiring the feature values of the data features of the first online service data. The same applies to the related descriptions below, which are not explained one by one.

Optionally, a vector may be used to represent the first feature sequence. In this case, the elements in the first feature sequence are obtained according to the data features of the online service data acquired in one or more windows. The description below takes as an example the case in which the first feature sequence is represented by a vector.

S303. Determine an association relationship between the first feature sequence and the at least one representative slice; the representative slice is a slice of the feature sequence constructed according to the data feature of the historical service data.

The at least one representative slice includes one or more representative slices. A representative slice may be determined by a service expert, or may be generated by the server according to a certain method; a representative slice may be pre-stored in the server, or may be generated by the server before S303 is performed. A representative slice may be represented by a vector, although the specific implementation is not limited to this. The association relationship between the first feature sequence and the at least one representative slice may be a similarity, a distance, or the like between the two.

Specifically, if the first feature sequence and the representative slices are represented by vectors, S303 may include: obtaining, from the at least one representative slice, a representative slice whose number of elements is equal to the number of elements in the first feature sequence, and determining the association relationship between the first feature sequence and that representative slice. For specific examples, see below.

S304. If the relationship between the first feature sequence and the at least one representative slice meets a preset condition, update the current model.

If the at least one representative slice includes a plurality of representative slices, the association relationship between the first feature sequence and the at least one representative slice satisfying the preset condition may include: the association relationship between the first feature sequence and at least one of the plurality of representative slices satisfies the preset condition. The preset condition may be predetermined according to one or more factors such as the specific representation of the association relationship (such as distance or similarity), actual requirements and experience.

Optionally, if the at least one representative slice includes a plurality of representative slices, S303-S304 may include: the server determines the association relationship between the first feature sequence and one of the plurality of representative slices; when that association relationship does not satisfy the preset condition, the server determines the association relationship between the first feature sequence and another of the plurality of representative slices, and so on, until the association relationship between the first feature sequence and one of the plurality of representative slices satisfies the preset condition, in which case the association relationship between the first feature sequence and the at least one representative slice is considered to satisfy the preset condition.
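As a non-limiting illustration, the slice-by-slice check described for S303-S304 can be sketched as follows. The sketch assumes that the association relationship is a Euclidean distance and that the preset condition is "distance less than or equal to a threshold"; both are assumptions chosen for illustration, since the embodiments only state that the relationship may be a similarity or a distance. The function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two vectors with the same number of elements."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_matching_slice(
    feature_seq: Sequence[float],
    representative_slices: List[Sequence[float]],
    max_distance: float,
) -> Optional[int]:
    """Return the index of the first representative slice that matches, else None.

    Only slices with as many elements as the feature sequence are compared,
    and the search stops at the first slice that meets the assumed condition.
    """
    for index, rep in enumerate(representative_slices):
        if len(rep) != len(feature_seq):
            continue  # element counts differ, so this slice is not compared
        if euclidean_distance(feature_seq, rep) <= max_distance:
            return index
    return None
```

A return value other than `None` corresponds to the case in which the current model is updated in S304; `None` corresponds to the case in which no representative slice satisfies the condition.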

In the model updating method provided by the embodiments of the present invention, the model is updated when the association relationship between a feature sequence constructed according to the data features of online service data and a representative slice of a feature sequence constructed according to the data features of historical service data satisfies a preset condition. The technical solution combines the data features of the online service data, the data features of the historical service data, the association relationship between the feature sequences constructed from the two, and the preset condition to determine whether a trigger point to be tested is an update trigger point. In the prior art, update trigger points are set at a fixed duration or a fixed data amount, so the data features of the data newly added between two adjacent update trigger points may differ only slightly from those of the earlier data, and the later of the two model updates triggered by those trigger points then has little or no significance. The present solution reduces such cases and thereby saves resources.

Optionally, as shown in FIG. 3a (FIG. 3a is drawn based on FIG. 3), after S303, the method may further include:

S305: If the association relationship between the first feature sequence and the at least one representative slice does not meet the preset condition, obtain the second online service data received in the window where the subsequent trigger point of the trigger point to be tested is located.

If the at least one representative slice includes a plurality of representative slices, the association relationship between the first feature sequence and the at least one representative slice not satisfying the preset condition may include: the association relationship between the first feature sequence and each of the plurality of representative slices does not satisfy the preset condition.

S306: Construct a second feature sequence, in order of receiving time, according to the data features of the first online service data and the data features of the second online service data.

S307: Determine an association relationship between the second feature sequence and the at least one representative slice.

S308: Update the current model if the relationship between the second feature sequence and the at least one representative slice satisfies a preset condition.

For example, the specific implementation manners of S307 to S308 may refer to the specific implementation manners of S303 to S304 in the foregoing, and details are not described herein again.

Optionally, S305-S308 may include the following. The trigger point to be tested is denoted as the i-th trigger point to be tested, and the next trigger point to be tested is denoted as the (i+1)-th trigger point to be tested. If the association relationship between the first feature sequence and the at least one representative slice does not satisfy the preset condition, the online service data received in the window where the (i+1)-th trigger point to be tested is located is obtained. A feature sequence is then constructed according to the data features of the online service data received in the window where the i-th trigger point to be tested is located (that is, the first online service data) and the data features of the online service data received in the window where the (i+1)-th trigger point to be tested is located, and the association relationship between this feature sequence and the at least one representative slice is determined. If the association relationship satisfies the preset condition, the current model is updated. If it does not, the online service data received in the window where the (i+2)-th trigger point to be tested is located is obtained; a feature sequence is then constructed according to the data features of the online service data received in the windows where the i-th, (i+1)-th and (i+2)-th trigger points to be tested are located, and the association relationship between this feature sequence and the at least one representative slice is determined. If the association relationship satisfies the preset condition, the current model is updated. If it does not, the online service data received in the window where the (i+3)-th trigger point to be tested is located is obtained, and so on, until the current model is updated.

If the feature sequences (including the first feature sequence and the second feature sequence) are represented by vectors, then, as described below, each new feature sequence obtained by the server may be formed by appending, after the elements of the previous feature sequence, the elements obtained from the data features of the newly acquired online service data. For example, relative to S302 above, S306 can be understood as: appending, after the elements of the first feature sequence, an element obtained according to the data features of the second online service data, to obtain the second feature sequence.
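As a non-limiting illustration, the repeated extension of the feature sequence in S305-S308 can be sketched as the loop below. The name `detect_update_trigger`, the use of one scalar element per window, and the `matches` callback (which stands in for the determination of the association relationship and the preset condition) are assumptions made for this sketch.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def detect_update_trigger(
    window_features: Iterable[float],
    matches: Callable[[List[float]], bool],
) -> Optional[Tuple[int, List[float]]]:
    """Extend the feature sequence window by window until it matches.

    window_features yields one element per window, in order of receiving time.
    Returns (number of windows consumed, feature sequence) for the first
    sequence accepted by matches, or None if the input ends without a match.
    """
    sequence: List[float] = []
    for consumed, feature in enumerate(window_features, start=1):
        # Append the new element after the elements of the previous sequence.
        sequence.append(feature)
        if matches(sequence):
            return consumed, sequence
    return None
```

In this sketch the caller would update the current model when a tuple is returned, and would then either discard the returned sequence or keep it as part of the historical feature sequence.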

It should be noted that, in a specific implementation, the server may, after each model update, delete the feature sequence used in that update, or may keep the feature sequence last used in that update as part of the subsequent historical feature sequence.

Optionally, the method may further include: determining the size of the window where the trigger point to be tested in S301 is located. Specifically, assume that the trigger point to be tested in S301 is the i-th trigger point to be tested, where i≥1 and i is an integer; then:

Implementation 1: If i=1, the window where the trigger point to be tested is located may be the window between the moment at which the server starts receiving online service data and the trigger point to be tested; if i≥2, the window where the trigger point to be tested is located may be the window between the (i-1)-th trigger point to be tested and the trigger point to be tested.

Implementation 2: If i=1, the window where the trigger point to be tested is located may be 1/N of the window between the moment at which the server starts receiving online service data and the trigger point to be tested; if i≥2, the window where the trigger point to be tested is located may be 1/N of the window between the (i-1)-th trigger point to be tested and the trigger point to be tested; where N≥2, N is an integer, and 1/N denotes one N-th.

If the window is a time window, the window between two moments refers to the time period between the two moments. For example, if the time window is 10 min, the window between two adjacent trigger points to be tested means that the length of the time period between the two adjacent trigger points to be tested is 10 min.

If the window is a data volume window, the window between two moments refers to the interval in which the server receives a fixed amount of online service data, where the amount of online service data may be measured as the traffic or the number of items of the online service data, etc. For example, if the data volume window is 10M (megabytes), the window between two adjacent trigger points to be tested means that the traffic of the online service data received by the server between the two adjacent trigger points to be tested is 10M. If the data volume window is 500 items, the window between two adjacent trigger points to be tested means that the number of items of online service data received by the server between the two adjacent trigger points to be tested is 500.

FIG. 4 is a schematic diagram of a relationship between windows (specifically, time windows) and trigger points to be tested. FIG. 4 is based on Implementation 1 above, and takes as an example the case in which two trigger points to be tested, trigger point 1 to be tested and trigger point 2 to be tested, are included in the period from the moment at which the server starts receiving online service data to the current time. In FIG. 4, the window where trigger point 1 to be tested is located is window 1, and the window where trigger point 2 to be tested is located is window 2.

FIG. 5 is a schematic diagram of a relationship between windows (specifically, time windows) and trigger points to be tested. FIG. 5 is based on Implementation 2 above with N=3, and takes as an example the case in which two trigger points to be tested, trigger point 1 to be tested and trigger point 2 to be tested, are included in the period from the moment at which the server starts receiving online service data to the current time. In FIG. 5, the window where trigger point 1 to be tested is located is window 3, and the window where trigger point 2 to be tested is located is window 6.

A subsequent step determines the data features of the online service data according to the online service data received in a window, and the online service data received in one window yields one set of data features (including one or more data features). Therefore, this optional implementation ensures that the server can obtain a set of data features at any trigger point to be tested in the subsequent steps, so that it can be determined whether each trigger point to be tested is an update trigger point. In addition, Implementation 1 above ensures that one set of data features is obtained before the first trigger point to be tested, or between any two adjacent trigger points to be tested; Implementation 2 above ensures that multiple sets of data features are obtained before the first trigger point to be tested, or between any two adjacent trigger points to be tested. For the description of the data features, refer to the description of S302.

Based on the above implementation 2:

If i=1, S302 may include: constructing the first feature sequence according to the data features of the online service data received in the window where the trigger point to be tested is located, and the data features of the online service data received in at least one other window among the windows between the moment at which the server starts receiving online service data and the trigger point to be tested. Optionally, the at least one window refers to every such window. For example, based on FIG. 5, if the trigger point to be tested is trigger point 1 to be tested, S302 may include: constructing the first feature sequence according to the data features of the online service data received in window 1, the data features of the online service data received in window 2, and the data features of the online service data received in window 3.

If i≥2, S302 may include: constructing the first feature sequence according to the data features of the online service data received in the window where the trigger point to be tested is located, and the data features of the online service data received in at least one other window among the windows between the (i-1)-th trigger point to be tested and the i-th trigger point to be tested. Optionally, the at least one window refers to every such window. For example, based on FIG. 5, if the trigger point to be tested is trigger point 2 to be tested, S302 may include: constructing the first feature sequence according to the data features of the online service data received in window 4, the data features of the online service data received in window 5, and the data features of the online service data received in window 6.
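As a non-limiting illustration, the windows that contribute to the first feature sequence at the i-th trigger point to be tested can be enumerated as sketched below. The sketch assumes that windows are numbered consecutively from 1, as in the FIG. 4 and FIG. 5 examples, and that every window between the two trigger points is used; the function name is hypothetical.

```python
from typing import List


def windows_for_trigger(i: int, N: int) -> List[int]:
    """Serial numbers of the windows used at the i-th trigger point to be tested.

    N is the number of windows between two adjacent trigger points to be
    tested: N = 1 corresponds to Implementation 1, N >= 2 to Implementation 2.
    The last window in the returned list is the one where the trigger point
    itself is located.
    """
    if i < 1 or N < 1:
        raise ValueError("i and N must both be at least 1")
    first = (i - 1) * N + 1
    return list(range(first, i * N + 1))
```

With N=3 this reproduces the FIG. 5 example: windows 1 to 3 for trigger point 1 to be tested and windows 4 to 6 for trigger point 2 to be tested.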

Optionally, as shown in FIG. 6 (FIG. 6 is drawn based on FIG. 3), S302 may include:

S302.1: Construct a first data sequence according to the data features of the first online service data, where each element in the first data sequence is a data point, and a data point includes at least the following: the moment at which the data point is located, and the data features of the service data corresponding to the data point.

The moment at which a data point is located refers to the end moment of the receiving window of the service data corresponding to the data point, that is, the trigger point to be tested. Optionally, it may be represented by the serial number of the receiving window of the service data corresponding to the data point, although the specific implementation is not limited to this. For example, the data point corresponding to the data feature of the first online service data may be represented as (t, v), where t represents the serial number of the receiving window of the first online service data, and v represents the data feature of the first online service data.

The first data sequence can be understood as a set consisting of one data point, or a set consisting of a plurality of data points in chronological order of the time at which the plurality of data points are located, the set being represented by a vector.

For example, the n-th data point in the first data sequence can be represented as $(t_n, v_n)$, where $t_n$ represents the moment at which the n-th data point in the first data sequence is located, and $v_n$ represents the data feature of the online service data corresponding to the n-th data point in the first data sequence; $1 \le n \le N$, n and N are integers, and N represents the total number of data points in the first data sequence. In this case, the first data sequence can be expressed as $\{(t_1, v_1), (t_2, v_2), \ldots, (t_n, v_n), \ldots, (t_N, v_N)\}$. If the data feature of the online service data is multi-dimensional (that is, the online service data has multiple data features), $v_n$ in the n-th data point $(t_n, v_n)$ may be represented in vector form. For example, the n-th data point can be expressed as $(t_n, v_{n1}, v_{n2}, \ldots, v_{nm}, \ldots, v_{nM})$, where $v_{nm}$ represents the m-th data feature of the online service data corresponding to the n-th data point, $1 \le m \le M$, and M is the number of data features. In this case, the first data sequence can be expressed as $\{(t_1, v_{11}, v_{12}, \ldots, v_{1m}, \ldots, v_{1M}), (t_2, v_{21}, v_{22}, \ldots, v_{2m}, \ldots, v_{2M}), \ldots, (t_n, v_{n1}, v_{n2}, \ldots, v_{nm}, \ldots, v_{nM}), \ldots, (t_N, v_{N1}, v_{N2}, \ldots, v_{Nm}, \ldots, v_{NM})\}$.

Based on the example in S302, the data point corresponding to the data features of the first online service data may be represented as (t, v1, v2), where t represents the serial number of the receiving window of the first online service data, v1 represents the number of people watching the animation in the receiving window of the first online service data, and v2 represents the average playing time of the animation in that receiving window.

S302.2: Generate a first feature sequence from the first data sequence, wherein each element in the first feature sequence includes at least the following features: the time at which the data point is located, and the rate of change between the data point and the previous data point.

Optionally, each element in the first feature sequence may further include the following feature: the time period between the time at which the data point is located and the time at which the previous data point is located. Since this optional feature can be inferred from the time at which the previous data point is located, the elements in the first feature sequence may omit it.

If there are other data sequences before the first data sequence in time order, the previous data point of the first data point in the first data sequence is the last data point in the previous data sequence. It should be noted that, according to the description of the model change point below, the time at which this last data point is located is the model change point closest to the current time. If there is no other data sequence before the first data sequence in time order, the first data point in the first data sequence is actually the second data point counted from the time when the server starts to receive online service data (ie, the starting point), and its previous data point is the first data point counted from that time. This is because the first data point counted from the time when the server starts receiving online service data has no previous data point, so the rate of change between that data point and a previous data point is meaningless, which in turn makes that data point meaningless for the feature sequence. For example, as shown in FIG. 4, the time at which the first data point in the first data sequence is located is the second trigger point to be tested. All of the following descriptions take as an example the case where other data sequences exist before the first data sequence.

For example, the nth element in the first feature sequence can be represented as (t_n, Δ_n, d_n), where t_n represents the time at which the nth data point in the first data sequence is located; Δ_n represents the rate of change between the nth data point in the first data sequence and the previous data point (specifically, the (n-1)th data point in the first data sequence, or the last data point in the data sequence preceding the first data sequence); and d_n represents the time period between the time at which the nth data point in the first data sequence is located and the time at which the previous data point is located. In this case, the first feature sequence can be expressed as TS = {(t_1, Δ_1, d_1), (t_2, Δ_2, d_2), ..., (t_n, Δ_n, d_n), ..., (t_N, Δ_N, d_N)}. If the data feature of the online service data is multi-dimensional, Δ_n in the nth element (t_n, Δ_n, d_n) of TS may be expressed in vector form. For example, the nth element can be expressed as (t_n, Δ_{n1}, Δ_{n2}, ..., Δ_{nm}, ..., Δ_{nM}, d_n), where Δ_{nm} represents, for the mth data feature, the rate of change between the nth data point and the previous data point in the first data sequence. By way of example, based on the example in S302.1, the data point of the first online service data can be expressed as (t, v1, v2); if the mth data feature is the first data feature (m = 1), namely the number of people watching the animation in the receiving window of the online service data, Δ_{nm} represents the slope of the change between (t_n, v_{n1}) and (t_{n-1}, v_{(n-1)1}). In this case, the first feature sequence can be expressed as {(t_1, Δ_{11}, Δ_{12}, ..., Δ_{1m}, ..., Δ_{1M}, d_1), (t_2, Δ_{21}, Δ_{22}, ..., Δ_{2m}, ..., Δ_{2M}, d_2), ..., (t_n, Δ_{n1}, Δ_{n2}, ..., Δ_{nm}, ..., Δ_{nM}, d_n), ..., (t_N, Δ_{N1}, Δ_{N2}, ..., Δ_{Nm}, ..., Δ_{NM}, d_N)}.

It should be noted that if the server receives online service data in each of a plurality of consecutive windows (excluding the first window after the server starts receiving service data), one data point can be obtained from the online service data received in each window. In this case, the time period between the time at which a data point is located and the time at which the previous data point is located is the time period corresponding to one window. In actual implementation, the server may receive no online service data in some windows, and no data point can be obtained for such a window. In this case, the time period between the time at which a data point is located and the time at which the previous data point is located is not the time period corresponding to one window, but may be the time period corresponding to multiple windows.

The rate of change Δ between a data point and the previous data point may be any of the following: the slope between the data point and the previous data point; the normalized value of that slope; the inverse tangent of that slope; the normalized value of the inverse tangent of that slope; or the symbol corresponding to the value of the inverse tangent of that slope. An example of the rate of change between a data point and the previous data point is shown in Table 1:

Table 1

In Table 1 above, the range of the inverse tangent of the slope is divided into seven sub-ranges, that is, the rate of change between a data point and the previous data point is graded into seven levels; the actual implementation is not limited thereto. For example, the rate of change between a data point and the previous data point can be graded into any number of levels.

For example, based on Table 1, the first feature sequence may be {(3, -2, 1), (4, 3, 1), (5, 0, 1), ...}. In the element (4, 3, 1), "4" indicates the time at which the data point corresponding to the element is located, specifically the serial number of the receiving window of the online service data corresponding to the data point; "3" indicates that the rate of change between the data point and the previous data point is a rapid rise (see Table 1); and "1" indicates the time period between the time at which the data point is located and the time at which the previous data point is located, specifically the time period corresponding to 1 window.
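The construction of feature-sequence elements in S302.2 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function name `build_feature_sequence` and its parameters are hypothetical, and the equal-width division of the inverse-tangent range into seven levels numbered -3 to 3 is an assumption, since the boundaries of Table 1 are not reproduced here.

```python
import math


def build_feature_sequence(points, prev_point, levels=7):
    """Turn data points (t, v) into feature elements (t, level, d).

    points     -- data points in chronological order
    prev_point -- the data point that precedes points[0]
    levels     -- number of rate-of-change levels (assumed odd, default 7)
    """
    half = levels // 2
    # Assumption: the range (-pi/2, pi/2) is split into equal-width bins.
    bin_width = math.pi / levels
    features = []
    for t, v in points:
        t_prev, v_prev = prev_point
        d = t - t_prev                      # time period since previous point
        slope = (v - v_prev) / d            # slope between the two points
        angle = math.atan(slope)            # inverse tangent of the slope
        level = math.floor((angle + math.pi / 2) / bin_width) - half
        level = max(-half, min(half, level))  # clamp to the valid levels
        features.append((t, level, d))
        prev_point = (t, v)
    return features


if __name__ == "__main__":
    print(build_feature_sequence([(3, 0.0), (4, 10.0), (5, 10.0)], (2, 5.0)))
```

Because d is computed from the two window serial numbers, a window in which no data was received simply shows up as a larger d in the next element.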

Further, as shown in FIG. 6a (FIG. 6a is drawn based on FIG. 3 and FIG. 6), after S302.1, the method may further include:

S302.1a: extract feature points in the first data sequence, and construct a second data sequence according to the feature points in the first data sequence.

Physically, feature points are local extreme points on the curve (eg, peak points, valley points), inflection points, and so on. For the embodiment of the present invention, the feature points in the first data sequence may be feature points on the curve formed by each data point in the first data sequence. The relationship between the data points and the feature points is: the feature points must be data points, but the data points are not necessarily feature points.

Optionally, for the data feature of any one dimension, the server may determine whether the nth data point (t_n, v_n) is a feature point based on its relationship with the (n-1)th data point (t_{n-1}, v_{n-1}) and the (n+1)th data point (t_{n+1}, v_{n+1}). Specifically, the relationship can be expressed by the following formula:

Where Thre1 is a constant greater than or equal to 0.

It should be noted that, if the data feature of the online service data is in multiple dimensions, the nth data point may be used as a feature point as long as the data feature of at least one dimension satisfies the above formula.

Further optionally, the time interval between the data point and the previous feature point is greater than or equal to Thre2, where Thre2 is a constant greater than or equal to 0. This further optional implementation avoids the following problem: when the value of the data feature changes abruptly between two adjacent data points, both data points are taken as feature points, which lowers the accuracy of the acquired feature points and ultimately lowers the accuracy of the model update. For example, the value of the data feature may suddenly become larger because the server repeatedly receives online service data in the later of two adjacent windows; or the value may suddenly become smaller because, in the later of two adjacent windows, a network connection error causes the server to receive erroneous online service data or no online service data at all. That is, this further optional implementation reduces the effect that an abrupt change in the data feature of the online service data received in the later of two adjacent windows has on the accuracy of the acquired feature points.

In a specific implementation, if Thre1 is 0 and Thre2 is less than or equal to the time period corresponding to the minimum window, the second data sequence is the same as the first data sequence.

For example, FIG. 7 is a schematic diagram for determining feature points. In FIG. 7, the abscissa represents t and the ordinate represents v. The three data points acquired by the server that are consecutive in time order are data point A(t_{n-1}, v_{n-1}), data point B(t_n, v_n) and data point C(t_{n+1}, v_{n+1}), where data point A indicates that the number of people watching the cartoon in time window t_{n-1} is v_{n-1}, data point B indicates that the number of people watching the cartoon in time window t_n is v_n, and data point C indicates that the number of people watching the cartoon in time window t_{n+1} is v_{n+1}. The previous feature point of data point B is (t_1, v_1); in this example, n is an integer greater than or equal to 2. Then, according to the above-described Condition 1 and Condition 2, data point B is determined to be a feature point if both of the following hold: the ordinate of data point B deviates from the ordinate of point B' (the point on the straight line AC at time t_n, which is a point in the mathematical sense only) by an amount greater than or equal to Thre1; and the time period between the time t_n at which data point B is located and the time t_1 at which the previous feature point is located is greater than or equal to Thre2.
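The feature-point test described for FIG. 7 can be sketched as follows. This is a minimal illustration for a one-dimensional data feature; the function name `is_feature_point` and its argument layout are hypothetical, and the deviation is computed as the vertical distance from B to the straight line AC, which is how the text describes Condition 1.

```python
def is_feature_point(a, b, c, prev_feature_t, thre1=0.0, thre2=0.0):
    """Decide whether data point b is a feature point.

    a, b, c        -- consecutive data points (t, v): previous, candidate, next
    prev_feature_t -- time at which the previous feature point is located
    thre1, thre2   -- constants >= 0 (Thre1 and Thre2 in the text)
    """
    (t_a, v_a), (t_b, v_b), (t_c, v_c) = a, b, c
    # Ordinate of B', the point on the straight line AC at time t_b.
    v_on_line = v_a + (v_c - v_a) * (t_b - t_a) / (t_c - t_a)
    # Condition 1: B deviates from the line AC by at least Thre1.
    cond1 = abs(v_b - v_on_line) >= thre1
    # Condition 2: B is at least Thre2 after the previous feature point.
    cond2 = (t_b - prev_feature_t) >= thre2
    return cond1 and cond2


if __name__ == "__main__":
    # A peak at t=2 that is far enough from the previous feature point.
    print(is_feature_point((1, 10), (2, 30), (3, 10), prev_feature_t=0,
                           thre1=5, thre2=1))
```

For multi-dimensional data features, the same check could be applied per dimension and the point accepted if any dimension passes, as the text states.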

It should be noted that, in actual implementation, suppose the current trigger point to be tested is trigger point 1 to be tested, and the data point obtained from the data feature of the online service data received in the window where trigger point 1 to be tested is located is data point B. That is, data point B is the newly added data point in the process of determining whether trigger point 1 to be tested is an update trigger point; in that process, data point B is directly used as feature point B. Suppose next that the current trigger point to be tested is the next trigger point to be tested after trigger point 1 (ie, trigger point 2 to be tested), and data point C is the newly added data point in the process of determining whether trigger point 2 to be tested is an update trigger point; in that process, whether data point B is a feature point is determined according to the method shown in FIG. 7. In addition, data point C is directly used as a feature point in that process; if trigger point 2 to be tested is determined not to be an update trigger point, the server goes on to determine whether a subsequent trigger point to be tested is the next update trigger point.

Based on the optional implementation including S302.1a, S302.2 in FIG. 6 may include the following S302.2', as shown in FIG. 6a:

S302.2': Generate a first feature sequence from the second data sequence, wherein each element in the first feature sequence includes the time at which the feature point is located, the rate of change between the feature point and the previous feature point, and the time period between the time at which the feature point is located and the time at which the previous feature point is located.

For the specific implementation of the step S302.2', refer to the specific implementation manner of the foregoing S302.2, and details are not described herein again.

For example, based on Table 1, the first feature sequence may be {(5, -2, 5), (14, -1, 9), ...}. In the element (14, -1, 9), "14" indicates the time at which the feature point corresponding to the element is located, specifically the serial number of the receiving window of the online service data corresponding to the feature point; "-1" indicates that the rate of change between the feature point and the previous feature point is a slow decrease (see Table 1); and "9" indicates the time period between the time at which the feature point is located and the time at which the previous feature point is located, specifically the time period corresponding to 9 windows.

It should be noted that, in actual implementation, the first data sequence may include many data points. If the first feature sequence is generated directly from the first data sequence, the first feature sequence will contain many elements, which makes the amount of computation large when determining the association relationship between the first feature sequence and the at least one representative slice. In this optional implementation, the second data sequence is obtained by extracting the feature points in the first data sequence, and the first feature sequence is generated from the second data sequence. The first feature sequence generated in this way contains fewer elements than one generated directly from the first data sequence, so the amount of computation in determining the association relationship between the first feature sequence and the at least one representative slice is reduced, which speeds up processing. In addition, since the feature points are special data points in the first data sequence (referred to as representative data points), the association relationship between the at least one representative slice and the first feature sequence generated from the second data sequence does not differ greatly from the association relationship between the at least one representative slice and a first feature sequence generated from the first data sequence.

Optionally, the first feature sequence and the representative slice are represented by vectors; in this case, S303 may include: determining a distance between the first feature sequence and the at least one representative slice. S304 may include: updating the current model if the distance is less than or equal to a first preset threshold.

In effect, a representative slice is a slice of a feature sequence constructed from the data features of historical service data; it can therefore be represented in the same manner as the first feature sequence described above. For specific examples, refer to the above. It should be noted that the data features of the online service data used to determine the first feature sequence are the same as the data features of the historical service data. For example, the data features of the online service data and of the historical service data are both: the number of people watching the animation in the receiving window and the average playing time of the animation in the receiving window.

The distance between the first feature sequence and the representative slice can be regarded as the distance between two vectors. In a specific implementation, the distance between two vectors can be determined in any way. In addition, the first feature sequence and the representative slice can both be regarded as slices. An optional implementation for determining the distance between two slices is provided below. It should be noted that the two slices whose distance is calculated have the same number of elements:

The first feature sequence is represented as Slice_p and the representative slice as Slice_q. The distance between Slice_p and Slice_q is determined by the following formula:

Where D(Slice_p, Slice_q) represents the distance between Slice_p and Slice_q; I represents the number of data points (optionally, feature points) in the first feature sequence, and I is an integer greater than or equal to 1; D_m(Slice_{pi}, Slice_{qi}) represents the mode distance between the i-th data feature of the online service data corresponding to Slice_p and the i-th data feature of the historical service data corresponding to Slice_q; and D_d(Slice_{pi}, Slice_{qi}) represents the temporal distance between the i-th data feature of the online service data corresponding to Slice_p and the i-th data feature of the historical service data corresponding to Slice_q. Specifically:

D_m(Slice_{pi}, Slice_{qi}) = |Δ_{pi} - Δ_{qi}|;

D_d(Slice_{pi}, Slice_{qi}) = |R_{pi} - R_{qi}|;

Where Δ_{pi} represents the rate of change between the i-th data point and the previous data point in Slice_p, and Δ_{qi} represents the rate of change between the i-th data point and the previous data point in Slice_q;

R_{pi} represents the proportion of the total time period of Slice_p that is taken up by d_{pi}, the time period between the i-th data point in Slice_p and the previous data point; t_last represents the time at which the last data point of Slice_p is located; t_first represents the time at which the first data point of Slice_p is located; and d_first represents the time period between the first data point of Slice_p and the last data point in the previous slice (this time period is stored in the first element of Slice_p).
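The slice distance defined above can be sketched as follows. This is a minimal illustration under two stated assumptions, because the corresponding display formulas are not reproduced in this text: the total distance is taken as the sum over i of D_m(Slice_{pi}, Slice_{qi}) + D_d(Slice_{pi}, Slice_{qi}), and the total time period of a slice is taken as t_last - t_first + d_first. Both assumptions agree with the worked values in Table 2 (for example, 2/2 and 3/3). The function names are hypothetical.

```python
def time_proportions(slice_):
    """Return R_i = d_i / (t_last - t_first + d_first) for each element.

    Each element is (t, delta, d). The denominator is an assumed
    reading of "total period of the slice".
    """
    t_first, _, d_first = slice_[0]
    t_last = slice_[-1][0]
    total = t_last - t_first + d_first
    return [d / total for _, _, d in slice_]


def slice_distance(slice_p, slice_q):
    """Mode distance plus temporal distance, summed over all elements."""
    if len(slice_p) != len(slice_q):
        raise ValueError("slices must have the same number of elements")
    # Mode distance: absolute difference of the rates of change.
    d_m = sum(abs(p[1] - q[1]) for p, q in zip(slice_p, slice_q))
    # Temporal distance: absolute difference of the time proportions.
    r_p, r_q = time_proportions(slice_p), time_proportions(slice_q)
    d_t = sum(abs(a - b) for a, b in zip(r_p, r_q))
    return d_m + d_t


if __name__ == "__main__":
    print(slice_distance([(7, 3, 2)], [(10, 3, 3)]))
```

Normalizing each d by the slice's total period means that two slices with the same shape but different durations still compare as close in the temporal term.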

Optionally, the first feature sequence and the representative slice are represented by vectors; in this case, S303 may include: determining a similarity between the first feature sequence and the at least one representative slice. S304 may include: updating the current model if the similarity is greater than or equal to a second preset threshold.

The similarity between the first feature sequence and the representative slice can be regarded as the similarity between two vectors. In a specific implementation, the similarity between two vectors can be determined in any way. In addition, the first feature sequence and the representative slice can both be regarded as slices. An optional implementation for determining the similarity between two slices is provided below. It should be noted that the two slices whose similarity is calculated have the same number of elements:

The first feature sequence is represented as Slice_p and the representative slice as Slice_q. The similarity between Slice_p and Slice_q is determined by the following formula:

D(Slice_p, Slice_q) = D_m(Slice_p, Slice_q) + D_t(Slice_p, Slice_q);

Where D(Slice_p, Slice_q) represents the similarity between Slice_p and Slice_q; D_m(Slice_{pi}, Slice_{qi}) represents the mode distance between the i-th data feature of the online service data corresponding to Slice_p and the i-th data feature of the historical service data corresponding to Slice_q; and D_d(Slice_{pi}, Slice_{qi}) represents the temporal distance between the i-th data feature of the online service data corresponding to Slice_p and the i-th data feature of the historical service data corresponding to Slice_q. Specifically:

Where I represents the number of data points (optionally feature points) in the first feature sequence, and I is an integer greater than or equal to 1.

FIG. 8 is a schematic flowchart diagram of a method for acquiring a representative slice according to an embodiment of the present invention. The method shown in Figure 8 can include:

S801: Acquire historical service data, and construct a historical feature sequence according to the data features of the historical service data.

The historical service data refers to any part of historical business data or all historical business data relative to the current time. For a specific implementation manner of constructing a historical feature sequence according to the data feature of the historical service data, reference may be made to the specific implementation manner of constructing the first feature sequence according to the data feature of the online service data, and details are not described herein again.

Take as an example the case where the IPTV online video is an animation and the data feature of the online service data is the number of people watching the animation. If the window is a time window of, for example, half an hour, S801 may include:

S1: The server counts the number of people watching the animation under each window in a period of time, and obtains the data sequence 1.

Data sequence 1 is similar to the first data sequence described above. A schematic diagram of a curve drawn according to data sequence 1 is shown in FIG. 9. In FIG. 9, the abscissa indicates the window serial number and the ordinate indicates the number of people watching the animation; FIG. 9 shows the number of people watching the animation obtained in several windows.

S2: The server extracts the feature points in data sequence 1 and constructs data sequence 2 from the extracted feature points.

Data sequence 2 is similar to the second data sequence described above. The extracted feature points are denoted feature points A to P, as shown in FIG. 10 (FIG. 10 is drawn based on FIG. 9).

S3: Construct a historical feature sequence according to data sequence 2.

Based on the example shown in FIG. 10, the historical feature sequence obtained in S3 may be HTS = {(5, -2, 5), (14, 3, 9), ...}, where (5, -2, 5) corresponds to feature point B and (14, 3, 9) corresponds to feature point C.

S802: Determine a model change point in the historical feature sequence.

A model change point is an update trigger point for which the magnitude of change between the two models before and after the model update that it triggers is greater than or equal to a preset threshold. The update trigger point may be one determined according to a method for determining an update trigger point provided in the prior art, or one determined by any method for determining an update trigger point provided by the embodiments of the present invention. The model change point is explained below through a specific example:

Assume that at time t0 the model in the server is model 1, and that the update trigger points arranged in chronological order are update trigger points 1 and 2. Then, at update trigger point 1, the current model (ie, model 1) is updated to obtain model 2; at update trigger point 2, the current model (ie, model 2) is updated to obtain model 3, as shown in the corresponding figure. For update trigger point 1, if the magnitude of change between the two models before and after the model update triggered by update trigger point 1 (ie, model 1 and model 2) is greater than or equal to the preset threshold, update trigger point 1 is taken as a model change point. For update trigger point 2, if the magnitude of change between the two models before and after the model update triggered by update trigger point 2 (ie, model 2 and model 3) is greater than or equal to the preset threshold, update trigger point 2 is taken as a model change point.

The embodiment of the present invention does not limit how the magnitude of change between two models is determined; any method in the prior art may be used. Optionally, the magnitude of change between two models can be determined in either of the following ways:

Mode 1: Taking the model in the embodiment of the present invention as a logistic regression model as an example, the Euclidean distance between the vectors formed by the parameters of the two models may be used as the magnitude of change between the two models.

Mode 2: Taking the model in the embodiment of the present invention as a naive Bayesian model as an example, the Euclidean distance between the vectors formed by the prior probabilities of the two models may be used as the magnitude of change between the two models.
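Both modes reduce to a Euclidean distance between two vectors followed by a threshold test, which can be sketched as follows. This is a minimal illustration; the function names and the representation of each update as a tuple (time, vector before, vector after) are hypothetical.

```python
import math


def model_change_magnitude(vector_before, vector_after):
    """Euclidean distance between two model vectors.

    For Mode 1 the vectors hold logistic regression parameters;
    for Mode 2 they hold naive Bayes prior probabilities.
    """
    return math.dist(vector_before, vector_after)


def model_change_points(updates, threshold):
    """Keep the update trigger points whose model change is large enough.

    updates   -- list of (time, vector_before, vector_after), one entry
                 per update trigger point
    threshold -- preset threshold for the magnitude of change
    """
    return [t for t, before, after in updates
            if model_change_magnitude(before, after) >= threshold]


if __name__ == "__main__":
    updates = [(1, [0.0, 0.0], [3.0, 4.0]), (2, [1.0, 1.0], [1.0, 1.1])]
    print(model_change_points(updates, threshold=1.0))
```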

It should be noted that, taking as an example the case where the update trigger point in this optional implementation is determined by any method for determining an update trigger point provided by the embodiments of the present invention, the relationship among the trigger point to be tested, the update trigger point and the model change point is as follows. First, the trigger point to be tested, the update trigger point and the model change point are all concepts of time. Second, a trigger point to be tested may or may not be an update trigger point, whereas an update trigger point must be a trigger point to be tested; an update trigger point may or may not be a model change point, whereas a model change point must be an update trigger point. In addition, the interval between adjacent update trigger points is an integer multiple of the interval between adjacent trigger points to be tested; the interval between adjacent model change points is also an integer multiple of the interval between adjacent trigger points to be tested; and the interval between adjacent update trigger points is not directly related to the interval between adjacent model change points.

In addition, the relationship among the time at which a data point is located, the time at which a feature point is located, and the model change point is as follows: the time at which a data point is located may or may not be a model change point, whereas a model change point must be the time at which some data point is located; there is no direct relationship between the time at which a feature point is located and the model change point.

Based on the above example of S1 to S3, assume that the model change points determined in S802 are as shown in FIG. 12, where the abscissa of each small dot represents a model change point, and some of the model change points coincide with the times at which feature points are located. It should be noted that FIG. 12 is drawn based on FIG. 10. In actual implementation, the server determines the model change points based on the update trigger points; here, in order to show clearly that the time at which a feature point is located is independent of the model change points, FIG. 11 and the determined model change points are combined in one figure (ie, FIG. 12). It can be seen from FIG. 12 that the times at which one or more feature points are located may fall between adjacent model change points; for example, adjacent model change points F and H include the times at which feature points F, G and H are located. Likewise, one or more model change points may fall between the times at which adjacent feature points are located; for example, three model change points are included between the times at which adjacent feature points B and C are located. Therefore, the time at which a feature point is located is independent of the model change points.

S803: Cut the historical feature sequence based on the model change points to obtain representative slices.

Specifically, at each model change point, the historical feature sequence is cut to obtain a plurality of segments; wherein, when cutting, the model change point can be used as the starting point of the latter segment. Each fragment is a subset of the historical feature sequence.

Suppose the historical feature sequence is HTS = {(t_1, m_1, d_1), (t_2, m_2, d_2), ..., (t_n, m_n, d_n)}, and L-1 model change points are determined in S802. Then, in S803, cutting the historical feature sequence HTS yields L slices, as follows:

Wherein k is an integer greater than or equal to 2, L is an integer greater than or equal to 2, and n is an integer greater than or equal to 2.
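The cutting step in S803 can be sketched as follows. This is a minimal illustration under a simplifying assumption: every model change point coincides with the time of some element of the historical feature sequence and is not the time of the first element, so that L-1 change points yield L slices. The case in which a model change point falls between two feature points is not handled. The function name `cut_at_change_points` is hypothetical.

```python
def cut_at_change_points(hts, change_times):
    """Cut a historical feature sequence into slices.

    hts          -- list of elements (t, rate_of_change, d) in time order
    change_times -- times of the model change points; each one becomes
                    the starting point of the following slice
    """
    change_times = set(change_times)
    slices, current = [], []
    for element in hts:
        t = element[0]
        # A change point starts the next slice: close the current one first.
        if t in change_times and current:
            slices.append(current)
            current = []
        current.append(element)
    if current:
        slices.append(current)
    return slices


if __name__ == "__main__":
    hts = [(5, -2, 5), (14, 3, 9), (16, -3, 2), (21, 2, 5)]
    for s in cut_at_change_points(hts, [14, 21]):
        print(s)
```

Each returned slice is a contiguous subset of the historical feature sequence, matching the statement that each fragment is a subset of that sequence.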

It should be noted that the method shown in FIG. 8 may be performed by the server in an offline state or in an online state. If the server performs the method shown in FIG. 8 in the online state, the method can be performed at any step before S303 described above is executed.

If the set of representative slices obtained in S803 is referred to as a representative slice library, the representative slice library may remain unchanged once determined, or may be updated as the historical feature sequence is updated. The historical feature sequence is updated as online feature sequences (for example, the first feature sequence and the second feature sequence) gradually become new historical feature sequences. In this case, the newly added historical feature sequences can be added to the representative slice library as new representative slices, or they can be combined with the original historical feature sequence to re-determine the representative slices and thereby update the representative slice library. In this way, the server does not need to save the historical service data or its data features and can save only the historical feature sequence, which saves storage space, increases the rate at which the representative slice library is updated, and shortens the time needed to update it.

Based on the example in S802, the slice between model change points 8 and 9 can be expressed as {(33, 2, 3), (38, 1, 5)}. The slices between adjacent feature points B and C can be expressed as Slice_1 = {(7, 3, 2)}, Slice_2 = {(10, 3, 3)} and Slice_3 = {(14, 3, 4)}. It should be noted that the multiple slices between two adjacent feature points share the rate of change between those two feature points.

Optionally, as shown in FIG. 8a (FIG. 8a is drawn based on FIG. 8), S803 may include:

S803': Cut the historical feature sequence based on the model change points, and cluster the slices obtained after cutting to obtain representative slices.

Specifically, the historical feature sequence is cut based on the model change points, and the slices obtained after cutting are clustered by using a clustering algorithm to obtain representative slices. The embodiment of the present invention does not limit the specific implementation of the clustering algorithm, which may be any clustering algorithm in the prior art, for example, the k-means clustering algorithm.

For example, the relationship between any two slices is determined. If the relationship satisfies a certain condition, the two slices may be clustered together (ie, the two slices are considered to be slices of the same class), and any one slice of the class is then selected as the representative slice of that class. For the specific manner of determining the relationship between two slices, refer to the above. For example, the distance between any two slices obtained after cutting is determined according to the above method for calculating the distance between Slice_p and Slice_q; if the distance is less than or equal to a preset threshold, the two slices are clustered together and one of them is used as the representative slice.

Based on the example in S803, the mode distance Dm (Slice_1, Slice_2), the temporal distance Dt (Slice_1, Slice_2), and the total distance D (Slice_1, Slice_2) between Slice_1 and Slice_2 are as shown in Table 2:

Table 2

Dm(Slice_1, Slice_2)	Dt(Slice_1, Slice_2)	D(Slice_1, Slice_2)
|3-3| = 0	|2/2-3/3| = 0	0

Similarly, D(Slice_1, Slice_3) and D(Slice_2, Slice_3) can both be calculated to be 0. Therefore, any one of the three slices can be selected as the representative slice.
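The calculation in Table 2 can be reproduced with a short sketch. The decomposition below is an assumption inferred from the table entries: the mode distance Dm sums the absolute differences of the rates of change, the temporal distance Dt sums the absolute differences of the durations after each is divided by the total duration of its slice, and D = Dm + Dt. The function names are hypothetical, and any weighting used in the full definition earlier in the description is not modelled here.

```python
from typing import List, Tuple

# One slice element: (time, rate of change, duration), as in Slice_1 = {(7, 3, 2)}.
Element = Tuple[float, float, float]
Slice = List[Element]


def mode_distance(p: Slice, q: Slice) -> float:
    """Dm: sum of absolute differences between the rates of change."""
    return sum(abs(a[1] - b[1]) for a, b in zip(p, q))


def temporal_distance(p: Slice, q: Slice) -> float:
    """Dt: sum of absolute differences between normalized durations."""
    total_p = sum(e[2] for e in p)
    total_q = sum(e[2] for e in q)
    return sum(abs(a[2] / total_p - b[2] / total_q) for a, b in zip(p, q))


def total_distance(p: Slice, q: Slice) -> float:
    """D = Dm + Dt, defined here only for slices with equally many elements."""
    if len(p) != len(q):
        raise ValueError("slices must contain the same number of elements")
    return mode_distance(p, q) + temporal_distance(p, q)


slice_1 = [(7, 3, 2)]
slice_2 = [(10, 3, 3)]
slice_3 = [(14, 3, 4)]

pairwise = [
    total_distance(slice_1, slice_2),
    total_distance(slice_1, slice_3),
    total_distance(slice_2, slice_3),
]
```

Under these assumptions all three pairwise distances are zero, which is why the three slices fall into a single class.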

Similarly, the representative slices of the slices shown in FIG. 12 can be obtained, as shown in Table 3:

Table 3

Slice_AB	Slice_BC	Slice_CE	Slice_FH
{(5,-2,5)}	{(7,3,2)}	{(16,-3,3),(21,2,5)}	{(33,2,3),(38,1,5)}

It should be noted that some of the segments (that is, slices) obtained by cutting the historical feature sequence may have similar features. Clustering the slices obtained after cutting, as in this optional implementation, reduces the number of representative slices and thus saves the storage space occupied by the representative slice library. It also reduces the amount of computation needed to determine the association relationship between an online feature sequence (for example, the first feature sequence or the second feature sequence) and representative slices with similar features, thereby increasing the rate of model updating.

The model update method provided above is explained below with a specific example:

Assume that the data sequence currently held by the server is {data point a (104, v1), data point b (109, v2)}, where "104" denotes the 104th time window and "109" denotes the 109th time window. That is, data points a and b are not update trigger points, and the data point preceding data point a is an update trigger point. If the current time is a trigger point to be tested, then:

S11: Obtain the online service data received within the window where the trigger point to be tested is located and the data feature v3 of that online service data, to obtain data point c (114, v3), where "114" denotes the 114th time window. Assume that the curve drawn from data points a, b, and c is as shown in FIG. 13.

S12: Add data point c (114, v3) to the data sequence {data point a (104, v1), data point b (109, v2)} to obtain the first data sequence {data point a (104, v1), data point b (109, v2), data point c (114, v3)}.

S13: Extract the feature points in the first data sequence according to the feature point determining method provided above, to generate a second data sequence.

It should be noted that, since the data sequence {data point a (104, v1), data point b (109, v2)} contains data point a and data point a is not the last data point in that data sequence, data point a was already determined to be a feature point when determining whether the previous trigger point to be tested was an update trigger point. S13 may therefore specifically include: determining, according to the method provided above, whether data point b is a feature point, and directly using data point c as a feature point. Assume that the determined second data sequence is {data point a (104, v1), data point b (109, v2), data point c (114, v3)}.

S14: Determine the feature sequence TS={(109,2,3), (114,1,5)} according to the second data sequence.
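As an illustration of how a feature sequence might be derived from a data sequence in S14, the sketch below turns each pair of consecutive data points into one element (time, rate of change, duration). The rate-of-change formula (difference of data features divided by the elapsed time) and the function name `build_feature_sequence` are assumptions, and the values standing in for v1, v2, and v3 are hypothetical because the source does not give them; the output is therefore not meant to reproduce the TS values above.

```python
from typing import List, Sequence, Tuple

DataPoint = Tuple[float, float]        # (time window index, data feature)
Element = Tuple[float, float, float]   # (time, rate of change, duration)


def build_feature_sequence(points: Sequence[DataPoint]) -> List[Element]:
    """Convert consecutive data points into feature-sequence elements."""
    features: List[Element] = []
    for (t_prev, v_prev), (t_cur, v_cur) in zip(points, points[1:]):
        duration = t_cur - t_prev
        if duration <= 0:
            raise ValueError("data points must be in strictly increasing time order")
        # Assumed definition of the rate of change between two data points.
        rate = (v_cur - v_prev) / duration
        features.append((t_cur, rate, duration))
    return features


# Hypothetical data features for data points a, b, and c.
second_data_sequence = [(104, 10.0), (109, 20.0), (114, 25.0)]
feature_sequence = build_feature_sequence(second_data_sequence)
```

A sequence of n data points yields n-1 elements, which matches the two-element TS obtained from the three data points a, b, and c.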

S15: Determine the distance between the feature sequence TS={(109,2,3), (114,1,5)} and at least one representative slice, and determine whether the distance is less than or equal to a preset distance threshold (that is, the first preset threshold described above), so as to determine whether the time at which data point c is located (that is, the trigger point to be tested) is an update trigger point.

Since the current time is the time at which the data point c is located, it is only necessary to determine whether the time at which the data point c is located is an update trigger point.

Assuming that the representative slices in the representative slice library are as shown in Table 3, it can be seen from the number of elements in the feature sequence TS that, in S15, the distances between Slice_ac={(109,2,3), (114,1,5)} and Slice_CE and Slice_FH need to be calculated respectively. The specific calculation process is shown in Table 4:

Table 4

Assuming that the preset distance threshold is 0.5, it can be seen from Table 4 that D(Slice_ac, Slice_FH)=0<0.5, so the time at which data point c is located is an update trigger point.
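The decision made in S15 can be sketched as a lookup against the representative slice library of Table 3. The sketch assumes the same distance as in the worked tables (rate-of-change differences plus normalized-duration differences) and compares the online slice only with representative slices that have the same number of elements; the names `slice_distance`, `candidate_distances`, and `is_update_trigger` are hypothetical.

```python
from typing import Dict, List, Tuple

Element = Tuple[float, float, float]  # (time, rate of change, duration)
Slice = List[Element]


def slice_distance(p: Slice, q: Slice) -> float:
    """Assumed D = Dm + Dt for two slices with equally many elements."""
    total_p = sum(e[2] for e in p)
    total_q = sum(e[2] for e in q)
    return sum(
        abs(a[1] - b[1]) + abs(a[2] / total_p - b[2] / total_q)
        for a, b in zip(p, q)
    )


def candidate_distances(online: Slice, library: Dict[str, Slice]) -> Dict[str, float]:
    """Distances to every representative slice with the same element count."""
    return {
        name: slice_distance(online, rep)
        for name, rep in library.items()
        if len(rep) == len(online)
    }


def is_update_trigger(online: Slice, library: Dict[str, Slice], threshold: float) -> bool:
    """True if at least one candidate distance is within the preset threshold."""
    return any(d <= threshold for d in candidate_distances(online, library).values())


# Representative slice library from Table 3.
library = {
    "Slice_AB": [(5, -2, 5)],
    "Slice_BC": [(7, 3, 2)],
    "Slice_CE": [(16, -3, 3), (21, 2, 5)],
    "Slice_FH": [(33, 2, 3), (38, 1, 5)],
}
slice_ac = [(109, 2, 3), (114, 1, 5)]
slice_ab = [(109, 2, 3)]

c_is_trigger = is_update_trigger(slice_ac, library, threshold=0.5)
b_is_trigger = is_update_trigger(slice_ab, library, threshold=0.5)
```

Under these assumptions the sketch classifies the time of data point c as an update trigger point and the time of data point b as not an update trigger point, in agreement with the conclusions stated in this example.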

Without loss of generality, before determining whether the time at which data point c is located is an update trigger point, it is first determined whether the time at which data point b is located is an update trigger point.

For example, if the current time is the time at which data point b is located, determining whether that time is an update trigger point may include: calculating the distances between the slice Slice_ab={(109,2,3)} formed by data points a and b and the representative slices Slice_AB={(5,-2,5)} and Slice_BC={(7,3,2)}, as shown in Table 5.

Table 5

D(Slice_ab, Slice_AB)	D(Slice_ab, Slice_BC)
|2-(-2)|+|3/3-5/5| = 4	|2-3|+|3/3-2/2| = 1

Assuming that the preset distance threshold is 0.5, it can be seen from Table 5 that both distances exceed the threshold, so the time at which data point b is located is not an update trigger point.

The solution provided by the embodiments of the present invention has been described above mainly from the perspective of the model updating apparatus (specifically, a server). It can be understood that, to implement the above functions, the model updating apparatus includes hardware structures and/or software modules corresponding to the respective functions. Those skilled in the art will readily appreciate that, in combination with the modules and algorithm steps of the examples described in the embodiments disclosed herein, the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present invention.

In the embodiments of the present invention, the model updating apparatus may be divided into functional modules according to the above method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of the present invention is schematic and is merely a logical function division; other division manners are possible in actual implementation.

In the case where each functional module corresponds to one function, FIG. 14 shows a schematic structural diagram of a model updating apparatus 140. The model updating apparatus 140 may be the server involved in the above embodiments. The model updating apparatus 140 may include an obtaining module 1401, a building module 1402, a determining module 1403, and an updating module 1404, and optionally a generating module 1405. The function of each functional module may be inferred from the steps in the method embodiments provided above, or may refer to the summary of the invention above; details are not described herein again.
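One possible software arrangement of the four mandatory modules is sketched below purely for illustration. Each module is modelled as a callable injected into a container class, so the logical division described above stays independent of how each function is implemented. The class name, field names, and method name are hypothetical and are not taken from FIG. 14.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

DataPoint = Tuple[float, float]        # (time, data feature)
Element = Tuple[float, float, float]   # (time, rate of change, duration)


@dataclass
class ModelUpdatingApparatus:
    # Obtaining module 1401: returns the data points for a given window.
    obtain: Callable[[int], Sequence[DataPoint]]
    # Building module 1402: builds a feature sequence from data points.
    build: Callable[[Sequence[DataPoint]], List[Element]]
    # Determining module 1403: checks the preset condition against the library.
    determine: Callable[[List[Element]], bool]
    # Updating module 1404: updates the current model.
    update: Callable[[], None]

    def on_trigger_point(self, window: int) -> bool:
        """Run one detection pass; return True if the model was updated."""
        data = self.obtain(window)
        features = self.build(data)
        if self.determine(features):
            self.update()
            return True
        return False


# Usage with stub modules (all values are hypothetical).
update_log: List[str] = []
apparatus = ModelUpdatingApparatus(
    obtain=lambda window: [(104, 10.0), (109, 20.0)],
    build=lambda points: [(109, 2.0, 5)],
    determine=lambda features: len(features) > 0,
    update=lambda: update_log.append("model updated"),
)
updated = apparatus.on_trigger_point(window=109)
```

Integrating several functions into one processing module, as the text allows, would correspond to passing methods of a single object for several of these fields.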

In the case of adopting an integrated module, the above-mentioned obtaining module 1401, building module 1402, determining module 1403, updating module 1404, and generating module 1405 can all be integrated into one processing module in one model updating device. In addition, the model updating apparatus may further include a communication module and a storage module.

FIG. 15 is a schematic structural diagram of a model updating apparatus 150 according to an embodiment of the present invention. The model updating means 150 may be the server involved in the above embodiment. The model updating apparatus 150 may include a processing module 1501 and a communication module 1502. The processing module 1501 is configured to perform control management on the operation of the model updating apparatus 150. For example, the processing module 1501 is configured to support the model updating apparatus 150 to perform the operations in FIG. 3, FIG. 3a, FIG. 6, FIG. 6a, FIG. 8, and FIG. Various steps, and/or other processes for the techniques described herein. For example, it can also be used to support the steps S1 to S3, S11 to S15, and the like provided in the specific examples above. The communication module 1502 is configured to support communication of the model update device 150 with other network entities, such as communication with a service client, and the like. Optionally, the model updating apparatus 150 may further include: a storage module 1503, configured to store the program code and data corresponding to the model updating apparatus 150 to perform any of the model updating methods provided above.

The processing module 1501 may be a processor or a controller, such as a CPU, a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various exemplary logical blocks, modules, and circuits described in connection with the disclosure of the embodiments of the invention. The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 1502 may be a transceiver, a transceiver circuit, a communication interface, or the like. The storage module 1503 may be a memory.

When the processing module 1501 is a processor, the communication module 1502 is a transceiver, and the storage module 1503 is a memory, the model updating apparatus 150 according to the embodiment of the present invention may be the model updating apparatus 20 shown in FIG. 2.

It will be clear to those skilled in the art that, for convenience and brevity of description, only the above division of functional modules is used as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the system, the device, and the modules described above, refer to the corresponding process in the foregoing method embodiments; details are not described herein again.

In the several embodiments provided by the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. For example, the division of the modules is only a logical function division, and other division manners are possible in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or module, and may be electrical, mechanical, or in another form.

The modules described as separate components may or may not be physically separated. The components displayed as modules may or may not be physical modules, that is, may be located in one place, or may be distributed to multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist physically separately, or two or more modules may be integrated into one module. The above integrated modules can be implemented in the form of hardware or in the form of software functional modules.

If the integrated module is implemented in the form of a software functional module and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.

The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of the appended claims.

Claims

A method for updating a model, comprising:

Obtaining the first online service data received in the window where the trigger point to be tested is located;

Constructing a first feature sequence according to the data feature of the first online service data;

Determining an association relationship between the first feature sequence and at least one representative slice, wherein the representative slice is a slice of a feature sequence constructed according to data features of historical service data;

If the association relationship between the first feature sequence and the at least one representative slice satisfies a preset condition, the current model is updated.
The method according to claim 1, wherein after determining the association relationship between the first feature sequence and the at least one representative slice, the method further comprises:

If the association relationship between the first feature sequence and the at least one representative slice does not satisfy the preset condition, acquiring second online service data received in the window where the next trigger point to be tested after the trigger point to be tested is located;

Constructing a second feature sequence, in order of receiving time, according to the data feature of the first online service data and the data feature of the second online service data;

Determining an association relationship between the second feature sequence and the at least one representative slice;

Updating the current model if the association relationship between the second feature sequence and the at least one representative slice satisfies the preset condition.
The method according to claim 1 or 2, wherein the first feature sequence and the representative slice are each represented by a vector; the determining an association relationship between the first feature sequence and at least one representative slice comprises:

Determining a distance between the first feature sequence and the at least one representative slice;

The updating the current model if the association relationship between the first feature sequence and the at least one representative slice satisfies a preset condition comprises:

If the distance is less than or equal to a first preset threshold, updating the current model.
The method according to claim 1 or 2, wherein the first feature sequence and the representative slice are each represented by a vector; the determining an association relationship between the first feature sequence and at least one representative slice comprises:

Determining a similarity between the first feature sequence and the at least one representative slice;

The updating the current model if the association relationship between the first feature sequence and the at least one representative slice satisfies a preset condition comprises:

If the similarity is greater than or equal to a second preset threshold, updating the current model.
The method according to any one of claims 1 to 4, wherein the constructing the first feature sequence according to the data feature of the first online service data comprises:

Constructing a first data sequence according to the data feature of the first online service data, wherein an element in the first data sequence is a data point, and the data point includes at least the following features: the time at which the data point is located and the data feature of the service data corresponding to the data point;

Generating the first feature sequence from the first data sequence, wherein an element in the first feature sequence includes at least the following features: the time at which a data point is located, the rate of change between the data point and the previous data point, and the duration between the time at which the data point is located and the time at which the previous data point is located.
The method according to claim 5, wherein after the constructing the first data sequence according to the data feature of the first online service data, the method further comprises:

Extracting feature points in the first data sequence, and constructing a second data sequence according to feature points in the first data sequence;

The generating the first feature sequence from the first data sequence comprises:

Generating the first feature sequence from the second data sequence, wherein an element in the first feature sequence includes the time at which a feature point is located, the rate of change between the feature point and the previous feature point, and the duration between the time at which the feature point is located and the time at which the previous feature point is located.
The method according to any one of claims 1 to 6, wherein the trigger point to be tested is the i-th trigger point to be tested, i≥1, and i is an integer; if i=1, the window where the trigger point to be tested is located is the window between the time at which online service data starts to be received and the trigger point to be tested; if i≥2, the window where the trigger point to be tested is located is the window between the (i-1)-th trigger point to be tested and the trigger point to be tested.
The method according to any one of claims 1 to 7, wherein the method further comprises:

Determining, as a trigger point to be tested, each time at which an integer multiple of a preset duration has elapsed since online service data started to be received.
The method according to any one of claims 1 to 7, wherein the method further comprises:

Determining, as a trigger point to be tested, each time at which, counting from the time at which online service data started to be received, the amount of received online service data reaches an integer multiple of a preset data amount.
The method according to any one of claims 1 to 9, wherein before the determining the association relationship between the first feature sequence and the at least one representative slice, the method further comprises:

Obtaining historical business data, and constructing a historical feature sequence according to the historical business data;

Determining a model change point in the historical feature sequence;

Cutting the historical feature sequence based on the model change point in the historical feature sequence to obtain the representative slice.
The method according to claim 10, wherein the cutting the historical feature sequence based on the model change point to obtain a representative slice comprises:

Cutting the historical feature sequence based on the model change point, and clustering the slices obtained after the cutting to obtain the representative slice.
A model updating device, comprising:

An obtaining module, configured to obtain first online service data received in the window where the trigger point to be tested is located;

A building module, configured to construct a first feature sequence according to the data feature of the first online service data;

A determining module, configured to determine an association relationship between the first feature sequence and at least one representative slice, wherein the representative slice is a slice of a feature sequence constructed according to data features of historical service data;

An updating module, configured to update the current model if the association relationship between the first feature sequence and the at least one representative slice satisfies a preset condition.
The device according to claim 12, wherein:

The obtaining module is further configured to: if the association relationship between the first feature sequence and the at least one representative slice does not satisfy the preset condition, obtain second online service data received in the window where the next trigger point to be tested after the trigger point to be tested is located;

The building module is further configured to: construct a second feature sequence, in order of receiving time, according to the data feature of the first online service data and the data feature of the second online service data;

The determining module is further configured to: determine an association relationship between the second feature sequence and the at least one representative slice;

The updating module is further configured to: if the association relationship between the second feature sequence and the at least one representative slice satisfies the preset condition, update the current model.
The apparatus according to claim 12 or 13, wherein the first feature sequence and the representative slice are represented by a vector;

The determining module is specifically configured to: determine a distance between the first feature sequence and at least one representative slice;

The updating module is specifically configured to: if the distance is less than or equal to a first preset threshold, update the current model.
The apparatus according to claim 12 or 13, wherein the first feature sequence and the representative slice are represented by a vector;

The determining module is specifically configured to: determine a similarity between the first feature sequence and at least one representative slice;

The updating module is specifically configured to: if the similarity is greater than or equal to a second preset threshold, update the current model.
The device according to any one of claims 12 to 15, wherein the building module is specifically configured to:

Construct a first data sequence according to the data feature of the first online service data, wherein an element in the first data sequence is a data point, and the data point includes at least the following features: the time at which the data point is located and the data feature of the service data corresponding to the data point;

Generate the first feature sequence from the first data sequence, wherein an element in the first feature sequence includes at least the following features: the time at which a data point is located, the rate of change between the data point and the previous data point, and the duration between the time at which the data point is located and the time at which the previous data point is located.
The device according to claim 16, wherein:

The building module is further configured to: extract feature points in the first data sequence, and construct a second data sequence according to the feature points in the first data sequence;

When generating the first feature sequence from the first data sequence, the building module is specifically configured to: generate the first feature sequence from the second data sequence, wherein an element in the first feature sequence includes the time at which a feature point is located, the rate of change between the feature point and the previous feature point, and the duration between the time at which the feature point is located and the time at which the previous feature point is located.
The device according to any one of claims 12 to 17, wherein the trigger point to be tested is the i-th trigger point to be tested, i≥1, and i is an integer; if i=1, the window where the trigger point to be tested is located is the window between the time at which online service data starts to be received and the trigger point to be tested; if i≥2, the window where the trigger point to be tested is located is the window between the (i-1)-th trigger point to be tested and the trigger point to be tested.
The device according to any one of claims 12 to 18, wherein:

The determining module is further configured to determine, as a trigger point to be tested, each time at which an integer multiple of a preset duration has elapsed since online service data started to be received.
The device according to any one of claims 12 to 18, wherein:

The determining module is further configured to determine, as a trigger point to be tested, each time at which, counting from the time at which online service data started to be received, the amount of received online service data reaches an integer multiple of a preset data amount.
The device according to any one of claims 12 to 20, wherein:

The obtaining module is further configured to: obtain historical service data;

The building module is further configured to: construct a historical feature sequence according to the historical service data;

The determining module is further configured to: determine a model change point in the historical feature sequence;

The device further comprises:

A generating module, configured to cut the historical feature sequence based on the model change point in the historical feature sequence to obtain the representative slice.
The device according to claim 21, wherein

The generating module is specifically configured to: cut the historical feature sequence based on the model change point, and cluster the slices obtained after the cutting to obtain the representative slice.
A model updating apparatus, comprising: a processor, a memory, a system bus, and a communication interface;

The memory is configured to store computer-executable instructions, and the processor is connected to the memory via the system bus; when the apparatus is running, the processor executes the computer-executable instructions stored in the memory, so that the apparatus performs the model updating method according to any one of claims 1 to 11.