CN111274462A - Data processing method and device - Google Patents

Info

Publication number
CN111274462A
Authority
CN
China
Prior art keywords
behavior information
behavior
target object
time periods
vectors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010049536.8A
Other languages
Chinese (zh)
Inventor
宋德超
贾巨涛
李立辉
项伟伟
刘家平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai, Zhuhai Lianyun Technology Co Ltd filed Critical Gree Electric Appliances Inc of Zhuhai
Priority to CN202010049536.8A priority Critical patent/CN111274462A/en
Publication of CN111274462A publication Critical patent/CN111274462A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/906: Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device. The method comprises the following steps: acquiring behavior information of a target object, and forming a plurality of behavior information strings based on the execution time of the behavior information; vectorizing the behavior information strings to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in the plurality of different time periods by clustering the behavior vectors of the target object in those time periods. The invention solves the technical problem in the prior art that the behavior habits of a user are difficult to analyze accurately when the data volume is small.

Description

Data processing method and device
Technical Field
The present invention relates to the field of data processing, and in particular, to a method and an apparatus for processing data.
Background
Over time, people are becoming increasingly aware of the importance of data. The big data era provides new challenges for the data handling capability of human beings and also provides unprecedented space and potential for people to obtain more profound and comprehensive insights. For example, the behavior habits of the user can be analyzed through big data analysis, which is usually performed based on the frequency of each behavior of the user at present, but the behavior habits obtained according to the frequency are not accurate for a single user or a group of users with less data.
Aiming at the problem that the behavior habit of a user is difficult to accurately analyze under the condition of small data volume in the prior art, an effective solution is not provided at present.
Disclosure of Invention
The embodiment of the invention provides a data processing method and device, and at least solves the technical problem that the behavior habit of a user is difficult to accurately analyze under the condition of small data volume in the prior art.
According to an aspect of an embodiment of the present invention, there is provided a data processing method, including: acquiring behavior information of a target object, and forming a plurality of behavior information strings based on the execution time of the behavior information; vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods.
Further, acquiring behavior information of the target object, and constructing a plurality of behavior information strings based on the execution time of the behavior information, including: acquiring behavior information of a target object in a preset time range, wherein the preset time range comprises a plurality of time periods; connecting the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results; and segmenting the connection result to obtain the behavior information string in each time period.
Further, segmenting the connection result to obtain a behavior information string in each time period, including: acquiring a time difference between two adjacent behavior information; and if the time difference is greater than the preset time length, segmenting the two adjacent behavior information.
Further, before vectorization processing is carried out on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, a time period to which the execution time of the initial behavior information in the behavior information string belongs is determined as a time period to which the behavior information string belongs; and performing alignment completion processing on the behavior information strings in each time period to ensure that the behavior information strings in the same time period have the same dimensionality.
Further, vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, including: acquiring behavior data corresponding to each behavior information in a behavior information string in each time period, wherein the behavior data comprises at least one of the following items: identification of the behavior information, interaction data and time interval with the next behavior information; and replacing the behavior information with the behavior data to form a behavior vector of the target object in a plurality of different time periods.
Further, determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods, including: clustering the behavior vectors in the same time period to obtain a clustering center corresponding to each time period; selecting a plurality of candidate vectors with the distance from the clustering center smaller than a preset distance; obtaining the mean value of the selected candidate vectors to obtain a mean value vector; and converting the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
Further, after acquiring the behavior information of the target object within the preset time range, the method further includes: and eliminating the behavior information with the occurrence frequency lower than the preset frequency.
Further, the behavior information includes voice interaction behavior information between the target object and the home appliance.
According to an aspect of an embodiment of the present invention, there is provided a data processing apparatus including: the acquisition module is used for acquiring the behavior information of the target object and forming a plurality of behavior information strings based on the execution time of the behavior information; the processing module is used for vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; the determining module is used for determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods.
According to an aspect of the embodiments of the present invention, there is provided a storage medium including a stored program, wherein, when the program runs, a device on which the storage medium is located is controlled to execute the above-mentioned data processing method.
According to an aspect of the embodiments of the present invention, there is provided a processor, configured to execute a program, where the program executes the method for processing data described above.
In the embodiment of the invention, behavior information of a target object is obtained, and a plurality of behavior information strings are formed based on the execution time of the behavior information; vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods. According to the scheme, under the condition that the data volume is limited, vectorization processing is carried out on the behavior information of different time periods, and clustering analysis is carried out on the behavior vectors of different time periods obtained through the vectorization processing, so that behavior habits of users in different time periods are obtained, and the problem that the behavior habits of the users are difficult to accurately analyze under the condition that the data volume is small in the prior art is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method of processing data according to an embodiment of the invention;
FIG. 2 is a diagram illustrating an alignment completion of behavior information strings according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a first vectorization process for behavior information strings according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of determining a behavior information string according to an embodiment of the invention;
FIG. 5 is a schematic diagram of obtaining information about user behavior according to an embodiment of the present invention; and
FIG. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
According to an embodiment of the present invention, an embodiment of a data processing method is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from that presented herein.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention, as shown in fig. 1, the method including the steps of:
step S102, behavior information of the target object is obtained, and a plurality of behavior information strings are formed based on the execution time of the behavior information.
Specifically, the target object is a user whose behavior needs to be analyzed; it may be a single user or multiple users who operate the same home appliance together, for example, the members of one household. The behavior information is an interactive behavior between the user and the home appliance, for example, a control instruction sent to the home appliance by the user, which may be a voice interaction, a gesture interaction, a remote control operation, a button press, and the like. The behavior information is connected in order of execution time to obtain the behavior information string.
In an optional embodiment, taking an air conditioner as an example, behavior information of a user operating the air conditioner is collected, and the behavior information of each day is connected according to the execution time sequence, so that a behavior information string corresponding to each day can be formed.
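The daily connection step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `connect_by_day`, the `(timestamp, action)` event layout, and the action names are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def connect_by_day(events):
    """Group (timestamp, action) pairs by calendar day, ordered by execution time.

    The event layout and the action names are hypothetical; the text only
    says that each day's behavior information is connected in time order.
    """
    by_day = defaultdict(list)
    for ts, action in sorted(events, key=lambda e: e[0]):
        by_day[ts.date()].append((ts, action))
    return dict(by_day)


# Unordered raw events for an air conditioner over two days.
events = [
    (datetime(2020, 1, 2, 13, 5), "set_temperature_23"),
    (datetime(2020, 1, 1, 13, 2), "power_on"),
    (datetime(2020, 1, 2, 13, 1), "power_on"),
    (datetime(2020, 1, 1, 13, 4), "set_temperature_23"),
]
daily = connect_by_day(events)
```

Each value in `daily` is one day's connection result, ready to be segmented into behavior information strings.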
And step S104, vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods.
In an alternative implementation, one day may be taken as a cycle, and one day may be divided into 24 time periods, with each hour being taken as a time period.
First, the behavior information string may be mapped in a time period, for example, a time period to which the execution time of the first behavior information in the behavior information string belongs is determined to be the time period to which the behavior information string belongs. After the behavior information string contained in each time period is determined, vectorization processing is carried out on the behavior information string in each time period, and then the behavior vector in each time period can be obtained.
When vectorizing the behavior information string, the name of the behavior information may be converted into a word vector in a word2vec manner or the like, so as to obtain a behavior vector corresponding to the behavior information string, or the behavior information in the behavior information string may be represented by a corresponding preset numerical value, so as to form a behavior vector corresponding to the information string.
Step S106, determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in a plurality of different time periods.
After the behavior vectors corresponding to the different time periods are determined, the behavior vectors are clustered to obtain the habitual behavior vector of the user in each time period, and each habitual behavior vector is then converted back to obtain the habitual behavior information of the user in that time period.
After the habitual behavior information of the user in each time period is obtained, the household appliance can be intelligently controlled based on it. In an alternative embodiment, the habitual behavior information of the user from 13:00 to 14:00 is obtained as: power on, set the temperature to 23 ℃, sweep wind up and down. The appliance can then automatically execute this action string at 13:00, or prompt the user whether to execute it, thereby achieving intelligent control according to the user's habits.
As can be seen from the above, in the embodiments of the present application, behavior information of a target object is obtained, and a plurality of behavior information strings are formed based on the execution time of the behavior information; vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods. According to the scheme, under the condition that the data volume is limited, vectorization processing is carried out on the behavior information of different time periods, and clustering analysis is carried out on the behavior vectors of different time periods obtained through the vectorization processing, so that behavior habits of users in different time periods are obtained, and the problem that the behavior habits of the users are difficult to accurately analyze under the condition that the data volume is small in the prior art is solved.
As an alternative embodiment, acquiring behavior information of a target object, and forming a plurality of behavior information strings based on execution time of the behavior information includes: acquiring behavior information of a target object in a preset time range, wherein the preset time range comprises a plurality of time periods; connecting the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results; and segmenting the connection result to obtain the behavior information string in each time period.
Specifically, the preset time range may be one month, two months or the like closest to the current time, and the time period may be one day.
In an optional embodiment, taking an air conditioner as an example, all behavior information of interactions with the air conditioner within the one month closest to the current time may be obtained, the behavior information within each day is connected to obtain a connection result for each day, and the connection result of each day is then divided.
As an alternative embodiment, segmenting the connection result to obtain the behavior information string in each time period includes: acquiring a time difference between two adjacent behavior information; and if the time difference is greater than the preset time length, segmenting the two adjacent behavior information.
If the time difference between two adjacent behavior information is greater than the preset time length, the two adjacent behavior information may not belong to a group of operations, and thus needs to be divided.
In an optional embodiment, the preset time length may be 30 minutes, a day is used as a time period, and the connection result of the behavior information in a day is divided according to an interval exceeding 30 minutes, so as to obtain a plurality of behavior information strings.
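The gap-based segmentation can be sketched as below. The 30-minute threshold comes from the embodiment above; the function name `split_by_gap`, the event layout, and the action names are assumptions, and the input is assumed to be already sorted by execution time.

```python
from datetime import datetime, timedelta


def split_by_gap(connected, max_gap=timedelta(minutes=30)):
    """Cut one day's time-ordered (timestamp, action) list wherever two
    adjacent actions are more than max_gap apart."""
    strings = []
    current = []
    previous_ts = None
    for ts, action in connected:
        # A gap longer than the preset length starts a new behavior string.
        if previous_ts is not None and ts - previous_ts > max_gap:
            strings.append(current)
            current = []
        current.append((ts, action))
        previous_ts = ts
    if current:
        strings.append(current)
    return strings


day = [
    (datetime(2020, 1, 1, 13, 0), "power_on"),
    (datetime(2020, 1, 1, 13, 2), "set_temperature_23"),
    (datetime(2020, 1, 1, 13, 3), "sweep_up_down"),
    (datetime(2020, 1, 1, 20, 15), "power_off"),
]
paths = split_by_gap(day)
```

Here the first three actions stay together, while the evening power-off, more than 30 minutes later, forms its own behavior information string.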
As an optional embodiment, before performing vectorization processing on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, the method further includes: determining the time period to which the execution time of the initial behavior information in the behavior information string belongs as the time period to which the behavior information string belongs; and performing alignment completion processing on the behavior information strings in each time period to ensure that the behavior information strings in the same time period have the same dimensionality.
Specifically, alignment completion gives the behavior information strings in the same time period the same dimension. Having the same dimension means that, at any given position, all behavior information strings in the same time period contain either the same behavior information or a null action.
In an alternative embodiment, still taking a time cycle of one day as an example, the day is divided evenly into 24 time periods, and the time period to which the execution time of the first behavior information in the behavior information string belongs is determined. For example, for the behavior information string "power on - set to 23 degrees Celsius - sweep wind up and down", the power-on action occurs between 13:00 and 14:00, so the behavior information string is determined to belong to the time period 13:00 to 14:00.
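A minimal sketch of this assignment, assuming hourly time periods indexed 0 to 23 and the same hypothetical `(timestamp, action)` layout; `period_of` and `bucket_by_period` are illustrative names, not terms from the patent.

```python
from collections import defaultdict
from datetime import datetime


def period_of(behavior_string):
    """Return the hour (0-23) of the first action; that hour is taken as
    the time period of the whole behavior information string."""
    first_ts, _ = behavior_string[0]
    return first_ts.hour


def bucket_by_period(behavior_strings):
    """Collect the action sequences of all strings under their time period."""
    buckets = defaultdict(list)
    for s in behavior_strings:
        buckets[period_of(s)].append([action for _, action in s])
    return dict(buckets)


strings = [
    # Starts at 13:58 and runs past 14:00, but still belongs to period 13.
    [(datetime(2020, 1, 1, 13, 58), "power_on"),
     (datetime(2020, 1, 1, 14, 1), "set_temperature_23")],
    [(datetime(2020, 1, 1, 20, 15), "power_off")],
]
buckets = bucket_by_period(strings)
```

Only the first action decides the period, so a string that crosses an hour boundary is not split or counted twice.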
Because the behavior information contained in the behavior information strings in the same time period may differ, alignment completion processing needs to be performed on the behavior information strings before they are vectorized, so that the behavior information strings in the same time period have the same dimension.
Fig. 2 is a schematic diagram of performing alignment completion on behavior information strings according to an embodiment of the present invention. With reference to fig. 2, behavior information string 1 is: action one, action three, action four; behavior information string 2 is: action one, action three, action five. Positions where the two strings contain the same action are aligned, and positions where they differ are padded with null actions, giving the result shown in fig. 2. After alignment completion, behavior information string 1 is: action one, action three, action four, null action; behavior information string 2 is: action one, action three, null action, action five.
Fig. 2 only shows an example of performing alignment completion on two behavior information strings, and the number of behavior information strings in the same time period is often greater than two. When performing alignment completion, therefore, all behavior information strings in the same time period may be aligned and completed simultaneously, or the longest behavior information string may be found among them and each of the other behavior information strings aligned and completed against it.
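The two-string case of fig. 2 can be reproduced with a short sketch. The text does not name an alignment algorithm, so the use of Python's `difflib.SequenceMatcher` here is an assumption chosen for brevity, as are the `NULL` placeholder and the function name `align_pair`. Aligning more than two strings would need the pairwise or simultaneous strategies described above and is not covered.

```python
from difflib import SequenceMatcher

NULL = "null"  # hypothetical placeholder for the null action


def align_pair(a, b):
    """Pad two action sequences with NULL so that every position holds
    either the same action in both, or an action opposite a NULL."""
    out_a, out_b = [], []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out_a.extend(a[i1:i2])
            out_b.extend(b[j1:j2])
        else:
            # Actions only in a: keep them in a, pad b with NULL.
            out_a.extend(a[i1:i2])
            out_b.extend([NULL] * (i2 - i1))
            # Actions only in b: pad a with NULL, keep them in b.
            out_a.extend([NULL] * (j2 - j1))
            out_b.extend(b[j1:j2])
    return out_a, out_b


s1 = ["action1", "action3", "action4"]
s2 = ["action1", "action3", "action5"]
aligned1, aligned2 = align_pair(s1, s2)
# aligned1 == ["action1", "action3", "action4", "null"]
# aligned2 == ["action1", "action3", "null", "action5"]
```

Both outputs have the same length, which is what lets the later vectorization step produce vectors of equal dimension.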
As an optional embodiment, performing vectorization processing on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods includes: acquiring behavior data corresponding to each behavior information in a behavior information string in each time period, wherein the behavior data comprises at least one of the following items: identification of the behavior information, interaction data and time interval with the next behavior information; and replacing the behavior information with the behavior data to form a behavior vector of the target object in a plurality of different time periods.
Specifically, the behavior data corresponding to the behavior information may be one or more, and the behavior information is replaced with the corresponding behavior data, so that the behavior vector corresponding to the behavior information string is obtained.
The identifier of the behavior information may be a preset action number, for example, 01 for the power-on operation, 02 for the temperature adjustment operation, 00 for the power-off operation, and the like. The interaction data indicates the duration of the action or the specific interaction value of the action; for example, if the action lasts for 1 hour, the interaction data is 1 hour, and if the temperature is adjusted to 23 degrees Celsius, the interaction data is 23.
In an alternative embodiment, in the case that the behavior data includes the above three parameters, the behavior information string 1 after completion of alignment in fig. 2 is vectorized to obtain [ action one number, interaction data, time interval, action three number, interaction data, time interval, action four number, interaction data, time interval, null action number, interaction data ]. The interaction data of the null action is 0, and the time interval between the null action and the other actions is also 0.
Fig. 3 is a schematic diagram of a first vectorization processing on a behavior information string according to an embodiment of the present invention, and in conjunction with fig. 3, for each behavior information (i.e., the above-mentioned action) in the behavior information string, three values may be represented, which are an identifier of the behavior information, interaction data, and a time interval, where the interaction data may be an action duration or an interaction value.
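The three-values-per-action encoding can be sketched as follows. The numbers for power-on, temperature adjustment, and power-off follow the example in the text; the numbers for the sweep action and the null action, the use of minutes for the time interval, and the function name `vectorize` are assumptions made for illustration.

```python
# Action numbers: the first three follow the text's example; the last two
# are assumed values, since the text does not specify them.
ACTION_IDS = {
    "power_on": 1,
    "set_temperature": 2,
    "power_off": 0,
    "sweep_up_down": 3,  # assumed
    "null": 9,           # assumed
}


def vectorize(aligned_string):
    """Flatten (name, interaction_data, minutes_to_next) triples into one
    vector: [id, interaction, interval, id, interaction, interval, ...]."""
    vector = []
    for name, interaction, interval_to_next in aligned_string:
        if name == "null":
            # Per the text, a null action has interaction data 0 and interval 0.
            interaction, interval_to_next = 0, 0
        vector.extend([ACTION_IDS[name], interaction, interval_to_next])
    return vector


aligned = [
    ("power_on", 0, 2),          # next action follows 2 minutes later
    ("set_temperature", 23, 1),  # interaction value: 23 degrees Celsius
    ("sweep_up_down", 0, 0),
    ("null", 0, 0),
]
vec = vectorize(aligned)
# vec == [1, 0, 2, 2, 23, 1, 3, 0, 0, 9, 0, 0]
```

Because every aligned string in a time period has the same number of actions, every resulting vector has the same length.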
As an alternative embodiment, determining the habit behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods includes: clustering the behavior vectors in the same time period to obtain a clustering center corresponding to each time period; selecting a plurality of candidate vectors with the distance from the clustering center smaller than a preset distance; obtaining the mean value of the selected candidate vectors to obtain a mean value vector; and converting the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
Specifically, a mean shift clustering method may be used to obtain the clustering center. A circle with a preset radius is drawn around the clustering center, and the points inside that circle are collected as the candidate vectors. The mean of each item across the candidate vectors is calculated to form a mean vector, and the mean vector is converted back into behavior information using the same conversion as in vectorization, so that the habitual behavior information of the user in a plurality of time periods is obtained.
In an optional implementation, mean shift clustering is performed on the behavior vectors in the same time period and the largest cluster point is selected. Points within a certain distance of that cluster point are selected as initial vectors. The items at the same position in the initial vectors are summed, and the sum is divided by the number of nonzero elements at that position across all the initial vectors, giving the mean vector. The mean vector is then converted into a behavior information string, which is the interaction path frequently used by the user.
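The clustering and averaging steps can be sketched in plain Python. This is a simplified flat-kernel mean shift written for illustration, not the patented implementation: the function names, the `bandwidth` and `radius` parameters, and the sample vectors are all assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math


def mean_shift_center(vectors, bandwidth, start, iterations=50):
    """Shift a point toward the mean of its neighbors until it stops moving."""
    center = list(start)
    for _ in range(iterations):
        near = [v for v in vectors if math.dist(v, center) <= bandwidth]
        if not near:
            break
        new_center = [sum(col) / len(near) for col in zip(*near)]
        if math.dist(new_center, center) < 1e-9:
            break
        center = new_center
    return center


def largest_cluster_center(vectors, bandwidth):
    """Run mean shift from every vector; keep the center with most neighbors."""
    best_center, best_count = None, -1
    for v in vectors:
        c = mean_shift_center(vectors, bandwidth, v)
        count = sum(1 for u in vectors if math.dist(u, c) <= bandwidth)
        if count > best_count:
            best_center, best_count = c, count
    return best_center


def nonzero_mean(candidates):
    """Per position: sum the items and divide by the count of nonzero items."""
    mean = []
    for col in zip(*candidates):
        nonzero = [x for x in col if x != 0]
        mean.append(sum(nonzero) / len(nonzero) if nonzero else 0.0)
    return mean


def habitual_vector(vectors, bandwidth, radius):
    center = largest_cluster_center(vectors, bandwidth)
    candidates = [v for v in vectors if math.dist(v, center) < radius]
    return nonzero_mean(candidates)


# Three similar behavior vectors and one outlier for the same time period.
vectors = [
    [1, 0, 2, 2, 23, 1],
    [1, 0, 3, 2, 24, 1],
    [1, 0, 2, 2, 23, 0],
    [0, 0, 0, 0, 0, 0],
]
habit = habitual_vector(vectors, bandwidth=3, radius=3)
```

In this sample the outlier falls outside the radius, and the last position averages only the two nonzero entries. Converting `habit` back into behavior information, for example by rounding the action numbers, is left out of the sketch.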
As an optional embodiment, after obtaining the behavior information of the target object within the preset time range, the method further includes: and eliminating the behavior information with the occurrence frequency lower than the preset frequency.
Specifically, behavior information whose occurrence frequency is lower than the preset frequency is usually behavior that the user rarely performs; it has low reference value and increases the complexity of the computation, so it is removed.
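A minimal sketch of this filtering step, assuming the preset frequency is expressed as a minimum occurrence count; the function name `drop_rare_actions` and the action names are hypothetical.

```python
from collections import Counter


def drop_rare_actions(actions, min_count):
    """Remove every action that occurs fewer than min_count times,
    preserving the order of the remaining actions."""
    counts = Counter(actions)
    return [a for a in actions if counts[a] >= min_count]


history = [
    "power_on", "set_temperature", "power_on",
    "set_temperature", "self_clean", "power_on",
]
kept = drop_rare_actions(history, min_count=2)
# kept == ["power_on", "set_temperature", "power_on",
#          "set_temperature", "power_on"]
```

The one-off `self_clean` action is dropped before the remaining actions are connected and segmented.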
Fig. 4 is a schematic diagram of determining a behavior information string according to an embodiment of the present invention. With reference to fig. 4, the interactive actions (i.e., the behavior information) are obtained; interactive actions whose frequency is lower than the preset frequency are removed; the remaining interactive actions are connected according to time period; and the connection results are divided according to time interval to obtain a plurality of interaction paths (i.e., the behavior information strings).
As an alternative embodiment, the behavior information includes voice interaction behavior information between the target object and the home appliance.
In the above scheme, the behavior information includes voice interaction behavior information between the target object and the home appliance, so that the finally obtained habit behavior information is voice habit behavior information of the user, and thereby the behavior habit of the user when performing voice interaction with the home appliance is obtained.
Fig. 5 is a schematic diagram of obtaining user habit information according to an embodiment of the present invention, and with reference to fig. 5, an interaction path (i.e., the behavior information string) is first segmented according to a time interval, the interaction path is aligned and completed, the interaction path is vectorized, mean shift clustering is performed on vectors obtained through the vectorization processing to obtain cluster points, points within a certain distance from the cluster points are selected, and vectors corresponding to the selected points are processed (the mean value of the vectors corresponding to the selected points is obtained and converted into corresponding interaction paths), so as to obtain interaction paths commonly used by users.
Example 2
According to an embodiment of the present invention, there is provided an embodiment of a data processing apparatus, and fig. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention, as shown in fig. 6, the apparatus includes:
the obtaining module 60 is configured to obtain behavior information of the target object, and form a plurality of behavior information strings based on the execution time of the behavior information.
And the processing module 62 is configured to perform vectorization processing on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods.
And the determining module 64 is configured to determine the habit behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods.
As an alternative embodiment, the obtaining module includes: the first obtaining sub-module is used for obtaining behavior information of a target object in a preset time range, wherein the preset time range comprises a plurality of time periods; the connection submodule is used for connecting the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results; and the division submodule is used for dividing the connection result to obtain the behavior information string in each time period.
As an alternative embodiment, the division submodule includes: an acquisition unit, configured to acquire the time difference between two adjacent pieces of behavior information; and a dividing unit, configured to split the connection result between the two adjacent pieces of behavior information if the time difference is greater than a preset time length.
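The division rule above can be sketched in a few lines of Python. The representation of behavior information as `(timestamp, behavior)` pairs, the function name `split_by_gap`, and the 300-second threshold are assumptions made for illustration only.

```python
def split_by_gap(events, max_gap):
    """Split a time-ordered list of (timestamp, behavior) pairs into strings.

    A new string starts whenever the gap between two adjacent events
    exceeds max_gap (same unit as the timestamps, e.g. seconds).
    """
    strings = []
    current = []
    for event in events:
        if current and event[0] - current[-1][0] > max_gap:
            strings.append(current)
            current = []
        current.append(event)
    if current:
        strings.append(current)
    return strings

# Hypothetical voice interactions, timestamps in seconds.
events = [(0, "turn on air conditioner"), (20, "set 26 degrees"),
          (900, "turn on fan"), (915, "set low speed")]
strings = split_by_gap(events, max_gap=300)
```

Here the 880-second pause between the second and third interactions exceeds the threshold, so the four events form two behavior information strings of two events each.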
As an alternative embodiment, the apparatus further comprises: a determining module, configured to determine, before the behavior information string is vectorized to obtain the behavior vectors of the target object in a plurality of different time periods, the time period to which the execution time of the initial behavior information in the behavior information string belongs as the time period to which the behavior information string belongs; and a processing module, configured to perform alignment completion processing on the behavior information strings in each time period so that the behavior information strings in the same time period have the same dimensionality.
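A minimal Python sketch of these two steps follows. The embodiment does not state how the time periods are defined or what value is used for completion, so the four named periods, the `PAD` placeholder, and the function names are hypothetical.

```python
from collections import defaultdict

PAD = "<pad>"  # hypothetical placeholder for a missing behavior

def period_of(hour):
    """Map an hour of day (0-23) to a hypothetical named time period."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"

def group_and_align(strings):
    """Group strings by the period of their first event, then pad to equal length.

    Each string is a non-empty list of (hour, behavior) pairs.
    """
    groups = defaultdict(list)
    for s in strings:
        # Only the first (initial) event decides the period of the whole string.
        groups[period_of(s[0][0])].append([behavior for _, behavior in s])
    aligned = {}
    for period, members in groups.items():
        width = max(len(m) for m in members)
        aligned[period] = [m + [PAD] * (width - len(m)) for m in members]
    return aligned

strings = [[(7, "a"), (7, "b"), (8, "c")],
           [(11, "a"), (12, "b")],   # starts in the morning, ends after noon
           [(19, "x")]]
aligned = group_and_align(strings)
```

The second string is assigned to the morning period even though its last event falls after noon, because only the initial behavior determines the period; it is then padded to the length of the longest morning string.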
As an alternative embodiment, the processing module comprises: a second obtaining submodule, configured to obtain behavior data corresponding to each piece of behavior information in the behavior information string in each time period, where the behavior data includes at least one of the following: an identification of the behavior information, interaction data, and the time interval to the next piece of behavior information; and a composition submodule, configured to replace the behavior information with the behavior data to form the behavior vectors of the target object in a plurality of different time periods.
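The replacement of behavior information by behavior data might look like the following Python sketch. The flat layout (identifier, interaction value, interval to the next behavior), the use of 0 for the last interval, and the `behavior_ids` mapping are assumptions; the embodiment only requires that at least one of the three kinds of behavior data be used.

```python
def vectorize(string, behavior_ids, pad_id=0):
    """Turn a list of (timestamp, behavior, value) triples into a flat vector.

    Each behavior contributes three numbers: its identifier, its interaction
    value, and the time interval to the next behavior (0 for the last one).
    """
    vector = []
    for i, (timestamp, behavior, value) in enumerate(string):
        if i + 1 < len(string):
            gap = string[i + 1][0] - timestamp
        else:
            gap = 0
        # Unknown behaviors (e.g. padding) fall back to pad_id.
        vector.extend([behavior_ids.get(behavior, pad_id), value, gap])
    return vector

# Hypothetical identifiers and a two-step interaction (timestamps in seconds).
behavior_ids = {"turn on air conditioner": 1, "set temperature": 2}
string = [(0, "turn on air conditioner", 0), (20, "set temperature", 26)]
vector = vectorize(string, behavior_ids)
```

Because strings in the same time period have already been aligned to the same length, the resulting vectors share one dimensionality and can be compared by distance during clustering.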
As an alternative embodiment, the determining module includes: a clustering submodule, configured to cluster the behavior vectors in the same time period to obtain a clustering center corresponding to each time period; a selection submodule, configured to select a plurality of candidate vectors whose distances from the clustering center are smaller than a preset distance; a third obtaining submodule, configured to obtain the mean value of the selected candidate vectors to obtain a mean vector; and a conversion submodule, configured to convert the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
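The selection, averaging, and conversion steps can be sketched in Python as follows. The embodiment does not say how an averaged item is mapped back to behavior information; rounding each item to the nearest known identifier is an assumption, as are the function and parameter names.

```python
import math

def habitual_behaviors(vectors, center, max_distance, id_to_behavior):
    """Average the vectors near a cluster center and decode the result.

    Each vector is a flat list of behavior identifiers.
    """
    # Candidate vectors: strictly closer to the center than max_distance.
    candidates = [v for v in vectors if math.dist(v, center) < max_distance]
    if not candidates:
        return []
    n = len(candidates)
    mean_vector = [sum(v[i] for v in candidates) / n for i in range(len(center))]
    # Decoding assumption: map each item to the nearest known identifier.
    known = sorted(id_to_behavior)
    return [id_to_behavior[min(known, key=lambda k: abs(k - x))]
            for x in mean_vector]

id_to_behavior = {1: "turn on", 2: "set temperature", 3: "turn off"}
vectors = [[1, 2], [1, 2], [1, 3], [3, 1]]
habit = habitual_behaviors(vectors, center=[1, 2.2], max_distance=1.0,
                           id_to_behavior=id_to_behavior)
```

In this example the vector `[3, 1]` lies too far from the center and is ignored; the remaining three vectors average to roughly `[1, 2.33]`, which decodes to "turn on" followed by "set temperature".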
As an alternative embodiment, the apparatus further comprises: a rejecting module, configured to discard, after the behavior information of the target object within the preset time range has been acquired, the behavior information whose occurrence frequency is lower than a preset frequency.
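A short Python sketch of this filtering step is given below. Treating "occurrence frequency" as a raw count and the names `drop_rare` and `min_count` are assumptions for illustration.

```python
from collections import Counter

def drop_rare(behaviors, min_count):
    """Keep only behaviors that occur at least min_count times, preserving order."""
    counts = Counter(behaviors)
    return [b for b in behaviors if counts[b] >= min_count]

# "c" occurs only once and is discarded.
kept = drop_rare(["a", "b", "a", "c", "a", "b"], min_count=2)
```

Removing rare behaviors before the strings are formed keeps one-off interactions from distorting the clusters.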
As an alternative embodiment, the behavior information includes voice interaction behavior information between the target object and the home appliance.
Example 3
According to an embodiment of the present invention, a storage medium is provided, and the storage medium includes a stored program, wherein when the program runs, a device in which the storage medium is located is controlled to execute the data processing method according to embodiment 1.
Example 4
According to an embodiment of the present invention, a processor is provided. The processor is configured to run a program, wherein the program, when run, executes the data processing method according to embodiment 1.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (11)

1. A method for processing data, comprising:
acquiring behavior information of a target object, and forming a plurality of behavior information strings based on the execution time of the behavior information;
vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods;
and determining the habitual behavior information of the target object in the different time periods by clustering the behavior vectors of the target object in the different time periods.
2. The method of claim 1, wherein acquiring behavior information of a target object and forming a plurality of behavior information strings based on the execution time of the behavior information comprises:
acquiring behavior information of the target object in a preset time range, wherein the preset time range comprises a plurality of time periods;
connecting the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results;
and segmenting the connection result to obtain a behavior information string in each time period.
3. The method according to claim 2, wherein segmenting the connection result to obtain the behavior information string in each time period comprises:
acquiring a time difference between two adjacent pieces of behavior information;
and if the time difference is greater than a preset time length, splitting the two adjacent pieces of behavior information apart.
4. The method according to claim 1, wherein before vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, the method further comprises:
determining the time period to which the execution time of the initial behavior information in the behavior information string belongs as the time period to which the behavior information string belongs;
and performing alignment completion processing on the behavior information strings in each time period to ensure that the behavior information strings in the same time period have the same dimensionality.
5. The method according to claim 4, wherein vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods comprises:
acquiring behavior data corresponding to each piece of behavior information in the behavior information string in each time period, wherein the behavior data comprises at least one of the following items: an identification of the behavior information, interaction data, and a time interval to the next piece of behavior information;
and replacing the behavior information with the behavior data to form a behavior vector of the target object in a plurality of different time periods.
6. The method of claim 1, wherein determining the habitual behavior information of the target object in the different time periods by clustering the behavior vectors of the target object in the different time periods comprises:
clustering the behavior vectors in the same time period to obtain a clustering center corresponding to each time period;
selecting a plurality of candidate vectors with the distance from the clustering center smaller than a preset distance;
obtaining the mean value of the selected candidate vectors to obtain a mean value vector;
and converting the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
7. The method according to claim 2, wherein after acquiring the behavior information of the target object within the preset time range, the method further comprises: eliminating the behavior information whose occurrence frequency is lower than a preset frequency.
8. The method according to claim 1, wherein the behavior information includes voice interaction behavior information between the target object and a home appliance.
9. An apparatus for processing data, comprising:
the acquisition module is used for acquiring the behavior information of the target object and forming a plurality of behavior information strings based on the execution time of the behavior information;
the processing module is used for vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods;
and the determining module is used for determining the habitual behavior information of the target object in the different time periods by clustering the behavior vectors of the target object in the different time periods.
10. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device where the storage medium is located is controlled to execute the data processing method of any one of claims 1 to 8.
11. A processor, characterized in that the processor is configured to run a program, wherein the program is configured to execute a method for processing data according to any one of claims 1 to 8 when the program is run.
CN202010049536.8A 2020-01-16 2020-01-16 Data processing method and device Pending CN111274462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010049536.8A CN111274462A (en) 2020-01-16 2020-01-16 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010049536.8A CN111274462A (en) 2020-01-16 2020-01-16 Data processing method and device

Publications (1)

Publication Number Publication Date
CN111274462A true CN111274462A (en) 2020-06-12

Family

ID=71000944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010049536.8A Pending CN111274462A (en) 2020-01-16 2020-01-16 Data processing method and device

Country Status (1)

Country Link
CN (1) CN111274462A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107426177A (en) * 2017-06-13 2017-12-01 努比亚技术有限公司 A kind of user behavior clustering method and terminal, computer-readable recording medium
CN108470034A (en) * 2018-02-01 2018-08-31 百度在线网络技术(北京)有限公司 A kind of smart machine service providing method and system
CN109933502A (en) * 2019-01-23 2019-06-25 平安科技(深圳)有限公司 Electronic device, the processing method of user operation records and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112057079A (en) * 2020-08-07 2020-12-11 中国科学院深圳先进技术研究院 Behavior quantification method and terminal based on state and map
WO2022027590A1 (en) * 2020-08-07 2022-02-10 中国科学院深圳先进技术研究院 State and graph-based behavior quantification method and terminal
CN112883257A (en) * 2021-01-11 2021-06-01 北京达佳互联信息技术有限公司 Behavior sequence data processing method and device, electronic equipment and storage medium
WO2022148186A1 (en) * 2021-01-11 2022-07-14 北京达佳互联信息技术有限公司 Behavioral sequence data processing method and apparatus
CN112883257B (en) * 2021-01-11 2024-01-05 北京达佳互联信息技术有限公司 Behavior sequence data processing method and device, electronic equipment and storage medium
CN115204322A (en) * 2022-09-16 2022-10-18 成都新希望金融信息有限公司 Behavioral link abnormity identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200612