WO2019120024A1 - User gender identification method, apparatus, storage medium, and electronic device - Google Patents

User gender identification method, apparatus, storage medium, and electronic device

Info

Publication number
WO2019120024A1
Authority
WO
WIPO (PCT)
Prior art keywords
gender
feature set
user
similarity
reference feature
Prior art date
Application number
PCT/CN2018/116713
Other languages
French (fr)
Chinese (zh)
Inventor
陈岩
刘耀勇
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2019120024A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9535 - Search customisation based on user profiles and personalisation

Definitions

  • the present application relates to the field of terminal technologies, and in particular, to a user gender identification method, apparatus, storage medium, and electronic device.
  • the embodiment of the present application provides a user gender identification method, device, storage medium, and electronic device, which can accurately identify the gender of the user.
  • an embodiment of the present application provides a user gender identification method, including:
  • an embodiment of the present application provides a user gender identification apparatus, including:
  • a first feature acquiring module, configured to acquire gender-indicative multi-dimensional features of a plurality of sample users during application use, and obtain sample feature sets of the plurality of sample users;
  • a feature set generating module, configured to obtain the average feature value of each same-type feature across the plurality of sample feature sets, to obtain a gender reference feature set;
  • a second feature acquiring module, configured to acquire gender-indicative multi-dimensional features of an unknown gender user during application use, and obtain a feature set of the unknown gender user;
  • a user gender identification module, configured to acquire the similarity between the feature set and the gender reference feature set, and identify the gender of the unknown gender user according to the similarity.
  • a storage medium provided by an embodiment of the present application has a computer program stored thereon, and when the computer program is run on a computer, the computer is caused to perform a user gender identification method according to any embodiment of the present application.
  • an electronic device provided by an embodiment of the present application includes a processor and a memory, where the memory stores a computer program, and the processor, by calling the computer program, executes the user gender identification method provided in any embodiment of the present application.
  • FIG. 1 is a schematic diagram of an application scenario of a user gender identification method according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a user gender identification method provided by an embodiment of the present application.
  • FIG. 3 is another schematic flowchart of a user gender identification method provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a user gender identification apparatus according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 6 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
  • references to "an embodiment" herein mean that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application.
  • the appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
  • the embodiment of the present application provides a user gender identification method, including:
  • the gender reference feature set includes a male reference feature set and a female reference feature set
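  • the acquiring the similarity of the feature set and the gender reference feature set, and identifying the gender of the unknown gender user according to the similarity, includes: obtaining a first similarity between the feature set and the male reference feature set, and a second similarity between the feature set and the female reference feature set;
  • when the first similarity is greater than the second similarity, the unknown gender user is identified as a male user; when the first similarity is less than the second similarity, the unknown gender user is identified as a female user.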
  • when the gender reference feature set is a male reference feature set, the similarity of the feature set and the male reference feature set is obtained as the distance between them, and the gender of the unknown gender user is identified according to that distance: when the distance lies within a first preset distance interval, the unknown gender user is identified as a male user; otherwise the unknown gender user is identified as a female user.
  • when the gender reference feature set is a female reference feature set, the similarity of the feature set and the female reference feature set is obtained as the distance between them, and the gender of the unknown gender user is identified according to that distance: when the distance lies within a second preset distance interval, the unknown gender user is identified as a female user; otherwise the unknown gender user is identified as a male user.
  • the step of acquiring the gender-indicative multi-dimensional features of the plurality of sample users during application use, and obtaining the sample feature sets of the plurality of sample users, comprises:
  • the similarity between the feature set and the gender reference feature set includes: a distance between the feature set and the gender reference feature set.
  • the step of obtaining the similarity between the feature set and the gender reference feature set comprises:
  • the distance between the feature set and the gender reference feature set is calculated by the following formula:
  • l represents the distance between the feature set and the gender reference feature set
  • Xn represents a one-dimensional feature in the gender reference feature set.
  • the step of acquiring the gender-indicative multi-dimensional features of the unknown gender user during application use comprises: collecting those features at a preset frequency within a historical time period.
  • the gender reference feature set includes a male reference feature set and a female reference feature set; the step of obtaining the similarity between the feature set and the gender reference feature set, and identifying the gender of the unknown gender user according to the similarity, includes:
  • when the first similarity is less than the second similarity, identifying the unknown gender user as a female user
  • the embodiment of the present application provides a user gender identification method. The execution subject of the method may be the user gender identification device provided by the embodiment of the present application, or an electronic device integrated with the user gender identification device, wherein the user gender identification device can be implemented in hardware or software.
  • the electronic device may be a device such as a smart phone, a tablet computer, a palmtop computer, a notebook computer, or a desktop computer.
  • FIG. 1 is a schematic diagram of an application scenario of a user gender identification method according to an embodiment of the present application.
  • the user identity recognition device is integrated into an electronic device as an example, and the electronic device can obtain a charging feature set when charging behavior occurs, and obtain a plurality of charging feature sets; performing similarity recognition on the plurality of charging feature sets to obtain a similar charging feature set including a plurality of similar charging feature sets; predicting a next charging behavior according to the similar charging feature set; and predicting a next charging behavior Determine the corresponding performance adjustment mode; perform performance adjustment operations according to the determined performance adjustment mode.
  • gender-indicative multi-dimensional features of a plurality of sample users during application use may be acquired over a historical time period as samples (such as the number of times and the duration for which user b browses male-oriented products in a shopping application, the length of time user b browses male-oriented reading material in a reading application, the number of times and the duration for which user c browses female-oriented products in the shopping application, and the length of time user c browses female-oriented reading material in the reading application), obtaining sample feature sets of the plurality of sample users;
  • the average feature value of each same-type feature across the sample feature sets is then obtained (for example, averaging over the users the feature "length of time the user browses male-oriented reading material in the reading application") to obtain the gender reference feature set; in other words, the gender reference feature set is the set of the average values of the various types of features;
  • the gender reference feature set describes the multi-dimensional characteristics typical of male users, or those typical of female users; gender-indicative multi-dimensional features of the unknown gender user during application use are then obtained (such as obtaining the length of time that user a browses male/female reading
  • FIG. 2 is a schematic flowchart of a user gender identification method according to an embodiment of the present application.
  • the specific process of the user gender identification method provided by the embodiment of the present application may be as follows:
  • the multi-dimensional feature has a dimension of a certain length, that is, the multi-dimensional feature is composed of a plurality of features.
  • the multi-dimensional feature may include gender-indicative behavior characteristics of the user while using applications, for example: the number of times and the duration for which the user browses male-oriented goods (such as men's wear, razors, etc.) in a shopping application; the number of times and the duration for which the user browses female-oriented goods (such as cosmetics, women's clothing, etc.) in the shopping application; the length of time the user reads male-oriented material in a reading application; and the length of time the user reads female-oriented material in the reading application.
  • the multi-dimensional feature may further include behavior characteristics related to the electronic device itself, for example, the number of times the user invokes the front camera from a camera application, the number of times the user invokes the rear camera from a camera application, and the like.
  • multiple sample feature sets are obtained, one for each sample user.
  • the multi-dimensional features in each sample feature set may be acquired according to a preset frequency within a historical time period.
  • the historical time period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that, for any sample user, the multi-dimensional features gathered at each collection within the historical time period are accumulated by category (for example, the values of the feature "the user browses male-oriented goods in the shopping application" are accumulated together, and the values of the feature "the user browses female-oriented goods in the shopping application" are accumulated together), giving the sample feature set of that sample user for the historical time period.
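  • the per-category accumulation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation; the function name `accumulate_features` and the feature names in the sample data are hypothetical.

```python
from collections import Counter


def accumulate_features(snapshots):
    """Sum the per-feature values of periodic snapshots into one feature set.

    Each snapshot is a dict mapping a (hypothetical) feature name to the
    value observed in one collection interval.
    """
    totals = Counter()
    for snapshot in snapshots:
        for name, value in snapshot.items():
            totals[name] += value  # same-category values are added together
    return dict(totals)


# Two collection intervals for one user (illustrative values only).
snapshots = [
    {"male_goods_views": 2, "female_goods_views": 0},
    {"male_goods_views": 1, "female_goods_views": 3},
]
feature_set = accumulate_features(snapshots)
```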
  • the sample feature set (X1, X2, ..., Xn) corresponding to the male user b is obtained, where Xn is a one-dimensional feature of user b.
  • the multi-dimensional features that each known gender user generates while using applications on their electronic device may be collected by a server, and the electronic device may then obtain them from the server when performing gender identification.
  • a known gender user is a user who has provided gender information when using an electronic device, for example, a user who provided gender information when registering an account.
  • each sample feature set may be labeled to obtain a sample label for each sample feature set. Since the embodiment of the present application aims to identify the gender of the unknown gender user, the sample labels include "male" and "female", i.e. the sample categories include male and female.
  • the labeling may follow the gender information of the known gender user; for example, the sample feature set corresponding to the male user b may be labeled "male", and the sample feature set corresponding to the female user c may be labeled "female".
  • the sample feature set may also be labeled using a number, such as the number "1" for "male" and the number "0" for "female", or vice versa.
  • "acquiring gender-indicative multi-dimensional features of a plurality of sample users during application use, and obtaining sample feature sets of the plurality of sample users" may include:
  • normalizing the acquired multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
  • normalization is a way of simplifying calculation: a dimensional quantity is transformed into a dimensionless quantity, that is, a scalar. The specific normalization method can be selected by a person skilled in the art according to actual needs, and is not specifically limited in this application.
  • the embodiment of the present application normalizes each dimension of the multi-dimensional feature, mapping the original feature value to a value between 0 and 1; the normalized multi-dimensional features then constitute the sample feature set of male user b.
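  • since the application leaves the normalization method open, the following sketch assumes min-max scaling as one possible choice for mapping each dimension into the range 0 to 1. The function name `min_max_normalize` and the handling of constant-valued dimensions (mapped to 0.0) are assumptions made for illustration.

```python
def min_max_normalize(feature_sets):
    """Scale every dimension of the given feature vectors into [0, 1].

    feature_sets is a list of equal-length numeric lists, one per user.
    Min-max scaling is an assumed choice; the text does not prescribe one.
    """
    n = len(feature_sets[0])
    lows = [min(fs[i] for fs in feature_sets) for i in range(n)]
    highs = [max(fs[i] for fs in feature_sets) for i in range(n)]
    normalized = []
    for fs in feature_sets:
        row = []
        for value, low, high in zip(fs, lows, highs):
            span = high - low
            # A dimension with no spread carries no information; map it to 0.0.
            row.append((value - low) / span if span else 0.0)
        normalized.append(row)
    return normalized


normalized = min_max_normalize([[0, 10], [5, 10], [10, 10]])
```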
  • the obtained sample feature sets are gender-labeled, so the average feature values can be calculated according to the gender label of each sample feature set. For example, for the sample feature sets labeled "male", the average value of each feature across the "male" sample feature sets can be calculated; likewise, for the sample feature sets labeled "female", the average value of each feature across the "female" sample feature sets can be calculated.
  • depending on which labeled sample feature sets are averaged, the obtained gender reference feature set differs.
  • for example, when only the "male" sample feature sets are averaged, the obtained gender reference feature set is a male reference feature set characterizing "male"; when only the "female" sample feature sets are averaged, the obtained gender reference feature set is a female reference feature set characterizing "female".
  • when the average feature value of each same-type feature is calculated for both the "male" sample feature sets and the "female" sample feature sets, a male reference feature set characterizing "male" and a female reference feature set characterizing "female" are respectively obtained.
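  • the label-wise averaging can be sketched as below: sample feature sets are grouped by their gender label and each dimension is averaged within the group. The function name `reference_feature_sets` and the sample values are hypothetical, and the feature vectors are assumed to be already normalized.

```python
def reference_feature_sets(samples):
    """Average labeled feature vectors per label.

    samples is a list of (label, feature_vector) pairs; the result maps each
    label to the element-wise mean of its vectors.
    """
    sums, counts = {}, {}
    for label, features in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


samples = [
    ("male", [1.0, 0.0]),
    ("male", [0.5, 0.5]),
    ("female", [0.25, 0.75]),
]
refs = reference_feature_sets(samples)
```

Passing only the "male" samples yields just a male reference feature set, and likewise for "female", which matches the three cases described above.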
  • the feature set of the unknown gender user may be collected according to a preset frequency within a historical time period.
  • the historical time period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour.
  • the multi-dimensional features of the unknown gender user gathered at each collection within the historical time period are accumulated by category (for example, the values of the feature "the user browses male-oriented goods in the shopping application" are accumulated together, and the values of the feature "the user browses female-oriented goods in the shopping application" are accumulated together), giving the feature set of the unknown gender user for the historical time period.
  • the feature set (X1, X2, ..., Xn) of user a is obtained by acquiring the gender-indicative multi-dimensional features of user a during application use, wherein Xn is a one-dimensional feature of user a.
  • the historical time period used when collecting the multi-dimensional features of the unknown gender user is the same as that used for the sample users (i.e., the known gender users). For example, when the multi-dimensional features of the sample users are collected over 7 days, the multi-dimensional features of the unknown gender user are also collected over 7 days. In this way, the gender reference feature set obtained from the sample feature sets covers the same time span as the feature set of the unknown gender user, which improves the accuracy of gender identification.
  • depending on which gender reference feature set was obtained in the previous step, the way of identifying the gender of the unknown gender user according to the similarity also differs.
  • when the previously obtained gender reference feature set includes a male reference feature set and a female reference feature set, "acquiring the similarity between the feature set and the gender reference feature set, and identifying the gender of the unknown gender user according to the similarity" can include:
  • when the first similarity, between the feature set and the male reference feature set, is greater than the second similarity, between the feature set and the female reference feature set, the unknown gender user is identified as a male user; otherwise the unknown gender user is identified as a female user.
  • the manner of obtaining the first similarity is the same as the manner of obtaining the second similarity.
  • the distance between the feature set and the gender reference feature set is used to describe the similarity between them: the larger the distance, the smaller the similarity; the smaller the distance, the greater the similarity.
  • the method for calculating the distance between the feature set and the gender reference feature set is not limited in the embodiment of the present application, and a suitable calculation method may be selected by a person skilled in the art according to actual needs.
  • the distance between the feature set and the gender reference feature set is calculated according to the following formula:
  • l represents the distance between the feature set of the unknown gender user and the gender reference feature set
  • Xn represents the one-dimensional feature in the gender reference feature set.
  • X1 represents the feature in the gender reference feature set "the length of time the user reads the partial male reading in the reading application”
  • a feature in a feature set representing an unknown gender user "the length of time a user reads a partial male reading in a reading application.”
  • n is a positive integer greater than 2.
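  • the application's own formula is not reproduced in this text, and the application states that any suitable distance may be chosen. The sketch below therefore assumes the Euclidean distance, l = sqrt((Y1 - X1)^2 + ... + (Yn - Xn)^2), where the symbols Y1..Yn for the unknown gender user's features are introduced here for illustration only, as is the function name `feature_distance`.

```python
import math


def feature_distance(features, reference):
    """Euclidean distance between a user's feature set and a reference set.

    Euclidean distance is an assumed choice; the text allows other metrics.
    """
    if len(features) != len(reference):
        raise ValueError("feature sets must have the same number of dimensions")
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(reference, features)))


l = feature_distance([0.0, 0.0], [3.0, 4.0])
```

A smaller returned value corresponds to a greater similarity, consistent with the relationship between distance and similarity stated above.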
  • after the first similarity, between the feature set of the unknown gender user and the male reference feature set, and the second similarity, between the feature set of the unknown gender user and the female reference feature set, have been obtained, the two are compared in magnitude, and the gender of the unknown gender user is identified according to the comparison result.
  • when the first similarity is greater than the second similarity, the unknown gender user is more similar to male users and is identified as a male user; when the first similarity is less than the second similarity, the unknown gender user is more similar to female users and is identified as a female user.
  • when the first similarity and the second similarity are equal, the feature set collected for the unknown gender user is not sufficient to support gender identification, and no identification result is produced.
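  • the comparison of the two similarities can be sketched as follows, assuming Euclidean distance as the similarity measure (smaller distance means greater similarity). The function name `identify_gender` is hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math


def identify_gender(features, male_ref, female_ref):
    """Return "male", "female", or None when the two distances are equal."""
    d_male = math.dist(features, male_ref)      # small distance = high first similarity
    d_female = math.dist(features, female_ref)  # small distance = high second similarity
    if d_male < d_female:
        return "male"
    if d_female < d_male:
        return "female"
    return None  # equal similarities: not enough evidence for a result


result = identify_gender([0.75, 0.25], [1.0, 0.0], [0.0, 1.0])
```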
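  • when the previously obtained gender reference feature set is a male reference feature set, "acquiring the similarity of the feature set and the gender reference feature set, and identifying the gender of the unknown gender user according to the similarity" may include:
  • when the distance between the feature set and the male reference feature set lies within a first preset distance interval, the unknown gender user is identified as a male user; otherwise the unknown gender user is identified as a female user.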
  • the distance between the feature set and the male reference feature set is used to describe the similarity between the feature set and the male reference feature set.
  • the method for calculating the distance between the feature set and the male reference feature set is not limited in specific embodiments, and a suitable calculation method may be selected by a person skilled in the art according to actual needs.
  • the distance between the feature set and the male reference feature set is calculated according to the following formula:
  • l represents the distance between the feature set of the unknown gender user and the male reference feature set
  • Xn represents the one-dimensional feature in the male reference feature set.
  • X1 represents the feature in the male reference feature set "the length of time the user reads the partial male reading in the reading application”
  • a feature in a feature set representing an unknown gender user "the length of time a user reads a partial male reading in a reading application.”
  • it is then determined whether the distance lies within the first preset distance interval. Specifically, when the distance is within the first preset distance interval, the unknown gender user tends to be a male user and is identified as a male user; when the distance is outside the first preset distance interval, the unknown gender user does not tend to be a male user and, since a user's gender is either male or female, can be identified as a female user.
  • the distance between each sample feature set labeled "male" and the male reference feature set may be calculated; the maximum of these distances is taken as the right endpoint of the first preset distance interval, and the left endpoint of the first preset distance interval is set to zero.
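  • the single-reference variant can be sketched as below: the interval runs from zero to the largest distance between any "male" sample and the male reference, and a user whose distance falls inside it is identified as male. Euclidean distance, a closed interval, and the function names are assumptions made for illustration.

```python
import math


def male_distance_interval(male_samples, male_ref):
    """Interval [0, max distance of any "male" sample to the male reference]."""
    return (0.0, max(math.dist(sample, male_ref) for sample in male_samples))


def identify_with_male_reference(features, male_ref, interval):
    """Identify as "male" inside the interval, otherwise as "female"."""
    low, high = interval
    return "male" if low <= math.dist(features, male_ref) <= high else "female"


male_samples = [[1.0, 0.0], [0.5, 0.5]]
male_ref = [0.75, 0.25]  # element-wise mean of the two samples above
interval = male_distance_interval(male_samples, male_ref)
```

The female-reference case is symmetric: the second preset distance interval is built from the "female" samples, and a distance outside it leads to a "male" result.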
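  • when the previously obtained gender reference feature set is a female reference feature set, "acquiring the similarity between the feature set and the gender reference feature set, and identifying the gender of the unknown gender user according to the similarity" may include:
  • when the distance between the feature set and the female reference feature set lies within a second preset distance interval, the unknown gender user is identified as a female user; otherwise the unknown gender user is identified as a male user.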
  • the distance between the feature set and the female reference feature set is used to describe the similarity between the feature set and the female reference feature set.
  • the method for calculating the distance between the feature set and the female reference feature set is not specifically limited in the embodiment of the present application, and a suitable calculation manner may be selected by a person skilled in the art according to actual needs.
  • the distance between the feature set and the female reference feature set is calculated according to the following formula:
  • l represents the distance between the feature set of the unknown gender user and the female reference feature set
  • Xn represents the one-dimensional feature in the female reference feature set.
  • X1 represents the feature in the female reference feature set "the length of time the user reads the partial male reading in the reading application”
  • a feature in a feature set representing an unknown gender user "the length of time a user reads a partial male reading in a reading application.”
  • it is then determined whether the distance lies within the second preset distance interval. Specifically, when the distance is within the second preset distance interval, the unknown gender user tends to be a female user and is identified as a female user; when the distance is outside the second preset distance interval, the unknown gender user does not tend to be a female user and, since a user's gender is either male or female, can be identified as a male user.
  • the distance between each sample feature set labeled "female" and the female reference feature set may be calculated; the maximum of these distances is taken as the right endpoint of the second preset distance interval, and the left endpoint of the second preset distance interval is set to zero.
  • the embodiment of the present application first acquires gender-indicative multi-dimensional features of multiple sample users during application use to obtain their sample feature sets; then obtains the average feature value of each same-type feature across the sample feature sets to obtain the gender reference feature set; then acquires gender-indicative multi-dimensional features of the unknown gender user during application use to obtain that user's feature set; and finally obtains the similarity between the feature set and the gender reference feature set and identifies the gender of the unknown gender user according to the similarity, so that the user's gender information can be obtained accurately.
  • the user gender identification method may include:
  • the multi-dimensional feature is a set of user features, across multiple dimensions, that are gender-indicative for a known gender user (such as a male user or a female user) during application use, that is, behavior characteristics with male or female tendencies that the user exhibits while using applications.
  • the multi-dimensional feature has a dimension of a certain length, that is, the multi-dimensional feature is composed of a plurality of features.
  • the multi-dimensional feature may include gender-indicative behavior characteristics of the user while using applications, for example: the number of times and the duration for which the user browses male-oriented goods (such as men's wear, razors, etc.) in a shopping application; the number of times and the duration for which the user browses female-oriented goods (such as cosmetics, women's clothing, etc.) in the shopping application; the length of time the user reads male-oriented material in a reading application; and the length of time the user reads female-oriented material in the reading application.
  • the multi-dimensional feature may further include behavior characteristics related to the electronic device itself, for example, the number of times the user invokes the front camera from a camera application, the number of times the user invokes the rear camera from a camera application, and the like.
  • multiple sample feature sets are obtained, one for each sample user.
  • the multi-dimensional features in each sample feature set may be acquired according to a preset frequency within a historical time period.
  • the historical time period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that, for any sample user, the multi-dimensional features gathered at each collection within the historical time period are accumulated by category (for example, the values of the feature "the user browses male-oriented goods in the shopping application" are accumulated together, and the values of the feature "the user browses female-oriented goods in the shopping application" are accumulated together), giving the sample feature set of that sample user for the historical time period.
  • the sample feature set (X1, X2, ..., Xn) corresponding to the male user b is obtained, where Xn is a one-dimensional feature of user b.
  • the multi-dimensional features that each known gender user generates while using applications on their electronic device may be collected by a server, and the electronic device may then obtain them from the server when performing gender identification.
  • a known gender user is a user who has provided gender information when using an electronic device, for example, a user who provided gender information when registering an account.
  • each sample feature set may be labeled to obtain a sample label for each sample feature set. Since the embodiment of the present application aims to identify the gender of the unknown gender user, the sample labels include "male" and "female", i.e. the sample categories include male and female.
  • the labeling may follow the gender information of the known gender user; for example, the sample feature set corresponding to the male user b may be labeled "male", and the sample feature set corresponding to the female user c may be labeled "female".
  • the sample feature set may also be labeled using a number, such as the number "1" for "male" and the number "0" for "female", or vice versa.
  • normalization is a way of simplifying calculation: a dimensional quantity is transformed into a dimensionless quantity, that is, a scalar. The specific normalization method can be selected by a person skilled in the art according to actual needs, and is not specifically limited in this application.
  • the embodiment of the present application normalizes each dimension of the multi-dimensional feature, mapping the original feature value to a value between 0 and 1; the normalized multi-dimensional features then constitute the sample feature set of male user b.
  • a specific sample feature set can be as shown in Table 1 below, including features of multiple dimensions. It should be noted that the features shown in Table 1 are only examples. In practical applications, the number of features included in a sample feature set may differ from the number shown in Table 1 (for example, it may be smaller), and the specific features may differ from those shown in Table 1; neither is specifically limited herein.
  • Table 1 (only the rows that survive in this text are listed; the dimension number of the beauty-application row is not given):
      Dimension | Characteristic information
      1 | Number of times the user browses male-oriented items (such as men's clothing) in a shopping application
      2 | Length of time the user browses male-oriented items (such as men's clothing) in a shopping application
      3 | Number of times the user browses female-oriented items (such as cosmetics, women's clothing) in a shopping application
      ... | ...
        | Number of times the user uses beauty applications
      11 | Number of times the user plays different types of game applications
      12 | Length of time the user plays different types of game applications
  • the obtained sample feature set is gender-marked, so the calculation of the average feature value can be performed based on the gender flag of the sample feature set. For example, for a sample feature set labeled "male”, the average feature value of each feature in the "male” sample feature set can be calculated; for example, for a sample feature set labeled "female”, it can be calculated The average eigenvalue of each feature in the "female” sample feature set.
  • the obtained gender reference feature sets differ according to which sample feature sets are averaged. For example, when only the average feature values of the "male" sample feature sets are calculated, the obtained gender reference feature set is a male reference feature set characterizing "male"; when only the average feature values of the "female" sample feature sets are calculated, the obtained gender reference feature set is a female reference feature set characterizing "female"; and when the average feature values of the same features are calculated both in the "male" sample feature sets and in the "female" sample feature sets, a male reference feature set characterizing "male" and a female reference feature set characterizing "female" are respectively obtained.
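The per-label averaging described above can be sketched as follows. This is a minimal illustration, assuming each sample feature set is an already-normalized list of values in a fixed dimension order; the function name `build_reference_sets` and the label strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_reference_sets(
    labeled_samples: List[Tuple[str, List[float]]]
) -> Dict[str, List[float]]:
    """Average each feature dimension separately for every gender label."""
    groups: Dict[str, List[List[float]]] = defaultdict(list)
    for label, features in labeled_samples:
        groups[label].append(features)

    reference: Dict[str, List[float]] = {}
    for label, rows in groups.items():
        # zip(*rows) walks the samples dimension by dimension.
        reference[label] = [sum(col) / len(col) for col in zip(*rows)]
    return reference
```

If only "male" samples are supplied, the result contains only a male reference feature set, which matches the single-reference case described in the text.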
  • the feature set of the unknown gender user may be collected according to a preset frequency within a historical time period.
  • the historical time period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour.
  • the multi-dimensional features of the unknown gender user collected each time within the historical time period are accumulated by category (for example, the values collected for the feature "the user browses male-oriented products in the shopping application" are summed, and the values collected for the feature "the user browses female-oriented products in the shopping application" are summed), to obtain the feature set of the unknown gender user for the historical time period.
  • the feature set (X1, X2, ..., Xn) of the corresponding user a is obtained by acquiring the multi-dimensional feature of the user a having gender recognition during the application use, wherein Xn is the user a.
  • the historical time period used when collecting the multi-dimensional features of the unknown gender user is the same as the historical time period used for the sample users (i.e., the users of known gender). For example, when the multi-dimensional features of the sample users are collected over a historical time period of 7 days, the multi-dimensional features of the unknown gender user are also collected over a historical time period of 7 days. In this way, the gender reference feature set obtained by processing the sample feature sets covers the same time span as the feature set of the unknown gender user, which improves the accuracy of the identified gender.
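The collection and accumulation described above can be sketched as follows. This is only an illustration: the observation format (a timestamp paired with a dictionary of per-interval values), the function name `accumulate_features`, and the feature names are all assumptions, and the 7-day default simply mirrors the example in the text.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Observation = Tuple[datetime, Dict[str, float]]


def accumulate_features(
    observations: List[Observation],
    now: datetime,
    window: timedelta = timedelta(days=7),
) -> Dict[str, float]:
    """Sum each feature over the observations that fall inside the window."""
    start = now - window
    totals: Dict[str, float] = {}
    for taken_at, features in observations:
        # Ignore anything collected outside the historical time period.
        if start <= taken_at <= now:
            for name, value in features.items():
                totals[name] = totals.get(name, 0.0) + value
    return totals
```

Using the same `window` for sample users and for the unknown gender user keeps both feature sets on the same time span, as the text requires.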
  • the first similarity between the feature set of the unknown gender user and the male reference feature set is calculated.
  • the second similarity of the feature set of the unknown gender user and the female reference feature set is calculated.
  • the manner of obtaining the first similarity is the same as the manner of obtaining the second similarity.
  • the distance between the feature set and the gender reference feature set is used to describe the similarity between the feature set and the gender reference feature set. The larger the distance, the smaller the similarity, and the smaller the distance, the greater the similarity.
  • the method for calculating the distance between the feature set and the gender reference feature set is not limited in specific embodiments, and a suitable calculation method may be selected by a person skilled in the art according to actual needs.
  • the distance between the feature set and the gender reference feature set is calculated according to the following formula:
  • l represents the distance between the feature set of the unknown gender user and the gender reference feature set
  • Xn represents the one-dimensional feature in the gender reference feature set.
  • for example, X1 represents the feature "the length of time the user reads male-oriented reading material in a reading application" in the gender reference feature set
  • the corresponding term represents the same feature, "the length of time the user reads male-oriented reading material in a reading application", in the feature set of the unknown gender user.
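The formula itself is not reproduced in this text, so the following sketch should be read as one possible distance and not as the formula of the application. It assumes the Euclidean distance between the two feature sets, which satisfies the stated property that a larger distance means a smaller similarity; the function name `euclidean_distance` is hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(user_set: Sequence[float], reference_set: Sequence[float]) -> float:
    """Distance l between a user's feature set and a gender reference feature set.

    Assumption: Euclidean distance over dimensions in the same order.
    """
    if len(user_set) != len(reference_set):
        raise ValueError("feature sets must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(reference_set, user_set)))
```

Any other metric chosen by a person skilled in the art, as the text allows, could be substituted without changing the surrounding steps.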
  • after the first similarity between the feature set of the unknown gender user and the male reference feature set and the second similarity between the feature set of the unknown gender user and the female reference feature set are obtained, the magnitudes of the first similarity and the second similarity are compared, and the gender of the unknown gender user is identified according to the comparison result.
  • if the first similarity is greater than the second similarity, the unknown gender user is identified as a male user; otherwise, the unknown gender user is identified as a female user.
  • when the first similarity is greater than the second similarity, the unknown gender user is more similar to male users, and the unknown gender user is identified as a male user; when the first similarity is less than the second similarity, the unknown gender user is more similar to female users, and the unknown gender user is identified as a female user.
  • if the first similarity and the second similarity are equal, the feature set of the unknown gender user collected so far is not sufficient to support identification of the gender, and no identification result is produced.
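The comparison above, including the case with no result, can be sketched as follows. The conversion from distance to similarity (`1 / (1 + distance)`) is an assumption chosen only so that a larger distance yields a smaller similarity; both function names are hypothetical.

```python
from typing import Optional


def similarity_from_distance(distance: float) -> float:
    """Assumed mapping: similarity shrinks as distance grows."""
    return 1.0 / (1.0 + distance)


def identify_gender(first_similarity: float, second_similarity: float) -> Optional[str]:
    """Compare similarity to the male (first) and female (second) reference sets."""
    if first_similarity > second_similarity:
        return "male"
    if first_similarity < second_similarity:
        return "female"
    # Equal similarities: the collected features do not support a decision.
    return None
```

Returning `None` on a tie follows the variant in which equal similarities give no identification result, as opposed to the variant that treats every non-greater case as female.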
  • the embodiment of the present application first acquires the gender-identifying multi-dimensional features of multiple sample users during application use to obtain the sample feature sets of the multiple sample users; then obtains the average feature value of each same-type feature across the multiple sample feature sets to obtain the gender reference feature set; then acquires the gender-identifying multi-dimensional features of the unknown gender user during application use to obtain the feature set of the unknown gender user; and finally obtains the similarity between the feature set and the gender reference feature set and predicts the gender of the unknown gender user according to the similarity, thereby accurately identifying the user's gender and obtaining the user's gender information.
  • the embodiment of the present application further provides a user gender identification device, including:
  • a first feature acquiring module configured to acquire a multi-dimensional feature of the plurality of sample users that has gender recognition during use of the application, and obtain a sample feature set of the plurality of sample users;
  • a feature set generating module configured to obtain an average feature value of a similar feature in the plurality of sample feature sets, to obtain a gender reference feature set
  • a second feature acquisition module configured to acquire a multi-dimensional feature that is gender-recognized by an unknown gender user during application use, and obtain a feature set of the unknown gender user;
  • the user gender identification module is configured to acquire the similarity between the feature set and the gender reference feature set, and identify the gender of the unknown gender user according to the similarity.
  • the gender reference feature set includes a male reference feature set and a female reference feature set
  • the user gender identification module is further configured to:
  • the unknown gender user is identified as a male user; otherwise, the unknown gender user is identified as a female user.
  • the gender reference feature set is a male reference feature set
  • the user gender identification module is further configured to:
  • the unknown gender user is identified as a male user, otherwise the unknown gender user is identified as a female user.
  • the gender reference feature set is a female reference feature set
  • the user gender identification module is further configured to:
  • the unknown gender user is identified as a female user, otherwise the unknown gender user is identified as a male user.
  • the first feature acquisition module is further configured to:
  • the similarity between the feature set and the gender reference feature set includes: a distance between the feature set and the gender reference feature set.
  • the user gender identification module is configured to calculate a distance between the feature set and the gender reference feature set by:
  • l represents the distance between the feature set and the gender reference feature set
  • Xn represents a one-dimensional feature in the gender reference feature set.
  • the second feature acquisition module is configured to: collect, during a historical time period, a multi-dimensional feature of an unknown gender user having gender recognition during application use according to a preset frequency.
  • the set of gender reference features comprises a set of male reference features and a set of female reference features
  • the user gender identification module is configured to:
  • if the first similarity is less than the second similarity, identify the unknown gender user as a female user
  • FIG. 4 is a schematic structural diagram of a user gender identification apparatus according to an embodiment of the present application.
  • the user gender identification device is applied to the electronic device, and the user gender recognition device includes a first feature acquisition module 401, a feature set generation module 402, a second feature acquisition module 403, and a user gender identification module 404 as follows:
  • the first feature obtaining module 401 is configured to obtain a multi-dimensional feature that the plurality of sample users have gender recognition during the application, and obtain a sample feature set of the plurality of sample users;
  • the feature set generation module 402 is configured to obtain an average feature value of a similar feature in the plurality of sample feature sets, to obtain a gender reference feature set;
  • the second feature acquisition module 403 is configured to obtain a multi-dimensional feature that is gender-recognized by an unknown gender user during use, and obtain a feature set of an unknown gender user;
  • the user gender identification module 404 is configured to obtain the similarity between the feature set and the gender reference feature set, and identify the gender of the unknown gender user according to the similarity.
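The division of work among modules 401 to 404 can be sketched as a single class with one method per module. This is a loose illustration, not the structure of the apparatus itself: the class and method names are hypothetical, feature sets are assumed to be already normalized lists in a fixed dimension order, and Euclidean distance (via `math.dist`, Python 3.8+) is an assumed similarity measure.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

LabeledSample = Tuple[str, Sequence[float]]


class GenderIdentificationDevice:
    def __init__(self) -> None:
        self.reference_sets: Dict[str, List[float]] = {}

    def acquire_sample_feature_sets(
        self, raw_samples: Sequence[LabeledSample]
    ) -> List[Tuple[str, List[float]]]:
        """Role of module 401: collect labeled sample feature sets."""
        return [(label, list(features)) for label, features in raw_samples]

    def generate_reference_sets(self, sample_sets: Sequence[LabeledSample]) -> None:
        """Role of module 402: average same-type features per gender label."""
        grouped: Dict[str, List[Sequence[float]]] = {}
        for label, features in sample_sets:
            grouped.setdefault(label, []).append(features)
        self.reference_sets = {
            label: [sum(column) / len(column) for column in zip(*rows)]
            for label, rows in grouped.items()
        }

    def acquire_user_feature_set(self, raw_features: Sequence[float]) -> List[float]:
        """Role of module 403: collect the unknown gender user's feature set."""
        return list(raw_features)

    def identify(self, user_set: Sequence[float]) -> Optional[str]:
        """Role of module 404: compare distances to both reference sets."""
        d_male = math.dist(user_set, self.reference_sets["male"])
        d_female = math.dist(user_set, self.reference_sets["female"])
        if d_male < d_female:
            return "male"
        if d_female < d_male:
            return "female"
        return None  # equal distances: no identification result
```

Keeping each module as a separate method mirrors the statement that the modules may be implemented as independent entities or combined into one.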
  • the gender reference feature set includes a male reference feature set and a female reference feature set
  • the user gender identification module 404 can be used to:
  • the unknown gender user is identified as a male user, otherwise the unknown gender user is identified as a female user.
  • the gender reference feature set is a male reference feature set
  • the user gender identification module 404 can be used to:
  • the unknown gender user is identified as a male user; otherwise, the unknown gender user is identified as a female user.
  • the gender reference feature set is a female reference feature set
  • the user gender identification module 404 can be used to:
  • the unknown gender user is identified as a female user; otherwise, the unknown gender user is identified as a male user.
  • the first feature obtaining module 401 can be used to:
  • the acquired multi-dimensional features are normalized to obtain a sample feature set of multiple sample users.
  • the similarity between the feature set and the gender reference feature set includes: a distance between the feature set and the gender reference feature set.
  • the user gender identification module 404 is configured to calculate a distance between the feature set and the gender reference feature set by using the following formula:
  • l represents the distance between the feature set and the gender reference feature set
  • Xn represents a one-dimensional feature in the gender reference feature set.
  • the second feature acquiring module 403 is configured to: collect, according to a preset frequency, a multi-dimensional feature that the user of the unknown gender has gender recognition during the application in the historical time period.
  • the gender reference feature set includes a male reference feature set and a female reference feature set
  • the user gender identification module 404 is configured to:
  • if the first similarity is less than the second similarity, identify the unknown gender user as a female user
  • the terms "module" and "unit" as used herein may be regarded as software objects executed on the computing system.
  • the different components, modules, engines, and services described herein can be considered as implementation objects on the computing system.
  • the apparatus and method described herein may be implemented in software, and may of course be implemented in hardware, all of which are within the scope of the present application.
  • the steps performed by each module in the user gender identification device may refer to the method steps described in the foregoing method embodiments.
  • the user gender identification device can be integrated in an electronic device such as a mobile phone, a tablet, or the like.
  • the foregoing modules may be implemented as an independent entity, or may be implemented in any combination, and may be implemented as the same entity or a plurality of entities.
  • the foregoing units refer to the foregoing embodiments, and details are not described herein again.
  • in the user gender identification device of this embodiment, the first feature acquisition module 401 acquires the gender-identifying multi-dimensional features of multiple sample users during application use to obtain the sample feature sets of the multiple sample users; the feature set generation module 402 obtains the average feature value of each same-type feature across the multiple sample feature sets to obtain the gender reference feature set; the second feature acquisition module 403 acquires the gender-identifying multi-dimensional features of the unknown gender user during application use to obtain the feature set of the unknown gender user; and the user gender identification module 404 obtains the similarity between the feature set and the gender reference feature set and predicts the gender of the unknown gender user according to the similarity, so that the gender of the user is accurately identified and the gender information of the user is obtained.
  • the electronic device 500 includes a processor 501 and a memory 502.
  • the processor 501 is electrically connected to the memory 502.
  • the processor 501 is the control center of the electronic device 500. It connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading computer programs stored in the memory 502 and calling data stored in the memory 502, thereby achieving accurate identification of the user's gender.
  • the memory 502 can be used to store software programs and modules, and the processor 501 executes various functional applications and data processing by running computer programs and modules stored in the memory 502.
  • the memory 502 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, a computer program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created through the use of the electronic device, and the like.
  • memory 502 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, memory 502 can also include a memory controller to provide processor 501 access to memory 502.
  • the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 according to the following steps, and the processor 501 runs the computer programs stored in the memory 502 to implement various functions, as follows:
  • the similarity between the feature set and the gender reference feature set is obtained, and the gender of the unknown gender user is identified according to the similarity.
  • where the gender reference feature set includes a male reference feature set and a female reference feature set, when acquiring the similarity between the feature set and the gender reference feature set and identifying the gender of the unknown gender user according to the similarity, the processor 501 may perform the following steps:
  • the unknown gender user is identified as a male user, otherwise the unknown gender user is identified as a female user.
  • the processor 501 may specifically perform the following steps:
  • the unknown gender user is identified as a male user; otherwise, the unknown gender user is identified as a female user.
  • the processor 501 may specifically perform the following steps:
  • the unknown gender user is identified as a female user; otherwise, the unknown gender user is identified as a male user.
  • the processor 501 may further perform the following steps when acquiring a multi-dimensional feature that the plurality of sample users have gender-recognition during application use, and obtaining a sample feature set of the plurality of sample users:
  • the acquired multi-dimensional features are normalized to obtain a sample feature set of multiple sample users.
  • the similarity between the feature set and the gender reference feature set includes: a distance between the feature set and the gender reference feature set
  • the processor 501 when acquiring the similarity between the feature set and the gender reference feature set, the processor 501 may specifically perform the following steps:
  • the distance between the feature set and the gender reference feature set is calculated by the following formula:
  • l represents the distance between the feature set and the gender reference feature set
  • Xn represents a one-dimensional feature in the gender reference feature set.
  • when acquiring the gender-identifying multi-dimensional features of the unknown gender user during application use, the processor 501 may perform the following steps:
  • the gender-identifying multi-dimensional features of the unknown gender user during application use are collected according to the preset frequency within the historical time period.
  • where the gender reference feature set includes a male reference feature set and a female reference feature set, when acquiring the similarity between the feature set and the gender reference feature set and identifying the gender of the unknown gender user according to the similarity, the processor 501 may further perform the following steps:
  • if the first similarity is less than the second similarity, identifying the unknown gender user as a female user
  • the embodiment of the present application first acquires the gender-identifying multi-dimensional features of multiple sample users during application use to obtain the sample feature sets of the multiple sample users; then obtains the average feature value of each same-type feature across the multiple sample feature sets to obtain the gender reference feature set; then acquires the gender-identifying multi-dimensional features of the unknown gender user during application use to obtain the feature set of the unknown gender user; and finally obtains the similarity between the feature set and the gender reference feature set and predicts the gender of the unknown gender user according to the similarity, thereby accurately identifying the user's gender and obtaining the user's gender information.
  • the electronic device 500 may further include: a display 503, a radio frequency circuit 504, an audio circuit 505, and a power source 506.
  • the display 503, the radio frequency circuit 504, the audio circuit 505, and the power source 506 are electrically connected to the processor 501, respectively.
  • the display 503 can be used to display information entered by a user or information provided to a user, as well as various graphical user interfaces, which can be composed of graphics, text, icons, video, and any combination thereof.
  • the display 503 can include a display panel.
  • the display panel can be configured in the form of a liquid crystal display (LCD) or an organic light-emitting diode (OLED).
  • the radio frequency circuit 504 can be used to transmit and receive radio frequency signals to establish wireless communication with a network device or other electronic device through wireless communication, and to transmit and receive signals with a network device or other electronic device.
  • the audio circuit 505 can be used to provide an audio interface between a user and an electronic device through a speaker or a microphone.
  • the power source 506 can be used to power various components of the electronic device 500.
  • the power source 506 can be logically coupled to the processor 501 through a power management system to enable functions such as managing charging, discharging, and power management through the power management system.
  • the electronic device 500 may further include a camera, a Bluetooth module, and the like, and details are not described herein again.
  • the embodiment of the present application further provides a storage medium storing a computer program which, when run on a computer, causes the computer to perform the user gender identification method in any of the above embodiments, for example: acquiring the gender-identifying multi-dimensional features of multiple sample users during application use to obtain the sample feature sets of the multiple sample users; obtaining the average feature value of each same-type feature across the multiple sample feature sets to obtain the gender reference feature set; acquiring the gender-identifying multi-dimensional features of the unknown gender user during application use to obtain the feature set of the unknown gender user; and obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the unknown gender user according to the similarity.
  • the storage medium may be a magnetic disk, an optical disk, a read only memory (ROM), or a random access memory (RAM).
  • the computer program may be stored in a computer readable storage medium, such as in a memory of the electronic device, and executed by at least one processor in the electronic device, and may include, for example, user gender during execution.
  • the storage medium may be a magnetic disk, an optical disk, a read only memory, a random access memory, or the like.
  • each functional module may be integrated into one processing chip, or each module may exist physically separately, or two or more modules may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • the integrated module, if implemented in the form of a software functional module and sold or used as a standalone product, may also be stored in a computer readable storage medium, such as a read only memory, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed in the embodiments of the present application are a user gender identification method, an apparatus, a storage medium, and an electronic device. The method comprises: acquiring a sample feature set of a plurality of sample users; acquiring a gender reference feature set; acquiring a multi-dimensional feature of an unknown-gender user having gender identifiability during the use of an application, to obtain a feature set of the unknown-gender user; acquiring the similarity between the feature set and the gender reference feature set, and predicting the gender of the unknown-gender user according to the similarity.

Description

User gender identification method, apparatus, storage medium, and electronic device
This application claims priority to Chinese patent application No. 201711405392.X, filed with the Chinese Patent Office on December 22, 2017 and entitled "User gender identification method, apparatus, storage medium, and electronic device", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of terminal technologies, and in particular, to a user gender identification method, apparatus, storage medium, and electronic device.
Background
With the popularization of electronic devices such as smartphones and the construction of mobile broadband networks, we have entered a new era of the mobile Internet. To deeply optimize electronic devices in various respects, or to recommend applications and push news to users in a personalized way, device manufacturers often need to know the user's gender.
Summary of the invention
The embodiments of the present application provide a user gender identification method, apparatus, storage medium, and electronic device, which can accurately identify the gender of a user.
In a first aspect, an embodiment of the present application provides a user gender identification method, including:
acquiring gender-identifying multi-dimensional features of a plurality of sample users during application use, to obtain sample feature sets of the plurality of sample users;
obtaining the average feature value of each same-type feature across the plurality of sample feature sets, to obtain a gender reference feature set;
acquiring gender-identifying multi-dimensional features of an unknown gender user during application use, to obtain a feature set of the unknown gender user;
obtaining a similarity between the feature set and the gender reference feature set, and identifying the gender of the unknown gender user according to the similarity.
In a second aspect, an embodiment of the present application provides a user gender identification apparatus, including:
a first feature acquisition module, configured to acquire gender-identifying multi-dimensional features of a plurality of sample users during application use, to obtain sample feature sets of the plurality of sample users;
a feature set generation module, configured to obtain the average feature value of each same-type feature across the plurality of sample feature sets, to obtain a gender reference feature set;
a second feature acquisition module, configured to acquire gender-identifying multi-dimensional features of an unknown gender user during application use, to obtain a feature set of the unknown gender user;
a user gender identification module, configured to obtain a similarity between the feature set and the gender reference feature set, and to identify the gender of the unknown gender user according to the similarity.
In a third aspect, an embodiment of the present application provides a storage medium on which a computer program is stored; when the computer program is run on a computer, the computer is caused to perform the user gender identification method provided by any embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides an electronic device including a processor and a memory, where the memory stores a computer program, and the processor, by calling the computer program, performs the user gender identification method provided by any embodiment of the present application.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; a person skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario of a user gender identification method according to an embodiment of the present application.
FIG. 2 is a schematic flowchart of a user gender identification method according to an embodiment of the present application.
FIG. 3 is another schematic flowchart of a user gender identification method according to an embodiment of the present application.
FIG. 4 is a schematic structural diagram of a user gender identification apparatus according to an embodiment of the present application.
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
FIG. 6 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed description
A reference to an "embodiment" herein means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to independent or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, both explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
An embodiment of the present application provides a user gender identification method, including:
acquiring gender-identifying multi-dimensional features of a plurality of sample users during application use, to obtain sample feature sets of the plurality of sample users;
obtaining the average feature value of each same-type feature across the plurality of sample feature sets, to obtain a gender reference feature set;
acquiring gender-identifying multi-dimensional features of an unknown gender user during application use, to obtain a feature set of the unknown gender user;
obtaining a similarity between the feature set and the gender reference feature set, and identifying the gender of the unknown gender user according to the similarity.
In some embodiments, the gender reference feature set includes a male reference feature set and a female reference feature set, and the step of obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the unknown gender user according to the similarity includes:
obtaining a first similarity between the feature set and the male reference feature set;
obtaining a second similarity between the feature set and the female reference feature set;
comparing the magnitudes of the first similarity and the second similarity;
if the first similarity is greater than the second similarity, identifying the unknown gender user as a male user; otherwise, identifying the unknown gender user as a female user.
In some embodiments, the gender reference feature set is a male reference feature set, and the step of acquiring the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity includes:
Acquiring a distance between the feature set and the male reference feature set, and using the distance as the similarity between the feature set and the male reference feature set;
Determining whether the distance lies within a first preset distance interval;
If so, identifying the user of unknown gender as a male user; otherwise, identifying the user of unknown gender as a female user.
In some embodiments, the gender reference feature set is a female reference feature set, and the step of acquiring the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity includes:
Acquiring a distance between the feature set and the female reference feature set, and using the distance as the similarity between the feature set and the female reference feature set;
Determining whether the distance lies within a second preset distance interval;
If so, identifying the user of unknown gender as a female user; otherwise, identifying the user of unknown gender as a male user.
In some embodiments, the step of acquiring gender-indicative multi-dimensional features of a plurality of sample users during application use to obtain sample feature sets of the plurality of sample users includes:
Acquiring gender-indicative multi-dimensional features of the plurality of sample users during application use;
Normalizing the acquired multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
In some embodiments, the similarity between the feature set and the gender reference feature set includes the distance between the feature set and the gender reference feature set.
In some embodiments, the step of acquiring the similarity between the feature set and the gender reference feature set includes:
Calculating the distance between the feature set and the gender reference feature set by the following formula:
[Formula: Figure PCTCN2018116713-appb-000001]
where l denotes the distance between the feature set and the gender reference feature set; Xn denotes a one-dimensional feature in the gender reference feature set; the symbol shown in Figure PCTCN2018116713-appb-000002 denotes a one-dimensional feature in the feature set of the user of unknown gender; for the same value of n, Xn and that symbol (Figure PCTCN2018116713-appb-000003) correspond to the same type of feature; and n is a positive integer greater than 2.
In some embodiments, the step of acquiring gender-indicative multi-dimensional features of the user of unknown gender during application use includes: collecting, at a preset frequency over a historical time period, the gender-indicative multi-dimensional features of the user of unknown gender during application use.
In some embodiments, the gender reference feature set includes a male reference feature set and a female reference feature set, and the step of acquiring the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity includes:
Acquiring a first similarity between the feature set and the male reference feature set;
Acquiring a second similarity between the feature set and the female reference feature set;
Comparing the magnitudes of the first similarity and the second similarity;
If the first similarity is greater than the second similarity, identifying the user of unknown gender as a male user;
If the first similarity is less than the second similarity, identifying the user of unknown gender as a female user;
If the first similarity equals the second similarity, producing no identification result.
An embodiment of the present application provides a user gender identification method. The method may be executed by the user gender identification apparatus provided by the embodiments of the present application, or by an electronic device in which that apparatus is integrated; the apparatus may be implemented in hardware or in software. The electronic device may be a smartphone, a tablet computer, a palmtop computer, a notebook computer, a desktop computer, or a similar device.
Referring to FIG. 1, FIG. 1 is a schematic diagram of an application scenario of the user gender identification method provided by an embodiment of the present application. Taking the case in which the user gender identification apparatus is integrated in an electronic device as an example, the electronic device may acquire a charging feature set each time a charging behavior occurs, obtaining a plurality of charging feature sets; perform similarity identification on the plurality of charging feature sets to obtain a similar-charging-feature set comprising a plurality of similar charging feature sets; predict the next charging behavior according to the similar-charging-feature set; determine a corresponding performance adjustment mode according to the predicted next charging behavior; and perform a performance adjustment operation according to the determined performance adjustment mode.
Specifically, referring to FIG. 1 and taking the prediction of the gender of user a as an example, gender-indicative multi-dimensional features of a plurality of sample users during application use may be acquired over a historical time period as samples, to obtain sample feature sets of the plurality of sample users. A sample user is a user of known gender, such as male user b or female user c. Examples of such features are the number of times and the duration for which user b browses male-oriented goods in a shopping application, the duration for which user b browses male-oriented reading material in a reading application, the number of times and the duration for which user c browses female-oriented goods in a shopping application, and the duration for which user c browses female-oriented reading material in a reading application. Average feature values of same-type features across the plurality of sample feature sets are then acquired (for example, by averaging the feature "duration for which the user browses male-oriented reading material in a reading application" over the sample feature sets) to obtain a gender reference feature set. In other words, the gender reference feature set is the set of per-feature average values; it characterizes the typical multi-dimensional features of male users, or the typical multi-dimensional features of female users. Next, gender-indicative multi-dimensional features of the user of unknown gender during application use are acquired (for example, the duration for which user a browses male-oriented or female-oriented reading material in a reading application, and the number of times and the duration for which user a browses male-oriented or female-oriented goods in a shopping application) to obtain the feature set of the user of unknown gender. Finally, the similarity between the feature set of the user of unknown gender and the gender reference feature set is acquired, and the gender of the user of unknown gender is identified according to that similarity (for example, identifying user a as male or as female).
Referring to FIG. 2, FIG. 2 is a schematic flowchart of the user gender identification method provided by an embodiment of the present application. The specific flow of the method may be as follows:
201. Acquire gender-indicative multi-dimensional features of a plurality of sample users during application use, to obtain sample feature sets of the plurality of sample users.
The multi-dimensional features have a dimension of a certain length; that is, the multi-dimensional feature information is composed of a plurality of features. The multi-dimensional features may include gender-indicative behavioral features of the user while using applications, for example: the number of times and the duration for which the user browses male-oriented goods (such as menswear and razors) in a shopping application; the number of times and the duration for which the user browses female-oriented goods (such as cosmetics and womenswear) in a shopping application; the duration for which the user reads male-oriented reading material in a reading application; and the duration for which the user reads female-oriented reading material in a reading application.
In addition, the multi-dimensional features may include behavioral feature information related to the electronic device itself, for example, the number of times the user invokes the front camera from a camera application and the number of times the user invokes the rear camera from a camera application.
A plurality of sample feature sets is obtained, one for each sample user. The multi-dimensional features in each sample feature set may be collected at a preset frequency over a historical time period. The historical time period may be, for example, the past 7 days or the past 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that, for any sample user, the multi-dimensional features collected for that user at each collection within the historical time period are accumulated by feature type (for example, the feature "the user browses male-oriented goods in a shopping application" is accumulated, and the feature "the user browses female-oriented goods in a shopping application" is accumulated) to obtain the sample feature set of that sample user for the historical time period. For example, for male user b, acquiring the gender-indicative multi-dimensional features of male user b during application use yields the sample feature set (X1, X2, ..., Xn) corresponding to male user b, where Xn is a one-dimensional feature of user b.
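The accumulation described above can be sketched as follows. This is only an illustration: the function name `accumulate_features`, the feature keys, and the sample values are hypothetical and do not come from the application.

```python
from collections import Counter


def accumulate_features(samples):
    """Sum each feature type over all periodic collections for one user."""
    totals = Counter()
    for sample in samples:
        for name, value in sample.items():
            totals[name] += value
    return dict(totals)


# Hypothetical collections taken at the preset frequency over the
# historical time period (feature names are illustrative only).
samples_user_b = [
    {"male_goods_views": 2, "female_goods_views": 0},
    {"male_goods_views": 1, "female_goods_views": 1},
    {"male_goods_views": 3},
]

sample_feature_set_b = accumulate_features(samples_user_b)
print(sample_feature_set_b)
```

A feature missing from one collection simply contributes nothing to that feature's total.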
In an embodiment, a server may collect the multi-dimensional features generated as each user of known gender uses applications on his or her electronic device; the electronic device may then obtain these features from the server when performing gender identification. A user of known gender may be a user who provided gender information when using the electronic device, for example, a user who provided gender information when registering an account.
After the sample feature sets are formed, each sample feature set may be labeled to obtain its sample label. Because the embodiments of the present application aim to identify the gender of a user of unknown gender, the sample labels are "male" and "female"; that is, the sample categories are male and female.
In a specific implementation, labeling may follow the gender information of the users of known gender. For example, the sample feature set corresponding to male user b may be labeled "male", and the sample feature set corresponding to female user c may be labeled "female". Optionally, numbers may be used as labels, for example "1" for "male" and "0" for "female", or vice versa.
In an embodiment, to facilitate subsequent processing of the sample feature sets, "acquiring gender-indicative multi-dimensional features of a plurality of sample users during application use, to obtain sample feature sets of the plurality of sample users" may include:
Acquiring gender-indicative multi-dimensional features of the plurality of sample users during application use;
Normalizing the acquired multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
Normalization is a way of simplifying computation: a dimensional quantity is transformed into a dimensionless expression and becomes a scalar. The specific normalization method may be chosen by those skilled in the art according to actual needs and is not specifically limited in the present application.
For example, for the multi-dimensional features of the sample user "male user b", the embodiment of the present application normalizes each dimension of the multi-dimensional features, mapping the original feature values to values between 0 and 1; the normalized multi-dimensional features then form the sample feature set of male user b.
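As one possible illustration of mapping each dimension to a value between 0 and 1, the sketch below uses min-max scaling across users. The application does not prescribe a normalization method, so min-max scaling, the function name `min_max_normalize`, and the sample values are all assumptions made for this example.

```python
def min_max_normalize(feature_sets):
    """Scale each feature dimension to [0, 1] across all users (assumed method)."""
    n_dims = len(feature_sets[0])
    lows = [min(fs[i] for fs in feature_sets) for i in range(n_dims)]
    highs = [max(fs[i] for fs in feature_sets) for i in range(n_dims)]
    normalized = []
    for fs in feature_sets:
        row = []
        for value, low, high in zip(fs, lows, highs):
            # A dimension with no spread carries no information; map it to 0.0.
            row.append(0.0 if high == low else (value - low) / (high - low))
        normalized.append(row)
    return normalized


# Hypothetical raw features: (browse count, reading time in seconds).
raw = [[10, 300], [30, 100], [20, 200]]
print(min_max_normalize(raw))
```

Scaling per dimension keeps a feature measured in seconds from dominating one measured in counts when distances are computed later.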
202. Acquire average feature values of same-type features across the plurality of sample feature sets, to obtain a gender reference feature set.
As described above, the sample feature sets obtained in the embodiments of the present application carry gender labels, so the average feature values can be calculated according to those labels. For example, for the sample feature sets labeled "male", the average value of each type of feature across the "male" sample feature sets can be calculated; likewise, for the sample feature sets labeled "female", the average value of each type of feature across the "female" sample feature sets can be calculated.
Accordingly, the gender reference feature set obtained depends on which average feature values are calculated. If only the average feature values of same-type features in the "male" sample feature sets are calculated, the resulting gender reference feature set is a male reference feature set characterizing "male". If only the average feature values of same-type features in the "female" sample feature sets are calculated, the resulting gender reference feature set is a female reference feature set characterizing "female". If both are calculated, a female reference feature set characterizing "female" and a male reference feature set characterizing "male" are obtained.
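A minimal sketch of this averaging step follows, assuming each sample feature set is an equal-length list of normalized values and using the numeric labels mentioned earlier ("1" for male, "0" for female). The function name `build_reference_sets` and the data are hypothetical.

```python
def build_reference_sets(sample_sets, labels):
    """Average same-type features per label to get one reference set per gender."""
    grouped = {}
    for features, label in zip(sample_sets, labels):
        grouped.setdefault(label, []).append(features)
    # zip(*rows) walks the rows column by column, i.e. one feature type at a time.
    return {
        label: [sum(column) / len(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


# Hypothetical normalized sample feature sets; 1 = male, 0 = female.
sample_sets = [[1.0, 0.25], [0.5, 0.25], [0.25, 1.0], [0.25, 0.5]]
labels = [1, 1, 0, 0]

references = build_reference_sets(sample_sets, labels)
male_reference = references[1]
female_reference = references[0]
print(male_reference, female_reference)
```

Passing only the "male" samples yields just a male reference feature set, which matches the single-reference variants described in the text.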
203. Acquire gender-indicative multi-dimensional features of the user of unknown gender during application use, to obtain a feature set of the user of unknown gender.
The feature set of the user of unknown gender may be collected at a preset frequency over a historical time period. The historical time period may be, for example, the past 7 days or the past 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that, for the user of unknown gender, the multi-dimensional features collected at each collection within the historical time period are accumulated by feature type (for example, the feature "the user browses male-oriented goods in a shopping application" is accumulated, and the feature "the user browses female-oriented goods in a shopping application" is accumulated) to obtain the feature set of that user for the historical time period. For example, for user a of unknown gender, acquiring the gender-indicative multi-dimensional features of user a during application use yields the feature set (X1, X2, ..., Xn) corresponding to user a, where Xn is a one-dimensional feature of user a.
To improve the accuracy of gender identification, in an embodiment the historical time period used when collecting the multi-dimensional features of the user of unknown gender is the same as that used when collecting the features of the sample users (that is, the users of known gender). For example, if a historical time period of 7 days is used for the sample users, a historical time period of 7 days is also used for the user of unknown gender. In this way, the gender reference feature set derived from the sample feature sets lies in the same time dimension as the feature set of the user of unknown gender, which improves the accuracy of gender identification.
204. Acquire the similarity between the feature set and the gender reference feature set, and identify the gender of the user of unknown gender according to the similarity.
The way the gender of the user of unknown gender is identified from the similarity depends on which gender reference feature set was obtained in the preceding steps. For example, in an embodiment, when the previously obtained gender reference feature set includes a male reference feature set and a female reference feature set, "acquiring the similarity between the feature set and the gender reference feature set, and identifying the gender of the user of unknown gender according to the similarity" may include:
Acquiring a first similarity between the feature set of the user of unknown gender and the male reference feature set;
Acquiring a second similarity between the feature set of the user of unknown gender and the female reference feature set;
Comparing the magnitudes of the first similarity and the second similarity;
If the first similarity is greater than the second similarity, identifying the user of unknown gender as a male user; otherwise, identifying the user of unknown gender as a female user.
The terms "first", "second", "third", and the like in the present application are used to distinguish different objects, not to describe a particular order.
The first similarity and the second similarity are acquired in the same way. For example, in the embodiments of the present application, the distance between the feature set and the gender reference feature set is used to describe their similarity: the larger the distance, the smaller the similarity; the smaller the distance, the greater the similarity. It should be noted that the embodiments of the present application do not specifically limit how the distance between the feature set and the gender reference feature set is calculated; those skilled in the art may choose a suitable calculation method according to actual needs.
For example, in the embodiment of the present application, the distance between the feature set and the gender reference feature set is calculated according to the following formula:
[Formula: Figure PCTCN2018116713-appb-000004]
where l denotes the distance between the feature set of the user of unknown gender and the gender reference feature set; Xn denotes a one-dimensional feature in the gender reference feature set; the symbol shown in Figure PCTCN2018116713-appb-000005 denotes a one-dimensional feature in the feature set of the user of unknown gender; and, for the same value of n, Xn and that symbol (Figure PCTCN2018116713-appb-000006) correspond to the same type of feature. For example, X1 denotes the feature "duration for which the user reads male-oriented reading material in a reading application" in the gender reference feature set, and the corresponding symbol (Figure PCTCN2018116713-appb-000007) denotes the feature "duration for which the user reads male-oriented reading material in a reading application" in the feature set of the user of unknown gender. n is a positive integer greater than 2.
After the first similarity between the feature set of the user of unknown gender and the male reference feature set and the second similarity between the feature set of the user of unknown gender and the female reference feature set have been acquired, the two similarities are compared, and the gender of the user of unknown gender is identified according to the comparison result.
Specifically, when the first similarity is greater than the second similarity, the user of unknown gender is more similar to male users and is identified as a male user. When the first similarity is less than the second similarity, the user of unknown gender is more similar to female users and is identified as a female user. When the first similarity equals the second similarity, the feature set collected so far for the user of unknown gender is not sufficient to support identification of the user's gender, and no identification result is produced.
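The three-way comparison above can be sketched as follows. The application's distance formula is available here only as a figure, and the text leaves the choice of distance open, so this example assumes Euclidean distance (Python's `math.dist`, available from Python 3.8). The function name `identify_gender` and the reference values are hypothetical.

```python
import math


def identify_gender(features, male_reference, female_reference):
    """Return "male", "female", or None when the two similarities are equal.

    Euclidean distance is an assumption made for this sketch; a smaller
    distance is treated as a greater similarity.
    """
    distance_to_male = math.dist(features, male_reference)
    distance_to_female = math.dist(features, female_reference)
    if distance_to_male < distance_to_female:   # first similarity is greater
        return "male"
    if distance_to_female < distance_to_male:   # second similarity is greater
        return "female"
    return None                                 # equal: no identification result


# Hypothetical reference feature sets and unknown-user feature sets.
male_reference = [0.75, 0.25]
female_reference = [0.25, 0.75]
print(identify_gender([0.7, 0.2], male_reference, female_reference))
print(identify_gender([0.5, 0.5], male_reference, female_reference))
```

Returning `None` on a tie keeps the "no identification result" case distinct from either label, so a caller can decide to collect more data before trying again.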
In an embodiment, when the previously obtained gender reference feature set is a male reference feature set, "acquiring the similarity between the feature set and the gender reference feature set, and identifying the gender of the user of unknown gender according to the similarity" may include:
Acquiring the distance between the feature set of the user of unknown gender and the male reference feature set, and using that distance as the similarity between the feature set and the male reference feature set;
Determining whether the distance between the feature set and the male reference feature set lies within a first preset distance interval;
If so, identifying the user of unknown gender as a male user; otherwise, identifying the user of unknown gender as a female user.
Here, the distance between the feature set and the male reference feature set is used to describe their similarity: the smaller the distance, the greater the similarity; the larger the distance, the smaller the similarity. It should be noted that the embodiments of the present application do not specifically limit how the distance between the feature set and the male reference feature set is calculated; those skilled in the art may choose a suitable calculation method according to actual needs.
For example, in the embodiment of the present application, the distance between the feature set and the male reference feature set is calculated according to the following formula:
[Formula: Figure PCTCN2018116713-appb-000008]
where l denotes the distance between the feature set of the user of unknown gender and the male reference feature set; Xn denotes a one-dimensional feature in the male reference feature set; the symbol shown in Figure PCTCN2018116713-appb-000009 denotes a one-dimensional feature in the feature set of the user of unknown gender; and, for the same value of n, Xn and that symbol (Figure PCTCN2018116713-appb-000010) correspond to the same type of feature. For example, X1 denotes the feature "duration for which the user reads male-oriented reading material in a reading application" in the male reference feature set, and the corresponding symbol (Figure PCTCN2018116713-appb-000011) denotes the feature "duration for which the user reads male-oriented reading material in a reading application" in the feature set of the user of unknown gender.
After the distance between the feature set of the user of unknown gender and the male reference feature set has been acquired, it is further determined whether the distance lies within the first preset distance interval. Specifically, when the distance lies within the first preset distance interval, the user of unknown gender leans toward male users and is identified as a male user. When the distance lies outside the first preset distance interval, the user of unknown gender does not lean toward male users; since a user's gender is taken here to be either male or female, the user of unknown gender can then be identified as a female user.
In addition, to set the first preset distance interval, the distance between each sample feature set labeled "male" and the male reference feature set may be calculated; the maximum of these distances is used as the right endpoint of the first preset distance interval, and the left endpoint of the first preset distance interval is set to zero.
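The interval construction and the single-reference decision can be sketched as below. Two points are assumptions of this example rather than statements from the application: Euclidean distance (`math.dist`) stands in for the formula that is available only as a figure, and the interval is treated as closed at both endpoints. All function names and values are hypothetical.

```python
import math


def first_preset_interval(male_samples, male_reference):
    """Interval [0, max distance of any "male" sample to the male reference]."""
    right = max(math.dist(sample, male_reference) for sample in male_samples)
    return (0.0, right)


def identify_with_male_reference(features, male_reference, interval):
    """Label as "male" if the distance falls inside the interval, else "female"."""
    left, right = interval
    distance = math.dist(features, male_reference)
    # Closed interval is an assumption; the text does not say whether the
    # endpoints are included.
    return "male" if left <= distance <= right else "female"


# Hypothetical normalized "male" sample feature sets and their average.
male_samples = [[1.0, 0.25], [0.5, 0.25]]
male_reference = [0.75, 0.25]

interval = first_preset_interval(male_samples, male_reference)
print(interval)
print(identify_with_male_reference([0.75, 0.3], male_reference, interval))
print(identify_with_male_reference([0.25, 0.75], male_reference, interval))
```

Because the right endpoint is the largest distance observed among the "male" samples, a single outlying sample widens the interval and may cause more users to be labeled male.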
In an embodiment, when the previously obtained gender reference feature set is a female reference feature set, "acquiring the similarity between the feature set and the gender reference feature set, and identifying the gender of the user of unknown gender according to the similarity" may include:
Acquiring the distance between the feature set of the user of unknown gender and the female reference feature set, and using that distance as the similarity between the feature set and the female reference feature set;
Determining whether the distance lies within a second preset distance interval;
If so, identifying the user of unknown gender as a female user; otherwise, identifying the user of unknown gender as a male user.
Here, the distance between the feature set and the female reference feature set is used to describe their similarity: the smaller the distance, the greater the similarity; the larger the distance, the smaller the similarity. It should be noted that the embodiments of the present application do not specifically limit how the distance between the feature set and the female reference feature set is calculated; those skilled in the art may choose a suitable calculation method according to actual needs.
For example, in the embodiment of the present application, the distance between the feature set and the female reference feature set is calculated according to the following formula:
Figure PCTCN2018116713-appb-000012
where l denotes the distance between the feature set of the unknown-gender user and the female reference feature set, Xn denotes a one-dimensional feature in the female reference feature set, and the symbol shown in Figure PCTCN2018116713-appb-000013 denotes a one-dimensional feature in the feature set of the unknown-gender user. For the same value of n, Xn and the symbol shown in Figure PCTCN2018116713-appb-000014 correspond to the same kind of feature. For example, X1 denotes the feature "the length of time the user spends reading male-oriented reading material in a reading application" in the female reference feature set, and the symbol shown in Figure PCTCN2018116713-appb-000015 denotes the feature "the length of time the user spends reading male-oriented reading material in a reading application" in the feature set of the unknown-gender user.
After the distance between the feature set of the unknown-gender user and the female reference feature set is obtained, it is further determined whether the distance lies within the second preset distance interval. Specifically, when the distance lies within the second preset distance interval, the unknown-gender user tends toward being a female user and is identified as a female user. When the distance lies outside the second preset distance interval, the unknown-gender user does not tend toward being a female user; since the method treats a user's gender as either male or female, the unknown-gender user can then be identified as a male user.
In addition, to set the second preset distance interval, the distance between each sample feature set labeled "female" and the female reference feature set may be calculated; the maximum of these distances is taken as the right endpoint of the second preset distance interval, and the left endpoint of the second preset distance interval is set to zero.
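The interval setting and the single-reference decision described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the Euclidean distance, the closed interval (both endpoints included), and the names `euclidean_distance`, `preset_interval`, and `identify_with_female_reference` are all assumptions, since the text leaves the distance calculation to the skilled person.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors (Euclidean is assumed)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def preset_interval(labeled_samples, reference):
    """Left endpoint is zero; right endpoint is the largest distance between
    any labeled sample feature set and the reference feature set."""
    right = max(euclidean_distance(s, reference) for s in labeled_samples)
    return 0.0, right


def identify_with_female_reference(features, female_reference, interval):
    """Inside the interval -> "female", outside -> "male".
    Treating the interval as closed is an assumption."""
    low, high = interval
    distance = euclidean_distance(features, female_reference)
    return "female" if low <= distance <= high else "male"
```

The same two helpers would serve the male-reference variant by swapping the reference set, the labeled samples, and the two return values.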
As can be seen from the above, the embodiment of the present application first acquires the gender-identifying multi-dimensional features of a plurality of sample users during application use, obtaining the sample feature sets of the plurality of sample users; then acquires the average feature value of each kind of feature across the sample feature sets, obtaining a gender reference feature set; then acquires the gender-identifying multi-dimensional features of an unknown-gender user during application use, obtaining the feature set of the unknown-gender user; and finally acquires the similarity between the feature set and the gender reference feature set and predicts the gender of the unknown-gender user according to the similarity, thereby achieving accurate identification of the user's gender and obtaining the user's gender information.
The user gender identification method of the present application is further described below on the basis of the method described in the above embodiments. Referring to FIG. 3, the user gender identification method may include:
301. Acquire the gender-identifying multi-dimensional features of a plurality of sample users during application use.
The multi-dimensional features are multi-dimensional user features of known-gender users, such as male users or female users, that are gender-identifying during application use, for example, behavioral features with male or female characteristics that the user exhibits while using applications.
The multi-dimensional features have a dimension of a certain length, that is, they are composed of a plurality of features. They may include gender-identifying behavioral features of the user while using applications, for example: the number of times and length of time the user browses male-oriented goods (such as men's clothing, razors, etc.) in a shopping application; the number of times and length of time the user browses female-oriented goods (such as cosmetics, women's clothing, etc.) in a shopping application; the length of time the user spends reading male-oriented reading material in a reading application; and the length of time the user spends reading female-oriented reading material in a reading application.
In addition, the multi-dimensional features may also include behavioral feature information related to the electronic device itself, for example, the number of times the user invokes the front camera with a camera application, the number of times the user invokes the rear camera with a camera application, and so on.
A plurality of sample feature sets are obtained, one for each sample user. The multi-dimensional features in each sample feature set may be collected at a preset frequency within a historical time period. The historical time period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that, for any sample user, the multi-dimensional features of that sample user collected each time within the historical time period are accumulated by category (for example, the feature "the user browses male-oriented goods in a shopping application" is accumulated, and the feature "the user browses female-oriented goods in a shopping application" is accumulated), obtaining the sample feature set of that sample user for the historical time period. For example, for a male user b, acquiring the gender-identifying multi-dimensional features of male user b during application use yields the sample feature set (X1, X2, ..., Xn) corresponding to male user b, where Xn is a one-dimensional feature of user b.
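The per-category accumulation over the historical time period might look like the sketch below. The snapshot format (one dictionary of feature counts per collection) and the names `accumulate_features`, `male_goods_views`, and `female_goods_views` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accumulate_features(snapshots, feature_names):
    """Sum each feature over all periodic snapshots and return the totals
    in a fixed feature order (X1, X2, ..., Xn)."""
    totals = Counter()
    for snapshot in snapshots:
        totals.update(snapshot)  # adds the values of matching keys
    # A feature never observed in the period accumulates to zero.
    return [totals[name] for name in feature_names]
```

Fixing the feature order in `feature_names` keeps the n-th entry of every user's vector tied to the same kind of feature, which the later averaging and distance steps rely on.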
In an embodiment, a server may collect the multi-dimensional features that each known-gender user generates while using applications on his or her electronic device; the electronic device may then obtain these features from the server when performing gender identification. A known-gender user may be a user who provided gender information when using the electronic device, for example, a user who provided gender information when registering an account.
After the sample feature sets are formed, each sample feature set may be labeled to obtain its sample label. Since the embodiment of the present application aims to identify the gender of unknown-gender users, the sample labels include "male" and "female"; that is, the sample categories include male and female.
In a specific implementation, labeling may be performed according to the gender information of the known-gender users. For example, the sample feature set corresponding to male user b may be labeled "male"; as another example, the sample feature set corresponding to female user c may be labeled "female". Optionally, numbers may be used to label the sample feature sets, for example, the number "1" for "male" and the number "0" for "female", or vice versa.
302. Normalize the acquired multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
Normalization is a way of simplifying calculation, in which a dimensional quantity is transformed into a dimensionless expression and becomes a scalar. The specific normalization method may be selected by a person skilled in the art according to actual needs, and is not specifically limited in the present application.
For example, for the multi-dimensional features of the sample user "male user b", the embodiment of the present application normalizes each feature dimension, mapping the original feature values to values between 0 and 1; the normalized multi-dimensional features then form the sample feature set of male user b.
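As one possible way to map each feature dimension into the range 0 to 1, the sketch below uses min-max scaling across all users. The choice of min-max scaling and the name `min_max_normalize` are assumptions; the text explicitly leaves the normalization method open.

```python
def min_max_normalize(rows):
    """Scale every feature column to [0, 1] using that column's min and max.

    rows: list of raw feature vectors, one per user, all the same length.
    A column whose values are all equal is mapped to 0.0 (an arbitrary choice).
    """
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]
```

Scaling per column keeps a feature measured in seconds from dominating one measured in counts when distances are computed later.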
A specific sample feature set may be as shown in Table 1 below, which includes features of multiple dimensions. It should be noted that the features shown in Table 1 are only examples. In practical applications, a sample feature set may contain more or fewer features than those shown in Table 1, and the specific features chosen may also differ from those shown in Table 1; no specific limitation is imposed here.
Dimension | Feature information
1 | Number of times the user browses male-oriented goods (such as men's clothing) in a shopping application
2 | Length of time the user browses male-oriented goods (such as men's clothing) in a shopping application
3 | Number of times the user browses female-oriented goods (such as cosmetics, women's clothing) in a shopping application
4 | Length of time the user browses female-oriented goods (such as cosmetics, women's clothing) in a shopping application
5 | Length of time the user spends reading male-oriented reading material in a reading application
6 | Length of time the user spends reading female-oriented reading material in a reading application
7 | Length of time the user spends reading sports news in a news application
8 | Length of time the user spends reading horoscope news in a news application
9 | Number of times the user invokes the front camera for selfies with a camera application
10 | Number of times the user uses beauty-filter applications
11 | Number of times the user plays game applications of different categories
12 | Length of time the user plays game applications of different categories
Table 1
303. Acquire the average feature value of each kind of feature across the plurality of sample feature sets to obtain a gender reference feature set.
As described above, the sample feature sets obtained in the embodiment of the present application carry gender labels, so the average feature values can be calculated on the basis of those labels. For example, for the sample feature sets labeled "male", the average feature value of each kind of feature in the "male" sample feature sets can be calculated; likewise, for the sample feature sets labeled "female", the average feature value of each kind of feature in the "female" sample feature sets can be calculated.
Correspondingly, the gender reference feature set obtained depends on which average feature values are calculated. If only the average feature values of the "male" sample feature sets are calculated, the resulting gender reference feature set is a male reference feature set characterizing "male". If only the average feature values of the "female" sample feature sets are calculated, the resulting gender reference feature set is a female reference feature set characterizing "female". If the average feature values of both the "male" and the "female" sample feature sets are calculated, a female reference feature set characterizing "female" and a male reference feature set characterizing "male" are obtained respectively.
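Step 303 amounts to a per-label, per-dimension mean. A minimal sketch follows, assuming the labeled samples arrive as `(label, feature_vector)` pairs; the function name `reference_feature_sets` is hypothetical.

```python
from statistics import fmean


def reference_feature_sets(labeled_samples):
    """Group sample feature sets by label and average each feature dimension.

    labeled_samples: iterable of (label, feature_vector) pairs, where label is
    for example "male" or "female" and all vectors have the same length.
    Returns a dict mapping each label to its reference feature set.
    """
    by_label = {}
    for label, features in labeled_samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [fmean(column) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }
```

If only "male" samples are passed in, the result contains only a male reference feature set, which matches the single-reference embodiments; passing both labels yields both reference sets.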
304. Acquire the gender-identifying multi-dimensional features of the unknown-gender user during application use to obtain the feature set of the unknown-gender user.
The feature set of the unknown-gender user may be collected at a preset frequency within a historical time period. The historical time period may be, for example, the past 7 days or 10 days; the preset frequency may be, for example, once every 10 minutes or once every half hour. It can be understood that, for the unknown-gender user, the multi-dimensional features of that user collected each time within the historical time period are accumulated by category (for example, the feature "the user browses male-oriented goods in a shopping application" is accumulated, and the feature "the user browses female-oriented goods in a shopping application" is accumulated), obtaining the feature set of the unknown-gender user for the historical time period. For example, for a user a of unknown gender, acquiring the gender-identifying multi-dimensional features of user a during application use yields the feature set (X1, X2, ..., Xn) corresponding to user a, where Xn is a one-dimensional feature of user a.
To improve the accuracy of identifying the user's gender, in an embodiment, the historical time period selected for collecting the multi-dimensional features of the unknown-gender user is the same as that selected for collecting the multi-dimensional features of the sample users (that is, known-gender users). For example, if the historical time period selected for collecting the sample users' multi-dimensional features is 7 days, a historical time period of 7 days is also selected when collecting the unknown-gender user's multi-dimensional features. In this way, the gender reference feature set obtained by processing the sample feature sets lies in the same time dimension as the feature set of the unknown-gender user, which improves the accuracy of gender identification.
305. Acquire a first similarity between the feature set and the male reference feature set.
That is, calculate the first similarity between the feature set of the unknown-gender user and the male reference feature set.
306. Acquire a second similarity between the feature set and the female reference feature set.
That is, calculate the second similarity between the feature set of the unknown-gender user and the female reference feature set.
The first similarity is acquired in the same way as the second similarity. For example, in the embodiment of the present application, the distance between the feature set and the gender reference feature set is used to describe their similarity: the larger the distance, the smaller the similarity, and the smaller the distance, the greater the similarity. It should be noted that the embodiments of the present application do not specifically limit how the distance between the feature set and the gender reference feature set is calculated; a person skilled in the art may select a suitable calculation method according to actual needs.
For example, in the embodiment of the present application, the distance between the feature set and the gender reference feature set is calculated according to the following formula:
Figure PCTCN2018116713-appb-000016
where l denotes the distance between the feature set of the unknown-gender user and the gender reference feature set, Xn denotes a one-dimensional feature in the gender reference feature set, and the symbol shown in Figure PCTCN2018116713-appb-000017 denotes a one-dimensional feature in the feature set of the unknown-gender user. For the same value of n, Xn and the symbol shown in Figure PCTCN2018116713-appb-000018 correspond to the same kind of feature. For example, X1 denotes the feature "the length of time the user spends reading male-oriented reading material in a reading application" in the gender reference feature set, and the symbol shown in Figure PCTCN2018116713-appb-000019 denotes the feature "the length of time the user spends reading male-oriented reading material in a reading application" in the feature set of the unknown-gender user.
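As an illustration only, one distance that fits the definitions above is the Euclidean distance. This is an assumption, not a statement of the formula in the application, which leaves the choice of distance to the skilled person; the hat notation for the unknown-gender user's feature and the symbol N for the number of dimensions are likewise introduced here for the example.

```latex
% Assumed Euclidean form: X_n is the reference feature, \hat{X}_n the
% unknown-gender user's feature of the same kind, N the number of dimensions.
\[
  l = \sqrt{\sum_{n=1}^{N} \bigl(X_n - \hat{X}_n\bigr)^2}
\]
% Worked example with N = 2, X = (0.75, 0.25), \hat{X} = (0.5, 0.5):
\[
  l = \sqrt{(0.75 - 0.5)^2 + (0.25 - 0.5)^2} = \sqrt{0.125} \approx 0.354
\]
```

Because every feature has been normalized to a value between 0 and 1, each dimension contributes at most 1 to the sum under this assumed form.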
307. Compare the first similarity with the second similarity.
After the first similarity between the feature set of the unknown-gender user and the male reference feature set and the second similarity between the feature set of the unknown-gender user and the female reference feature set are obtained, the first similarity and the second similarity are compared, and the gender of the unknown-gender user is identified according to the comparison result.
308. If the first similarity is greater than the second similarity, identify the unknown-gender user as a male user; otherwise, identify the unknown-gender user as a female user.
Specifically, when the first similarity is greater than the second similarity, the unknown-gender user is more similar to male users and is identified as a male user; when the first similarity is less than the second similarity, the unknown-gender user is more similar to female users and is identified as a female user; when the first similarity equals the second similarity, the feature set collected for the unknown-gender user is not yet sufficient to support identification of the user's gender, and no identification result is produced.
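The three-way decision in steps 305 to 308 can be sketched as follows. The Euclidean distance (via `math.dist`) and the conversion `1 / (1 + distance)` from distance to similarity are assumptions chosen so that a smaller distance gives a larger similarity; the function names are hypothetical.

```python
import math


def similarity(features, reference):
    """Similarity that decreases as distance grows (assumed form)."""
    return 1.0 / (1.0 + math.dist(features, reference))


def identify_gender(features, male_reference, female_reference):
    """Return "male", "female", or None when the two similarities are equal."""
    first = similarity(features, male_reference)     # step 305
    second = similarity(features, female_reference)  # step 306
    if first > second:                               # steps 307 and 308
        return "male"
    if first < second:
        return "female"
    return None  # tie: the collected features do not support a result
```

Returning `None` on a tie follows the detailed explanation above rather than the shorter "otherwise female" wording of step 308.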
As can be seen from the above, the embodiment of the present application first acquires the gender-identifying multi-dimensional features of a plurality of sample users during application use, obtaining the sample feature sets of the plurality of sample users; then acquires the average feature value of each kind of feature across the sample feature sets, obtaining a gender reference feature set; then acquires the gender-identifying multi-dimensional features of an unknown-gender user during application use, obtaining the feature set of the unknown-gender user; and finally acquires the similarity between the feature set and the gender reference feature set and predicts the gender of the unknown-gender user according to the similarity, thereby achieving accurate identification of the user's gender and obtaining the user's gender information.
An embodiment of the present application further provides a user gender identification apparatus, including:
a first feature acquisition module, configured to acquire the gender-identifying multi-dimensional features of a plurality of sample users during application use, to obtain the sample feature sets of the plurality of sample users;
a feature set generation module, configured to acquire the average feature value of each kind of feature across the plurality of sample feature sets, to obtain a gender reference feature set;
a second feature acquisition module, configured to acquire the gender-identifying multi-dimensional features of an unknown-gender user during application use, to obtain the feature set of the unknown-gender user;
a user gender identification module, configured to acquire the similarity between the feature set and the gender reference feature set, and to identify the gender of the unknown-gender user according to the similarity.
In some embodiments, the gender reference feature set includes a male reference feature set and a female reference feature set, and the user gender identification module is further configured to:
acquire a first similarity between the feature set and the male reference feature set;
acquire a second similarity between the feature set and the female reference feature set;
compare the first similarity with the second similarity;
if the first similarity is greater than the second similarity, identify the unknown-gender user as a male user; otherwise, identify the unknown-gender user as a female user.
In some embodiments, the gender reference feature set is a male reference feature set, and the user gender identification module is further configured to:
acquire the distance between the feature set and the male reference feature set, and use the distance as the similarity between the feature set and the male reference feature set;
determine whether the distance lies within a first preset distance interval;
if so, identify the unknown-gender user as a male user; otherwise, identify the unknown-gender user as a female user.
In some embodiments, the gender reference feature set is a female reference feature set, and the user gender identification module is further configured to:
acquire the distance between the feature set and the female reference feature set, and use the distance as the similarity between the feature set and the female reference feature set;
determine whether the distance lies within a second preset distance interval;
if so, identify the unknown-gender user as a female user; otherwise, identify the unknown-gender user as a male user.
In some embodiments, the first feature acquisition module is further configured to:
acquire the gender-identifying multi-dimensional features of a plurality of sample users during application use;
normalize the acquired multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
In some embodiments, the similarity between the feature set and the gender reference feature set includes the distance between the feature set and the gender reference feature set.
In some embodiments, the user gender identification module is configured to calculate the distance between the feature set and the gender reference feature set by the following formula:
Figure PCTCN2018116713-appb-000020
where l denotes the distance between the feature set and the gender reference feature set, Xn denotes a one-dimensional feature in the gender reference feature set, and the symbol shown in Figure PCTCN2018116713-appb-000021 denotes a one-dimensional feature in the feature set of the unknown-gender user. For the same value of n, Xn and the symbol shown in Figure PCTCN2018116713-appb-000022 correspond to the same kind of feature, and n is a positive integer greater than 2.
In some embodiments, the second feature acquisition module is configured to collect, at a preset frequency within a historical time period, the gender-identifying multi-dimensional features of the unknown-gender user during application use.
In some embodiments, the gender reference feature set includes a male reference feature set and a female reference feature set;
the user gender identification module is configured to:
acquire a first similarity between the feature set and the male reference feature set;
acquire a second similarity between the feature set and the female reference feature set;
compare the first similarity with the second similarity;
if the first similarity is greater than the second similarity, identify the unknown-gender user as a male user;
if the first similarity is less than the second similarity, identify the unknown-gender user as a female user;
if the first similarity equals the second similarity, produce no identification result.
An embodiment further provides a user gender identification apparatus. Refer to FIG. 4, which is a schematic structural diagram of the user gender identification apparatus provided by an embodiment of the present application. The user gender identification apparatus is applied to an electronic device and includes a first feature acquisition module 401, a feature set generation module 402, a second feature acquisition module 403, and a user gender identification module 404, as follows:
the first feature acquisition module 401 is configured to acquire the gender-identifying multi-dimensional features of a plurality of sample users during application use, to obtain the sample feature sets of the plurality of sample users;
the feature set generation module 402 is configured to acquire the average feature value of each kind of feature across the plurality of sample feature sets, to obtain a gender reference feature set;
the second feature acquisition module 403 is configured to acquire the gender-identifying multi-dimensional features of an unknown-gender user during application use, to obtain the feature set of the unknown-gender user;
the user gender identification module 404 is configured to acquire the similarity between the feature set and the gender reference feature set, and to identify the gender of the unknown-gender user according to the similarity.
在一实施例中,性别参考特征集合包括男性参考特征集合和女性参考特征集合,用户性别识别模块404,可以用于:In an embodiment, the gender reference feature set includes a male reference feature set and a female reference feature set, and the user gender identification module 404 can be used to:
获取未知性别用户的特征集合与男性参考特征集合的第一相似度;Obtaining a first similarity between the feature set of the unknown gender user and the male reference feature set;
获取未知性别用户的特征集合与女性参考特征集合的第二相似度;Obtaining a second similarity between the feature set of the unknown gender user and the female reference feature set;
比较第一相似度与第二相似度的大小;Comparing the magnitudes of the first similarity and the second similarity;
若第一相似度大于第二相似度,则识别未知性别用户为男性用户,否则识别未知性别用户为女性用户。If the first similarity is greater than the second similarity, the unknown gender user is identified as a male user, otherwise the unknown gender user is identified as a female user.
在一实施例中,性别参考特征集合为男性参考特征集合,用户性别识别模块404,可以用于:In an embodiment, the gender reference feature set is a male reference feature set, and the user gender identification module 404 can be used to:
获取未知性别用户的特征集合与男性参考特征集合的距离,将该距离作为特征集合与男性参考特征集合的相似度;Obtaining a distance between the feature set of the unknown gender user and the male reference feature set, and using the distance as the similarity between the feature set and the male reference feature set;
判断特征集合与男性参考特征集合的距离是否位于第一预设距离区间;Determining whether the distance between the feature set and the male reference feature set is located in the first preset distance interval;
若是则识别未知性别用户为男性用户,否则识别未知性别用户为女性用户。If yes, the user who identifies the unknown gender is a male user, otherwise the user who identifies the unknown gender is a female user.
In an embodiment, the gender reference feature set is a female reference feature set, and the user gender identification module 404 may be configured to:
obtain the distance between the feature set of the user of unknown gender and the female reference feature set, and use that distance as the similarity between the feature set and the female reference feature set;
determine whether the distance lies within a second preset distance interval; and
if so, identify the user of unknown gender as a female user; otherwise, identify the user of unknown gender as a male user.
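The two single-reference embodiments above share one decision rule, which can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `identify_with_single_reference`, the closed-interval convention, and the interval bounds used in the usage note are all assumptions, since the text does not say how the preset distance intervals are chosen.

```python
from typing import Tuple


def identify_with_single_reference(
    distance: float,
    interval: Tuple[float, float],
    reference_gender: str,
) -> str:
    """Return the reference gender if the distance lies in the preset
    interval, otherwise return the other gender.

    The interval is treated as closed (low <= distance <= high); that
    convention is an assumption, not something the source specifies.
    """
    if reference_gender not in ("male", "female"):
        raise ValueError("reference_gender must be 'male' or 'female'")
    low, high = interval
    other = "female" if reference_gender == "male" else "male"
    return reference_gender if low <= distance <= high else other
```

With a hypothetical first preset distance interval of (0.0, 0.5) and a male reference set, a distance of 0.3 would be identified as male and a distance of 0.8 as female.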
In an embodiment, the first feature acquisition module 401 may be configured to:
obtain the gender-discriminative multi-dimensional features exhibited by a plurality of sample users while using applications; and
normalize the obtained multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
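The normalization step can be sketched as below. The text does not name a normalization method, so min-max scaling of each feature dimension to the range [0, 1] is an assumption made only for illustration, and the function name `min_max_normalize` is hypothetical.

```python
from typing import List


def min_max_normalize(samples: List[List[float]]) -> List[List[float]]:
    """Scale every feature dimension to [0, 1] across all sample users.

    Each inner list is one sample user's multi-dimensional features.
    A dimension with no spread (max == min) is mapped to 0.0.
    """
    if not samples:
        return []
    dims = len(samples[0])
    if any(len(row) != dims for row in samples):
        raise ValueError("all samples must have the same number of features")
    lows = [min(row[d] for row in samples) for d in range(dims)]
    highs = [max(row[d] for row in samples) for d in range(dims)]
    normalized = []
    for row in samples:
        normalized.append([
            0.0 if highs[d] == lows[d]
            else (row[d] - lows[d]) / (highs[d] - lows[d])
            for d in range(dims)
        ])
    return normalized
```

Scaling matters here because a distance computed over raw features would otherwise be dominated by whichever feature has the largest numeric range.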
In an embodiment, the similarity between the feature set and the gender reference feature set includes the distance between the feature set and the gender reference feature set.
In an embodiment, the user gender identification module 404 is configured to calculate the distance between the feature set and the gender reference feature set by the following formula:
Figure PCTCN2018116713-appb-000023
where l denotes the distance between the feature set and the gender reference feature set; Xn denotes a one-dimensional feature in the gender reference feature set; the symbol shown in Figure PCTCN2018116713-appb-000024 denotes a one-dimensional feature in the feature set of the user of unknown gender; for the same value of n, Xn and the symbol shown in Figure PCTCN2018116713-appb-000025 correspond to the same kind of feature; and n is a positive integer greater than 2.
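The formula itself survives in this text only as a figure reference, so the exact metric cannot be recovered here. Purely as an assumption, the sketch below uses the Euclidean distance between the reference feature set and the unknown user's feature set, matching features position by position; the function name `euclidean_distance` and the choice of metric are illustrative and are not confirmed by the source.

```python
import math
from typing import Sequence


def euclidean_distance(
    reference: Sequence[float], features: Sequence[float]
) -> float:
    """Distance l between a gender reference feature set and a user's
    feature set, assuming (not confirmed by the source) a Euclidean metric.

    Position i of both sequences must hold the same kind of feature.
    """
    if len(reference) != len(features):
        raise ValueError("both feature sets must have the same dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(reference, features)))
```

Under this assumption a smaller distance means a closer match, so a distance used as a "similarity" has to be compared in the opposite direction from an ordinary similarity score.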
In an embodiment, the second feature acquisition module 403 is configured to collect, at a preset frequency over a historical time period, the gender-discriminative multi-dimensional features exhibited by the user of unknown gender while using applications.
In an embodiment, the gender reference feature set includes a male reference feature set and a female reference feature set;
the user gender identification module 404 is configured to:
obtain a first similarity between the feature set and the male reference feature set;
obtain a second similarity between the feature set and the female reference feature set;
compare the first similarity with the second similarity;
if the first similarity is greater than the second similarity, identify the user of unknown gender as a male user;
if the first similarity is less than the second similarity, identify the user of unknown gender as a female user; and
if the first similarity is equal to the second similarity, produce no identification result.
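The three-way comparison above can be sketched as follows. The function name `identify_by_comparison` and the use of `None` to represent "no identification result" are assumptions for illustration; the similarities are taken as already computed, with a larger value meaning a closer match.

```python
from typing import Optional


def identify_by_comparison(
    first_similarity: float, second_similarity: float
) -> Optional[str]:
    """Compare the similarity to the male reference set (first) with the
    similarity to the female reference set (second).

    Returns "male", "female", or None when the two are equal, which is
    how this sketch represents "no identification result".
    """
    if first_similarity > second_similarity:
        return "male"
    if first_similarity < second_similarity:
        return "female"
    return None
```

Returning an explicit empty result on a tie differs from the simpler embodiment, in which every case that is not "greater than" falls through to female.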
The terms "module" and "unit" as used herein may be regarded as software objects executed on the computing system. The different components, modules, engines, and services described herein may be regarded as objects implemented on the computing system. The apparatus and method described herein may be implemented in software or, of course, in hardware, both of which fall within the scope of protection of the present application. For the steps performed by each module of the user gender identification apparatus, refer to the method steps described in the foregoing method embodiments. The user gender identification apparatus may be integrated in an electronic device such as a mobile phone or a tablet computer.
In specific implementations, each of the above modules may be implemented as an independent entity, or the modules may be combined arbitrarily and implemented as one or several entities. For the specific implementation of each module, refer to the foregoing embodiments; details are not repeated here.
As can be seen from the above, in the user gender identification apparatus of this embodiment, the first feature acquisition module 401 obtains the gender-discriminative multi-dimensional features exhibited by a plurality of sample users while using applications, yielding the sample feature sets of the plurality of sample users; the feature set generation module 402 obtains the average feature value of each kind of feature across the plurality of sample feature sets, yielding the gender reference feature set; the second feature acquisition module 403 obtains the gender-discriminative multi-dimensional features exhibited by the user of unknown gender while using applications, yielding the feature set of the user of unknown gender; and the user gender identification module 404 obtains the similarity between the feature set and the gender reference feature set and predicts the gender of the user of unknown gender according to the similarity, thereby achieving accurate identification of the user's gender and obtaining the user's gender information.
An embodiment of the present application further provides an electronic device. Referring to FIG. 5, the electronic device 500 includes a processor 501 and a memory 502. The processor 501 is electrically connected to the memory 502.
The processor 501 is the control center of the electronic device 500. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by running or loading computer programs stored in the memory 502 and invoking data stored in the memory 502, thereby achieving accurate identification of the user's gender.
The memory 502 may be used to store software programs and modules, and the processor 501 performs various functional applications and data processing by running the computer programs and modules stored in the memory 502. The memory 502 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the computer programs required for at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the electronic device. In addition, the memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Accordingly, the memory 502 may also include a memory controller to provide the processor 501 with access to the memory 502.
In this embodiment of the present application, the processor 501 in the electronic device 500 loads the instructions corresponding to the processes of one or more computer programs into the memory 502 according to the following steps, and runs the computer programs stored in the memory 502 to implement various functions, as follows:
obtaining the gender-discriminative multi-dimensional features exhibited by a plurality of sample users while using applications, to obtain the sample feature sets of the plurality of sample users;
obtaining the average feature value of each kind of feature across the plurality of sample feature sets, to obtain a gender reference feature set;
obtaining the gender-discriminative multi-dimensional features exhibited by a user of unknown gender while using applications, to obtain the feature set of the user of unknown gender; and
obtaining the similarity between the feature set and the gender reference feature set, and identifying the gender of the user of unknown gender according to the similarity.
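The four steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions: the sample feature sets are taken to be already normalized and grouped by known gender, the similarity is assumed to be a negated Euclidean distance (the source only says the similarity may be a distance), and the names `build_reference_set`, `similarity`, and `identify` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def build_reference_set(sample_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average each kind of feature across the sample feature sets."""
    if not sample_sets:
        raise ValueError("at least one sample feature set is required")
    dims = len(sample_sets[0])
    if any(len(s) != dims for s in sample_sets):
        raise ValueError("all sample feature sets must have the same dimensions")
    return [sum(s[d] for s in sample_sets) / len(sample_sets) for d in range(dims)]


def similarity(reference: Sequence[float], features: Sequence[float]) -> float:
    """Assumed similarity: negated Euclidean distance (larger is closer)."""
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(reference, features)))


def identify(features: Sequence[float], references: Dict[str, Sequence[float]]) -> str:
    """Return the label of the reference set most similar to the features."""
    return max(references, key=lambda label: similarity(references[label], features))


# Hypothetical two-dimensional sample data, for illustration only.
references = {
    "male": build_reference_set([[1.0, 0.0], [0.5, 0.5]]),
    "female": build_reference_set([[0.0, 1.0], [0.5, 0.5]]),
}
print(identify([0.9, 0.1], references))
```

Averaging reduces each gender's samples to a single reference point, so identification costs one distance computation per reference set rather than one per sample user.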
In some embodiments, where the gender reference feature set includes a male reference feature set and a female reference feature set, when obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity, the processor 501 may specifically perform the following steps:
obtaining a first similarity between the feature set of the user of unknown gender and the male reference feature set;
obtaining a second similarity between the feature set of the user of unknown gender and the female reference feature set;
comparing the first similarity with the second similarity; and
if the first similarity is greater than the second similarity, identifying the user of unknown gender as a male user; otherwise, identifying the user of unknown gender as a female user.
In some embodiments, where the gender reference feature set is a male reference feature set, when obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity, the processor 501 may specifically perform the following steps:
obtaining the distance between the feature set of the user of unknown gender and the male reference feature set, and using that distance as the similarity between the feature set and the male reference feature set;
determining whether the distance between the feature set and the male reference feature set lies within a first preset distance interval; and
if so, identifying the user of unknown gender as a male user; otherwise, identifying the user of unknown gender as a female user.
In some embodiments, where the gender reference feature set is a female reference feature set, when obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity, the processor 501 may specifically perform the following steps:
obtaining the distance between the feature set of the user of unknown gender and the female reference feature set, and using that distance as the similarity between the feature set and the female reference feature set;
determining whether the distance lies within a second preset distance interval; and
if so, identifying the user of unknown gender as a female user; otherwise, identifying the user of unknown gender as a male user.
In some embodiments, when obtaining the gender-discriminative multi-dimensional features exhibited by a plurality of sample users while using applications to obtain the sample feature sets of the plurality of sample users, the processor 501 may further specifically perform the following steps:
obtaining the gender-discriminative multi-dimensional features exhibited by the plurality of sample users while using applications; and
normalizing the obtained multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
In some embodiments, the similarity between the feature set and the gender reference feature set includes the distance between the feature set and the gender reference feature set.
In some embodiments, when obtaining the similarity between the feature set and the gender reference feature set, the processor 501 may specifically perform the following step:
calculating the distance between the feature set and the gender reference feature set by the following formula:
Figure PCTCN2018116713-appb-000026
where l denotes the distance between the feature set and the gender reference feature set; Xn denotes a one-dimensional feature in the gender reference feature set; the symbol shown in Figure PCTCN2018116713-appb-000027 denotes a one-dimensional feature in the feature set of the user of unknown gender; for the same value of n, Xn and the symbol shown in Figure PCTCN2018116713-appb-000028 correspond to the same kind of feature; and n is a positive integer greater than 2.
In some embodiments, when obtaining the gender-discriminative multi-dimensional features exhibited by the user of unknown gender while using applications, the processor 501 may specifically perform the following step:
collecting, at a preset frequency over a historical time period, the gender-discriminative multi-dimensional features exhibited by the user of unknown gender while using applications.
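Collection at a preset frequency over a historical time period can be sketched as generating evenly spaced sampling instants across the period. The text does not specify the period, the frequency, or whether the endpoints are included, so the function name `sampling_times` and the inclusive-endpoint convention below are assumptions.

```python
from datetime import datetime, timedelta
from typing import List


def sampling_times(
    start: datetime, end: datetime, period: timedelta
) -> List[datetime]:
    """Return the instants at which features would be collected, from
    start to end inclusive, spaced one period apart."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += period
    return times
```

A feature snapshot taken at each returned instant would then form, or be aggregated into, the feature set of the user of unknown gender; how snapshots are combined is not stated in the text.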
In some embodiments, where the gender reference feature set includes a male reference feature set and a female reference feature set, when obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity, the processor 501 may further specifically perform the following steps:
obtaining a first similarity between the feature set and the male reference feature set;
obtaining a second similarity between the feature set and the female reference feature set;
comparing the first similarity with the second similarity;
if the first similarity is greater than the second similarity, identifying the user of unknown gender as a male user;
if the first similarity is less than the second similarity, identifying the user of unknown gender as a female user; and
if the first similarity is equal to the second similarity, producing no identification result.
As can be seen from the above, the embodiment of the present application first obtains the charging feature set at the time each charging behavior occurs, yielding a plurality of charging feature sets; then performs similarity identification on the plurality of charging feature sets, yielding similar charging feature sets; then predicts the next charging behavior according to the similar charging feature sets; then determines the corresponding performance adjustment mode according to the predicted next charging behavior; and finally performs a performance adjustment operation according to the determined performance adjustment mode, thereby dynamically adjusting the performance of the electronic device itself and meeting the user's actual usage needs.
Referring also to FIG. 6, in some embodiments the electronic device 500 may further include a display 503, a radio frequency circuit 504, an audio circuit 505, and a power supply 506. The display 503, the radio frequency circuit 504, the audio circuit 505, and the power supply 506 are each electrically connected to the processor 501.
The display 503 may be used to display information entered by the user or information provided to the user, as well as various graphical user interfaces, which may be composed of graphics, text, icons, video, and any combination thereof. The display 503 may include a display panel; in some embodiments, the display panel may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The radio frequency circuit 504 may be used to transmit and receive radio frequency signals, so as to establish wireless communication with a network device or another electronic device and to exchange signals with the network device or the other electronic device.
The audio circuit 505 may be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.
The power supply 506 may be used to supply power to the various components of the electronic device 500. In some embodiments, the power supply 506 may be logically connected to the processor 501 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.
Although not shown in FIG. 8, the electronic device 500 may further include a camera, a Bluetooth module, and the like, which are not described here.
An embodiment of the present application further provides a storage medium storing a computer program which, when run on a computer, causes the computer to perform the user gender identification method of any of the above embodiments, for example: obtaining the gender-discriminative multi-dimensional features exhibited by a plurality of sample users while using applications, to obtain the sample feature sets of the plurality of sample users; obtaining the average feature value of each kind of feature across the plurality of sample feature sets, to obtain a gender reference feature set; obtaining the gender-discriminative multi-dimensional features exhibited by a user of unknown gender while using applications, to obtain the feature set of the user of unknown gender; and obtaining the similarity between the feature set and the gender reference feature set, and identifying the gender of the user of unknown gender according to the similarity.
In the embodiments of the present application, the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
It should be noted that, for the user gender identification method of the embodiments of the present application, a person of ordinary skill in the art will understand that all or part of the process of implementing the method may be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, for example in the memory of an electronic device, and executed by at least one processor in the electronic device; its execution may include the process of the embodiments of the user gender identification method. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
For the user gender identification apparatus of the embodiments of the present application, the functional modules may be integrated in one processing chip, each module may exist physically on its own, or two or more modules may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The user gender identification method, apparatus, storage medium, and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. A person skilled in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (19)

  1. A user gender identification method, comprising:
    obtaining the gender-discriminative multi-dimensional features exhibited by a plurality of sample users while using applications, to obtain the sample feature sets of the plurality of sample users;
    obtaining the average feature value of each kind of feature across the plurality of sample feature sets, to obtain a gender reference feature set;
    obtaining the gender-discriminative multi-dimensional features exhibited by a user of unknown gender while using applications, to obtain the feature set of the user of unknown gender; and
    obtaining the similarity between the feature set and the gender reference feature set, and identifying the gender of the user of unknown gender according to the similarity.
  2. The user gender identification method according to claim 1, wherein the gender reference feature set comprises a male reference feature set and a female reference feature set, and the step of obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity comprises:
    obtaining a first similarity between the feature set and the male reference feature set;
    obtaining a second similarity between the feature set and the female reference feature set;
    comparing the first similarity with the second similarity; and
    if the first similarity is greater than the second similarity, identifying the user of unknown gender as a male user; otherwise, identifying the user of unknown gender as a female user.
  3. The user gender identification method according to claim 1, wherein the gender reference feature set is a male reference feature set, and the step of obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity comprises:
    obtaining the distance between the feature set and the male reference feature set, and using the distance as the similarity between the feature set and the male reference feature set;
    determining whether the distance lies within a first preset distance interval; and
    if so, identifying the user of unknown gender as a male user; otherwise, identifying the user of unknown gender as a female user.
  4. The user gender identification method according to claim 1, wherein the gender reference feature set is a female reference feature set, and the step of obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity comprises:
    obtaining the distance between the feature set and the female reference feature set, and using the distance as the similarity between the feature set and the female reference feature set;
    determining whether the distance lies within a second preset distance interval; and
    if so, identifying the user of unknown gender as a female user; otherwise, identifying the user of unknown gender as a male user.
  5. The user gender identification method according to claim 1, wherein the step of obtaining the gender-discriminative multi-dimensional features exhibited by a plurality of sample users while using applications, to obtain the sample feature sets of the plurality of sample users, comprises:
    obtaining the gender-discriminative multi-dimensional features exhibited by the plurality of sample users while using applications; and
    normalizing the obtained multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
  6. The user gender identification method according to claim 1, wherein the similarity between the feature set and the gender reference feature set comprises the distance between the feature set and the gender reference feature set.
  7. The user gender identification method according to claim 6, wherein the step of obtaining the similarity between the feature set and the gender reference feature set comprises:
    calculating the distance between the feature set and the gender reference feature set by the following formula:
    Figure PCTCN2018116713-appb-100001
    where l denotes the distance between the feature set and the gender reference feature set; Xn denotes a one-dimensional feature in the gender reference feature set; the symbol shown in Figure PCTCN2018116713-appb-100002 denotes a one-dimensional feature in the feature set of the user of unknown gender; for the same value of n, Xn and the symbol shown in Figure PCTCN2018116713-appb-100003 correspond to the same kind of feature; and n is a positive integer greater than 2.
  8. The user gender identification method according to claim 1, wherein the step of obtaining the gender-discriminative multi-dimensional features exhibited by the user of unknown gender while using applications comprises: collecting, at a preset frequency over a historical time period, the gender-discriminative multi-dimensional features exhibited by the user of unknown gender while using applications.
  9. The user gender identification method according to claim 1, wherein the gender reference feature set comprises a male reference feature set and a female reference feature set, and the step of obtaining the similarity between the feature set and the gender reference feature set and identifying the gender of the user of unknown gender according to the similarity comprises:
    obtaining a first similarity between the feature set and the male reference feature set;
    obtaining a second similarity between the feature set and the female reference feature set;
    comparing the first similarity with the second similarity;
    if the first similarity is greater than the second similarity, identifying the user of unknown gender as a male user;
    if the first similarity is less than the second similarity, identifying the user of unknown gender as a female user; and
    若所述第一相似度等所述第二相似度,则无识别结果。10、一种用户性别识别装置,其中,包括:If the first similarity is the second similarity, there is no recognition result. 10. A user gender identification device, comprising:
    第一特征获取模块,用于获取多个样本用户在应用使用过程中具有性别识别性的多维特征,得到所述多个样本用户的样本特征集合;a first feature acquiring module, configured to acquire a multi-dimensional feature of the plurality of sample users that has gender recognition during use of the application, and obtain a sample feature set of the plurality of sample users;
    特征集合生成模块,用于获取多个样本特征集合中同类特征的平均特征值,得到性别参考特征集合;a feature set generating module, configured to obtain an average feature value of a similar feature in the plurality of sample feature sets, to obtain a gender reference feature set;
    第二特征获取模块,用于获取未知性别用户在应用使用过程中具有性别识别性的多维特征,得到所述未知性别用户的特征集合;a second feature acquisition module, configured to acquire a multi-dimensional feature that is gender-recognized by an unknown gender user during application use, and obtain a feature set of the unknown gender user;
    用户性别识别模块,用于获取所述特征集合与所述性别参考特征集合的相似度,并根据所述相似度识别所述未知性别用户的性别。The user gender identification module is configured to acquire the similarity between the feature set and the gender reference feature set, and identify the gender of the unknown gender user according to the similarity.
  11. The user gender identification device of claim 10, wherein the gender reference feature set comprises a male reference feature set and a female reference feature set, and the user gender identification module is further configured to:
    obtain a first similarity between the feature set and the male reference feature set;
    obtain a second similarity between the feature set and the female reference feature set;
    compare the first similarity with the second similarity; and
    if the first similarity is greater than the second similarity, identify the user of unknown gender as a male user; otherwise, identify the user of unknown gender as a female user.
  12. The user gender identification device of claim 10, wherein the gender reference feature set is a male reference feature set, and the user gender identification module is further configured to:
    obtain a distance between the feature set and the male reference feature set, and use the distance as the similarity between the feature set and the male reference feature set;
    determine whether the distance falls within a first preset distance interval; and
    if so, identify the user of unknown gender as a male user; otherwise, identify the user of unknown gender as a female user.
  13. The user gender identification device of claim 10, wherein the gender reference feature set is a female reference feature set, and the user gender identification module is further configured to:
    obtain a distance between the feature set and the female reference feature set, and use the distance as the similarity between the feature set and the female reference feature set;
    determine whether the distance falls within a second preset distance interval; and
    if so, identify the user of unknown gender as a female user; otherwise, identify the user of unknown gender as a male user.
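The single-reference decision in the two preceding claims (one reference feature set plus a preset distance interval) can be sketched as below. The claims do not fix the distance measure or whether the interval is closed; Euclidean distance via `math.dist` and a closed interval are assumptions, and `identify_by_interval` is a hypothetical name.

```python
import math


def identify_by_interval(features, reference, interval, inside_label, outside_label):
    """Return inside_label if the distance to the reference lies in the interval.

    Assumptions for illustration: Euclidean distance, closed interval [low, high].
    """
    low, high = interval
    distance = math.dist(features, reference)
    return inside_label if low <= distance <= high else outside_label


# Male reference set with a first preset distance interval of [0, 5].
result = identify_by_interval([0.0, 0.0], [3.0, 4.0], (0.0, 5.0), "male", "female")
```

With a female reference feature set, the same helper would be called with the labels swapped and the second preset distance interval.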
  14. The user gender identification device of claim 10, wherein the first feature acquisition module is further configured to:
    acquire gender-discriminative multi-dimensional features of a plurality of sample users during application use; and
    normalize the acquired multi-dimensional features to obtain the sample feature sets of the plurality of sample users.
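The claim requires normalization but does not say which kind. One common choice is min-max scaling of each feature type to the range [0, 1]; the sketch below assumes that choice, and `min_max_normalize` is a hypothetical name.

```python
def min_max_normalize(feature_sets):
    """Scale each feature type (column) to [0, 1] across all users.

    Assumption: min-max scaling. A column whose values are all equal carries
    no information and is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*feature_sets))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in feature_sets
    ]


# Raw features on different scales, e.g. minutes of use and a launch count.
raw = [[10, 2], [20, 2], [30, 2]]
normalized = min_max_normalize(raw)
```

Normalizing first keeps a feature with a large numeric range from dominating a later distance computation.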
  15. The user gender identification device of claim 10, wherein the similarity between the feature set and the gender reference feature set comprises a distance between the feature set and the gender reference feature set.
  16. The user gender identification device of claim 15, wherein the user gender identification module is configured to calculate the distance between the feature set and the gender reference feature set by the following formula:
    Figure PCTCN2018116713-appb-100004 (formula image, not reproduced in this text)
    where l denotes the distance between the feature set and the gender reference feature set; Xn denotes a one-dimensional feature in the gender reference feature set; the symbol shown in Figure PCTCN2018116713-appb-100005 denotes a one-dimensional feature in the feature set of the user of unknown gender; for the same value of n, Xn and the symbol shown in Figure PCTCN2018116713-appb-100006 correspond to the same type of feature; and n is a positive integer greater than 2.
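The formula itself survives only as an image reference, so its exact form cannot be confirmed from this text. As an illustration of a distance over paired same-type features, the sketch below assumes the ordinary Euclidean distance; that choice and the name `euclidean_distance` are assumptions, not the claimed formula.

```python
import math


def euclidean_distance(reference, features):
    """Distance l between a reference feature set and a user's feature set.

    Assumption: Euclidean distance, l = sqrt(sum over n of (X_n - Y_n)^2),
    where X_n and Y_n are same-type features at position n.
    """
    if len(reference) != len(features):
        raise ValueError("feature sets must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(reference, features)))


distance = euclidean_distance([0.0, 0.0, 0.0], [2.0, 3.0, 6.0])
```

Under this assumption a smaller distance means the user's feature set is closer to the reference feature set.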
  17. The user gender identification device of claim 10, wherein the second feature acquisition module is configured to collect, at a preset frequency over a historical time period, gender-discriminative multi-dimensional features of the user of unknown gender during application use.
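Collection at a preset frequency over a historical time period amounts to sampling at evenly spaced instants. The sketch below only computes those instants; the function name `sampling_times`, the inclusive end point, and the example window are assumptions.

```python
from datetime import datetime, timedelta


def sampling_times(start, end, interval):
    """List the instants from start to end (inclusive) spaced by interval."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += interval
    return times


# A one-hour historical window sampled every 30 minutes.
times = sampling_times(
    datetime(2017, 12, 1, 0, 0),
    datetime(2017, 12, 1, 1, 0),
    timedelta(minutes=30),
)
```

Each instant would correspond to one collection of the user's multi-dimensional features.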
  18. The user gender identification device of claim 10, wherein the gender reference feature set comprises a male reference feature set and a female reference feature set, and the user gender identification module is configured to:
    obtain a first similarity between the feature set and the male reference feature set;
    obtain a second similarity between the feature set and the female reference feature set;
    compare the first similarity with the second similarity;
    if the first similarity is greater than the second similarity, identify the user of unknown gender as a male user;
    if the first similarity is less than the second similarity, identify the user of unknown gender as a female user; and
    if the first similarity is equal to the second similarity, produce no identification result.
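The three-way comparison, including the tie case with no result, can be sketched as follows. The function name and the use of `None` for "no identification result" are assumptions for the example; the similarity scores are taken as given, with higher meaning more similar.

```python
from typing import Optional


def identify_gender(first_similarity: float, second_similarity: float) -> Optional[str]:
    """Compare similarity to the male and female reference feature sets.

    first_similarity: similarity to the male reference feature set.
    second_similarity: similarity to the female reference feature set.
    Returns None when the two are equal (no identification result).
    """
    if first_similarity > second_similarity:
        return "male"
    if first_similarity < second_similarity:
        return "female"
    return None


outcomes = [identify_gender(0.9, 0.4), identify_gender(0.2, 0.7), identify_gender(0.5, 0.5)]
```

If a distance is used as the similarity measure, it would first need to be turned into a score in which larger means closer, since a smaller distance indicates a better match.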
  19. A storage medium having a computer program stored thereon, wherein the computer program, when run on a computer, causes the computer to perform the user gender identification method of any one of claims 1 to 9.
  20. An electronic device, comprising a processor and a memory, the memory storing a computer program, wherein the processor, by invoking the computer program, is configured to perform the user gender identification method of any one of claims 1 to 9.
PCT/CN2018/116713 2017-12-22 2018-11-21 User gender identification method, apparatus, storage medium, and electronic device WO2019120024A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711405392.X 2017-12-22
CN201711405392.XA CN110020167B (en) 2017-12-22 2017-12-22 User gender identification method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2019120024A1 (en) 2019-06-27

Family

ID=66993038

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/116713 WO2019120024A1 (en) 2017-12-22 2018-11-21 User gender identification method, apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN110020167B (en)
WO (1) WO2019120024A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110851759B (en) * 2019-10-31 2022-11-29 上海连尚网络科技有限公司 Method and equipment for identifying gender of new user
CN111506819B (en) * 2020-04-24 2023-05-16 成都安易迅科技有限公司 Hardware equipment recommendation method and device, server and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101388080A (en) * 2008-10-23 2009-03-18 北京航空航天大学 Passerby gender classification method based on multi-angle information fusion
CN102663001A (en) * 2012-03-15 2012-09-12 华南理工大学 Automatic blog writer interest and character identifying method based on support vector machine
US20120259619A1 (en) * 2011-04-06 2012-10-11 CitizenNet, Inc. Short message age classification
CN105320948A (en) * 2015-11-19 2016-02-10 北京文安科技发展有限公司 Image based gender identification method, apparatus and system
CN105654131A (en) * 2015-12-30 2016-06-08 小米科技有限责任公司 Classification model training method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015145683A1 (en) * 2014-03-27 2015-10-01 株式会社日立システムズ Atm user assistance device
KR102291039B1 (en) * 2015-03-25 2021-08-19 한국전자통신연구원 personalized sports service providing method and apparatus thereof
CN105069016A (en) * 2015-07-13 2015-11-18 小米科技有限责任公司 Photograph album management method, photograph album management apparatus and terminal equipment
CN105574512A (en) * 2015-12-21 2016-05-11 小米科技有限责任公司 Method and device for processing image
CN106897727A (en) * 2015-12-21 2017-06-27 百度在线网络技术(北京)有限公司 A kind of user's gender identification method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101388080A (en) * 2008-10-23 2009-03-18 北京航空航天大学 Passerby gender classification method based on multi-angle information fusion
US20120259619A1 (en) * 2011-04-06 2012-10-11 CitizenNet, Inc. Short message age classification
CN102663001A (en) * 2012-03-15 2012-09-12 华南理工大学 Automatic blog writer interest and character identifying method based on support vector machine
CN105320948A (en) * 2015-11-19 2016-02-10 北京文安科技发展有限公司 Image based gender identification method, apparatus and system
CN105654131A (en) * 2015-12-30 2016-06-08 小米科技有限责任公司 Classification model training method and device

Also Published As

Publication number Publication date
CN110020167B (en) 2022-01-07
CN110020167A (en) 2019-07-16

Similar Documents

Publication Publication Date Title
WO2018014717A1 (en) Method and device for clustering and electronic equipment
WO2019120019A1 (en) User gender prediction method and apparatus, storage medium and electronic device
US9256693B2 (en) Recommendation system with metric transformation
WO2018192496A1 (en) Trend information generation method and device, storage medium and electronic device
WO2020156389A1 (en) Information pushing method and device
CN109918669B (en) Entity determining method, device and storage medium
WO2019062414A1 (en) Method and apparatus for managing and controlling application program, storage medium, and electronic device
WO2017000109A1 (en) Search method, search apparatus, user equipment, and computer program product
CN108961267B (en) Picture processing method, picture processing device and terminal equipment
CN110909209B (en) Live video searching method and device, equipment, server and storage medium
WO2021120875A1 (en) Search method and apparatus, terminal device and storage medium
WO2019085743A1 (en) User gender identification method and apparatus, and storage medium and electronic device
CN111783039B (en) Risk determination method, risk determination device, computer system and storage medium
WO2019120024A1 (en) User gender identification method, apparatus, storage medium, and electronic device
CN115271931A (en) Credit card product recommendation method and device, electronic equipment and medium
CN112926310A (en) Keyword extraction method and device
CN111090877A (en) Data generation method, data acquisition method, corresponding devices and storage medium
CN111782913A (en) Method and device for determining brand intention words
WO2019141143A1 (en) Method and apparatus for mining relationship between articles and recommending article, computation device and storage medium
CN109726726B (en) Event detection method and device in video
CN114298123A (en) Clustering method and device, electronic equipment and readable storage medium
US10241988B2 (en) Prioritizing smart tag creation
CN108932704B (en) Picture processing method, picture processing device and terminal equipment
CN112417197B (en) Sorting method, sorting device, machine readable medium and equipment
CN115330522A (en) Credit card approval method and device based on clustering, electronic equipment and medium

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 18890706; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
    Ref country code: DE

122 Ep: PCT application non-entry in European phase
    Ref document number: 18890706; Country of ref document: EP; Kind code of ref document: A1