CN111309913A - Method for analyzing gender by name - Google Patents


Info

Publication number
CN111309913A
Authority
CN
China
Prior art keywords
gender
data
probability
model
establishing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010118259.1A
Other languages
Chinese (zh)
Inventor
王连喜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huibo Technology Co ltd
Original Assignee
Beijing Huibo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Huibo Technology Co ltd
Priority to CN202010118259.1A
Publication of CN111309913A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services
    • G06Q30/015 Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
    • G06Q30/016 After-sales
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for analyzing gender by name, which comprises the following steps: (1) acquiring basic data; (2) cleaning the database and establishing a modeling set and a verification set; (3) calculating prior probabilities; (4) establishing a Bayesian model; (5) performing probability correction according to the result; (6) substituting into the verification set; (7) verifying in practical application. In industries where commodity marketing is strongly correlated with gender, the method provided by the invention allows targeted marketing delivery according to the analysis result.

Description

Method for analyzing gender by name
Technical Field
The invention belongs to the technical field of data analysis, and particularly relates to a method for analyzing gender through names.
Background
In advertisement push, promoting commodities accurately to designated groups can improve the conversion rate. The relationship between commodity attributes and gender is particularly important, so predicting gender from a data set when the customers' names are known is key.
Customized, precisely targeted advertising is an effective means for merchants to increase sales, and the gender attribute of the people reached is widely used in precision marketing, so there is strong demand for it. The company's surveys found that, across different sales fields, different genders show obvious purchasing differences in item selection, points of emphasis, price acceptance and the like. Grounded in this reality, the technology uses a Bayesian algorithm and the company's internally accumulated data to identify users' gender effectively from the names they use when purchasing.
In the prior art, for example NFT (name prediction for gender), a data set is used to obtain the overall gender probability and the probability of a name given a gender; gender is then predicted by the Bayesian principle, and weighted gender prediction models are likewise based on the Bayesian principle. However, an insufficiently large data set strongly affects the result of such methods: a large accumulation of real gender data is required, and ordinary companies and individuals cannot build the model.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method for analyzing gender by name. The method analyzes the prior probabilities of all names and establishes a prior probability database, establishes a basic gender prediction probability model by using a Bayesian model, and corrects the basic probability model according to the positions at which Chinese characters appear in names and the combinations of characters, thereby improving model accuracy.
In order to achieve this purpose, the technical scheme adopted by the invention is as follows:
A method of analyzing gender by name, comprising the following steps:
(1) Basic data acquisition
The data are obtained from the labels recorded by cooperating merchants when they are in contact with users, and the data are real and valid;
(2) Cleaning the database and establishing a modeling set and a verification set
I. The original data are not all real names; some are pseudonyms. The data are normalized by establishing rules, and the real names are extracted as the data set;
II. The cleaned data are randomly divided into a modeling set and a verification set in a 7:3 ratio for establishing the model;
(3) Calculating prior probabilities
The data are grouped by gender, and the proportion of each Chinese character within the same gender is counted and recorded as the prior probability;
(4) Establishing a Bayesian model
The prior probabilities obtained in step (3) are substituted into the Bayesian model, and the Bayesian probability of each Chinese character with respect to gender is calculated;
(5) Making a probability correction based on the result
Through fitting, a weight is found for the Bayesian probability of each part of the name's Chinese characters, and a corrected Bayesian probability model is established;
(6) Substituting into the verification set
The established model is applied to the verification set from step (2);
(7) Practical application verification
On the premise of protecting personal privacy, the data are encrypted and delivered to the business department for verification.
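The patent does not give explicit formulas for steps (3) to (5). One plausible reading, offered here only as a sketch, is the following. The symbols are assumptions introduced for illustration: n_{c,g} is the number of times character c occurs in names of gender g in the modeling set, N_g is the total number of characters in names of gender g, P(g) is the share of gender g among the names, c_1 ... c_k are the characters of a name, and w_i is the fitted weight for position i.

```latex
\begin{align*}
P(c \mid g) &= \frac{n_{c,g}}{N_g}, \\
P(g \mid c) &= \frac{P(c \mid g)\,P(g)}{\sum_{g'} P(c \mid g')\,P(g')}, \\
S(g \mid c_1 \dots c_k) &= \frac{\sum_{i=1}^{k} w_i\,P(g \mid c_i)}{\sum_{i=1}^{k} w_i}.
\end{align*}
```

Under this reading the first line corresponds to step (3), the second to step (4), and the third to the position-weighted correction of step (5); the predicted gender is the g with the larger S. Other ways of combining the per-character probabilities are equally compatible with the text.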
In industries where commodity marketing is strongly correlated with gender, the method provided by the invention allows targeted marketing delivery according to the analysis result. The method has the following technical effects:
1. In the aspect of accurate recommendation, adding gender improves recommendation accuracy.
2. In the aspect of marketing copy, different outreach copy can be designed for different genders with differentiated key points, so that the copy is more attractive.
3. In the aspect of industry, where investigation finds that one gender is more receptive to marketing in a certain industry, marketing can focus on the more receptive gender, saving promotion expenditure.
4. In the aspect of marketing time nodes, the marketing effect for different genders differs at different time nodes. For example, women ordinarily prefer ornament commodities, but men's purchasing of them surges around Valentine's Day, so the marketing effect can be improved by targeting according to the time node.
5. In the aspect of after-sales, in most industries feminine-style after-sales service is more suitable for receiving male customers and masculine-style after-sales service is more suitable for receiving female customers, so after-sales satisfaction can be improved by targeting gender.
Drawings
FIG. 1 is a schematic flow diagram of the present invention.
Detailed Description
The specific technical scheme of the invention is described below with reference to an embodiment.
As shown in FIG. 1, the method for analyzing gender by name comprises the following steps:
(1) Basic data acquisition
The data are obtained from the labels recorded by cooperating merchants when they are in contact with users, and the data are real and valid;
(2) Cleaning the database and establishing a modeling set and a verification set
I. The original data are not all real names; some are pseudonyms (such as Rough, pig, Xiaoxian girl and the like). The data are normalized by establishing rules, and the real names are extracted as the data set;
II. The cleaned data are randomly divided into a modeling set and a verification set in a 7:3 ratio for establishing the model;
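Step (2) can be sketched in Python as follows. The patent does not disclose its cleaning rules, so the rule used here (a name is 2 to 4 Chinese characters and contains no entry from a small nickname blocklist) is an assumption, as are the names `is_real_name`, `clean`, `split_7_3` and `NICKNAME_PARTS`.

```python
import random
import re

# Assumption: a "real name" is 2 to 4 CJK characters. The patent only says
# that rules are established; it does not say which ones.
CJK_NAME = re.compile(r"[\u4e00-\u9fff]{2,4}")
# Hypothetical blocklist of nickname or honorific fragments.
NICKNAME_PARTS = ("小仙女", "宝宝", "先生", "女士")


def is_real_name(name: str) -> bool:
    """Return True if the string looks like a real Chinese name."""
    if not CJK_NAME.fullmatch(name):
        return False
    return not any(part in name for part in NICKNAME_PARTS)


def clean(records):
    """Keep only (name, gender) records whose stripped name passes the rules."""
    return [(n.strip(), g) for n, g in records if is_real_name(n.strip())]


def split_7_3(records, seed=0):
    """Shuffle the records and split them 7:3 into modeling and verification sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]
```

A fixed `seed` keeps the split reproducible, which makes later accuracy figures comparable between runs.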
(3) Calculating prior probabilities
The data are grouped by gender, and the proportion of each Chinese character within the same gender is counted and recorded as the prior probability;
(4) Establishing a Bayesian model
The prior probabilities obtained in step (3) are substituted into the Bayesian model, and the Bayesian probability of each Chinese character with respect to gender is calculated;
(5) Making a probability correction based on the result
When the result obtained is applied to the verification set, it is found that the position of a Chinese character within the name affects the accuracy of the result. Through fitting, a weight is found for the Bayesian probability of each part of the name's Chinese characters, and a corrected Bayesian probability model is established;
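Steps (3) to (5) can be illustrated with the sketch below. It is one possible interpretation, not the patent's disclosed implementation: the add-one smoothing, the weighted-average combination, the grid search used for "fitting", and all function names are assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import product

GENDERS = ("F", "M")


def train(records, smoothing=1.0):
    """Estimate P(gender) and P(char | gender) from (name, gender) pairs."""
    gender_counts = Counter(g for _, g in records)
    char_counts = defaultdict(Counter)
    for name, g in records:
        char_counts[g].update(name)  # counts every character of the name
    vocab = {c for g in GENDERS for c in char_counts[g]}
    totals = {g: sum(char_counts[g].values()) for g in GENDERS}
    n = sum(gender_counts.values())
    p_gender = {g: gender_counts[g] / n for g in GENDERS}

    def p_char(c, g):
        # Proportion of character c within gender g (step 3).
        # Add-one smoothing for unseen characters is an assumption.
        return (char_counts[g][c] + smoothing) / (totals[g] + smoothing * (len(vocab) + 1))

    return p_gender, p_char


def char_posterior(c, p_gender, p_char):
    """Bayes' rule for one character: P(gender | char) (step 4)."""
    joint = {g: p_char(c, g) * p_gender[g] for g in GENDERS}
    z = sum(joint.values())
    return {g: joint[g] / z for g in GENDERS}


def predict(name, p_gender, p_char, weights):
    """Position-weighted average of per-character posteriors (step 5).

    weights[i] applies to position i; the last weight is reused for longer
    names. All weights are assumed to be strictly positive.
    """
    score = dict.fromkeys(GENDERS, 0.0)
    for i, c in enumerate(name):
        w = weights[min(i, len(weights) - 1)]
        post = char_posterior(c, p_gender, p_char)
        for g in GENDERS:
            score[g] += w * post[g]
    z = sum(score.values())
    return {g: score[g] / z for g in GENDERS}


def fit_weights(records, p_gender, p_char, grid=(0.25, 0.5, 1.0), positions=3):
    """Grid-search the position weights that classify the most records correctly."""
    best_w, best_hits = None, -1
    for w in product(grid, repeat=positions):
        hits = 0
        for name, g in records:
            post = predict(name, p_gender, p_char, w)
            hits += max(post, key=post.get) == g
        if hits > best_hits:
            best_w, best_hits = w, hits
    return best_w
```

A typical call sequence would be `p_gender, p_char = train(modeling_set)`, then `weights = fit_weights(labelled_records, p_gender, p_char)`, then `predict(name, p_gender, p_char, weights)`. A small weight at position 0 would down-weight the surname, which usually carries little gender information, but whether the fitted weights come out that way depends on the data.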
(6) Substituting into the verification set
The established model is applied to the verification set from step (2); the verification result is good.
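The patent reports only that the verification result is good and names no metric. A minimal sketch of how step (6) might be scored, assuming plain accuracy plus a table of (true, predicted) counts, is given below; `predict_gender` stands for any hypothetical function that maps a name to a gender label.

```python
from collections import Counter


def accuracy(predict_gender, verification_set):
    """Fraction of (name, gender) pairs whose predicted label matches the true one."""
    if not verification_set:
        raise ValueError("verification set is empty")
    hits = sum(predict_gender(name) == gender for name, gender in verification_set)
    return hits / len(verification_set)


def confusion(predict_gender, verification_set):
    """Count (true gender, predicted gender) pairs to show where errors fall."""
    return Counter((gender, predict_gender(name)) for name, gender in verification_set)
```

Looking at the confusion counts as well as the overall accuracy shows whether the errors are concentrated in one gender, which matters when the two genders are unevenly represented in the data.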
(7) Practical application verification
On the premise of protecting personal privacy, the data are encrypted and delivered to the business department for verification; the verification result shows that the model's predictions are highly accurate.

Claims (2)

1. A method for analyzing gender by name, comprising the following steps:
(1) Basic data acquisition
The data are obtained from the labels recorded by cooperating merchants when they are in contact with users, and the data are real and valid;
(2) Cleaning the database and establishing a modeling set and a verification set
(3) Calculating prior probabilities
The data are grouped by gender, and the proportion of each Chinese character within the same gender is counted and recorded as the prior probability;
(4) Establishing a Bayesian model
The prior probabilities obtained in step (3) are substituted into the Bayesian model, and the Bayesian probability of each Chinese character with respect to gender is calculated;
(5) Making a probability correction based on the result
Through fitting, a weight is found for the Bayesian probability of each part of the name's Chinese characters, and a corrected Bayesian probability model is established;
(6) Substituting into the verification set
The established model is applied to the verification set from step (2);
(7) Practical application verification
On the premise of protecting personal privacy, the data are encrypted and delivered to the business department for verification.
2. The method for analyzing gender by name as claimed in claim 1, wherein step (2) comprises the following sub-steps:
I. The original data are not all real names; some are pseudonyms. The data are normalized by establishing rules, and the real names are extracted as the data set;
II. The cleaned data are randomly divided into a modeling set and a verification set in a 7:3 ratio for establishing the model.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010118259.1A CN111309913A (en) 2020-02-26 2020-02-26 Method for analyzing gender by name

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010118259.1A CN111309913A (en) 2020-02-26 2020-02-26 Method for analyzing gender by name

Publications (1)

Publication Number Publication Date
CN111309913A true CN111309913A (en) 2020-06-19

Family

ID=71146452

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010118259.1A Pending CN111309913A (en) 2020-02-26 2020-02-26 Method for analyzing gender by name

Country Status (1)

Country Link
CN (1) CN111309913A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312905A (en) * 2021-06-23 2021-08-27 北京有竹居网络技术有限公司 Information prediction method, information prediction device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102411665A (en) * 2010-09-21 2012-04-11 腾讯科技(深圳)有限公司 Method and device for analyzing names
CN103389973A (en) * 2013-07-23 2013-11-13 安阳师范学院 Method for judging gender by utilizing Chinese name
CN104598452A (en) * 2013-10-30 2015-05-06 北京思博途信息技术有限公司 Method and device for analyzing user gender
CN110442709A (en) * 2019-06-24 2019-11-12 厦门美域中央信息科技有限公司 A kind of file classification method based on model-naive Bayesian

Similar Documents

Publication Publication Date Title
US7328169B2 (en) Method and system for purchase-based segmentation
US8571919B2 (en) System and method for identifying attributes of a population using spend level data
US20080086365A1 (en) Method of analyzing credit card transaction data
CN112418932B (en) Marketing information pushing method and device based on user tag
US20130325567A1 (en) System and method for creating a virtual coupon
US20050149398A1 (en) Media targeting system and method
US20110178849A1 (en) System and method for matching merchants based on consumer spend behavior
US20110178841A1 (en) System and method for clustering a population using spend level data
US20110178844A1 (en) System and method for using spend behavior to identify a population of merchants
US20060122886A1 (en) Media targeting system and method
CN116862592B (en) Automatic push method for SOP private marketing information based on user behavior
Lipyanina et al. Targeting Model of HEI Video Marketing based on Classification Tree.
Zheng et al. A scalable purchase intention prediction system using extreme gradient boosting machines with browsing content entropy
Saragih et al. Analysis of brand experience and brand satisfaction with brand loyalty through brand trust as a variable mediation
CN116308556A (en) Advertisement pushing method and system based on Internet of things
CN103577472A (en) Method and system for obtaining and presuming personal information as well as method and system for classifying and retrieving commodities
US20110178843A1 (en) System and method for using spend behavior to identify a population of consumers that meet a specified criteria
Huseynov et al. Behavioural segmentation analysis of online consumer audience in Turkey by using real e-commerce transaction data
CN111309913A (en) Method for analyzing gender by name
CN116777562A (en) Electronic commerce AI system based on big data
CN116797290A (en) Intelligent advertisement delivery system and method thereof
CN111539782A (en) Merchant information data processing method and system based on deep learning
Mohamad et al. To what extent are credibility and attractiveness of social media influencer important in developing positive brand image and customer attitude?
KR102404247B1 (en) Customer management system
CN116091171A (en) Member statistics and management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200619