CN106503094A - A kind of user preference analysis method based on document - Google Patents

A document-based user preference analysis method

Info

Publication number
CN106503094A
Authority
CN
China
Prior art keywords
document
user
value
user preference
style
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610896081.7A
Other languages
Chinese (zh)
Inventor
张强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shirui Electronics Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shirui Electronics Co Ltd
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd and Guangzhou Shirui Electronics Co Ltd
Priority to CN201610896081.7A
Publication of CN106503094A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/93: Document management systems

Abstract

The invention discloses a document-based user preference analysis method, comprising: receiving a user document; extracting the information corresponding to each dimension of the user document according to a preset document analysis model and performing a calculation to obtain the user's total preference value, wherein the document analysis model is H(x) = V1·S1 + V2·S2 + … + Vn·Sn, H(x) denotes the user's total preference value for a particular aspect, V1, V2, …, Vn denote the user preference values of the individual dimensions of that aspect, S1, S2, …, Sn denote the preset weight coefficients of the corresponding dimensions, and n ≥ 2; and updating the saved total preference value of the user with the calculated total preference value. The disclosed method fills the gap in behavior preference analysis in the document field and provides a new source for personalized recommendation service resources.

Description

A document-based user preference analysis method
Technical field
The present invention relates to the field of user behavior preference analysis, and in particular to a document-based user preference analysis method.
Background
With the growing popularity of the Internet and the increasing abundance of online information resources, people have moved from an era of information scarcity into an era of information overload. The ever-growing volume of information poses great difficulties and challenges for both information producers and information consumers: it is increasingly hard for consumers to find the information they need in a massive pool, and increasingly hard for producers to make a product stand out and win consumers' attention. The traditional approach is to let users enter search keywords; the search engine then traverses its database according to the submitted keywords and recommends matching information. The drawback of this approach is that users must know exactly what they need and must search actively.
To provide users with recommendations efficiently, quickly, and proactively, mining the information they need from massive data and saving the time they spend obtaining useful information, personalized recommendation systems emerged. Such a system first analyzes user behavior preferences from user behavior data; the usual method is to build a user preference model through user behavior analysis, converting user behavior into user preferences. At present, most user behavior preference modeling takes place in information search engines, online shopping platforms, and the like, and analyzes preferences from search keywords, searched information, purchased product information, purchase reviews, and similar data.
In developing the present invention, the inventor found that there is currently no document-based user behavior preference analysis method or model on the market. Yet the documents that users spend a great deal of time and effort crafting contain a large amount of information about their behavior preferences: choices such as the document font, the document color settings, and the use of artistic text (WordArt-style characters) all reflect a user's preferences. This leaves a gap in user behavior preference analysis for Internet documents and wastes Internet document resources.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a document-based user preference analysis method that enriches the data sources available for user behavior preference analysis, improves the utilization of Internet document resources, and fills the gap in user behavior preference analysis in the document field.
To achieve the above purpose, an embodiment of the present invention provides a document-based user preference analysis method, comprising:
receiving a user document;
extracting the information corresponding to each dimension of the user document according to a preset document analysis model and performing a calculation to obtain the user's total preference value, wherein the document analysis model is H(x) = V1·S1 + V2·S2 + V3·S3 + V4·S4 + … + Vn·Sn, H(x) denotes the user's total preference value for a particular aspect, V1, V2, …, Vn denote the user preference values of the individual dimensions of that aspect, the user preference value of each dimension is calculated from the extracted corresponding information using a preset formula, S1, S2, …, Sn denote the preset weight coefficients of the corresponding dimensions, and n ≥ 2; and
updating the saved total preference value of the user with the calculated total preference value.
Compared with the prior art, the embodiment of the present invention divides the user document into multiple dimensions according to the preset document analysis model, extracts the information corresponding to each dimension, and calculates the user preference value of each dimension from that information using preset formulas, thereby obtaining and updating the user's total preference value for a particular aspect (for example, document design style). Once the latest total preference value has been obtained by this method, it can be used to provide matching personalized service resources (for example, service resources that fit the user's preferred document design style). The method therefore enriches the data sources for user behavior preference analysis, improves the utilization of Internet document resources, fills the gap in user behavior preference analysis in the document field, and provides a new source for personalized recommendation service resources.
As an improvement of the above scheme, the method further comprises the step of:
when pushing service resources, pushing service resources that match the user's preference in the particular aspect according to the updated total preference value.
As an improvement of the above scheme, the particular aspect includes document design style; the document design style includes three dimensions, namely document font, document color, and document artistic text; and the corresponding information of the dimensions in the user document includes document font information, document color information, and document artistic text information.
As an improvement of the above scheme, a maximum threshold and a minimum threshold of the user's total preference value for document design style are preset. The maximum threshold corresponds to an extremely complex document design style, the minimum threshold corresponds to an extremely simple document design style, and each value between the two thresholds corresponds to a document design style between those two extremes.
As an improvement of the above scheme, the document font information is extracted and the corresponding user preference value is calculated by the following steps:
obtaining the font of each character in the document, counting the number of characters that use each font, and calculating the average thickness value of the characters in the document by the formula P = (a1·b1 + a2·b2 + … + ai·bi) / (b1 + b2 + … + bi), where a1, a2, …, ai denote the thickness values of the individual fonts, b1, b2, …, bi denote the number of characters using the corresponding font, P denotes the average thickness value of the characters in the document, and i ≥ 1;
calculating the user preference value of the document font by the formula P(f) = P·Q, where P(f) denotes the user preference value of the document font and Q denotes the thickness weight coefficient, Q = |1.5 − P|; and
normalizing the user preference value P(f) into the threshold range between the maximum threshold and the minimum threshold, thereby obtaining the user preference value V1.
As an improvement of the above scheme, the document color information is extracted and the corresponding user preference value is calculated by the following steps:
taking a screenshot of each page of the document, obtaining the shade value of each pixel of the document by image processing of each screenshot, thereby obtaining the shade value of each color in the document, and calculating the user preference value of the document color by the formula P(c) = Y1·A1 + Y2·A2 + Y3·A3 + Y4·A4 + … + Yj·Aj, where P(c) denotes the user preference value of the document color, Y1, Y2, …, Yj denote the shade values of the individual colors, A1, A2, …, Aj denote the proportion of the whole document's area occupied by each shade value, and j ≥ 1; and
normalizing the user preference value P(c) into the threshold range between the maximum threshold and the minimum threshold, thereby obtaining the user preference value V2.
As an improvement of the above scheme, the proportion of the whole document's area occupied by each shade value is calculated by the following steps:
taking a screenshot of each page of the document and obtaining the shade value of each pixel of the document by image processing of each screenshot; and
counting the pixels that share the same shade value and taking the ratio of that count to the total number of pixels in the document as the proportion of the whole document's area occupied by that shade value.
As an improvement of the above scheme, the shade value of each color is calculated by the formula Yj = R·m + G·b + B·k, where Yj denotes the shade value of each color in the document, R, G, and B denote the values of the red, green, and blue channels of each pixel, and m, b, and k denote the coefficients of the respective channels.
As an improvement of the above scheme, m = 0.299, b = 0.587, and k = 0.114.
As an improvement of the above scheme, the document artistic text information is extracted and the corresponding user preference value is calculated by the following steps:
obtaining the number of characters of each artistic text style in the document and calculating the user preference value of the document artistic text by the formula P(a) = U1·C1 + U2·C2 + U3·C3 + … + Ut·Ct, where P(a) denotes the user preference value of the document artistic text, U1, U2, …, Ut denote the number of characters of each artistic text style, C1, C2, …, Ct denote the preset preference weight coefficients of the corresponding artistic text styles, and t ≥ 1; and
normalizing the user preference value P(a) into the threshold range between the maximum threshold and the minimum threshold, thereby obtaining the user preference value V3.
Description of the drawings
Fig. 1 is a flow chart of a document-based user preference analysis method in Embodiment 1 of the present invention;
Fig. 2 is a detailed flow chart of step S2 in Fig. 1;
Fig. 3 is a schematic diagram of the threshold settings for the user's total preference value for document design style in step S201 of Fig. 2;
Fig. 4 is a detailed flow chart of step S202 in Fig. 2;
Fig. 5 is a detailed flow chart of step S203 in Fig. 2;
Fig. 6 is a detailed flow chart of step S204 in Fig. 2;
Fig. 7 is a flow chart of a document-based user preference analysis method in Embodiment 2 of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flow chart of the document-based user preference analysis method provided by Embodiment 1 of the present invention. The method of Embodiment 1 includes steps S1 to S3:
S1: Receive a user document.
In a specific implementation, the received document may be one that the user creates in real time and that is synchronously saved to the cloud, one that the user creates in real time for the purpose of user preference analysis, or one that is already stored locally or in the cloud and is retrieved for user preference analysis. In this embodiment, to obtain the user's latest preferences promptly and automatically, the method preferably receives documents that the user creates in real time and that are synchronously saved to the cloud.
S2: Extract the information corresponding to each dimension of the user document according to a preset document analysis model and perform a calculation to obtain the user's total preference value, wherein the document analysis model is H(x) = V1·S1 + V2·S2 + V3·S3 + V4·S4 + … + Vn·Sn, H(x) denotes the user's total preference value for a particular aspect, V1, V2, …, Vn denote the user preference values of the individual dimensions of that aspect, the user preference value of each dimension is calculated from the extracted corresponding information using a preset formula, S1, S2, …, Sn denote the preset weight coefficients of the corresponding dimensions, and n ≥ 2.
This step provides the document-based user behavior preference analysis model and its calculation method. Specifically, the particular aspect to which the total preference value H(x) relates includes, but is not limited to, document design style, and the dimensions of document design style include, but are not limited to, document font, document color, and document artistic text. The corresponding information of the document font dimension is document font information, that of the document color dimension is document color information, and that of the document artistic text dimension is document artistic text information.
S3: Update the saved total preference value of the user with the calculated total preference value.
Specifically, each time a user's document is analyzed with the document analysis model, the newly calculated total preference value replaces the saved one, so that the saved value always reflects the latest calculation.
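Step S3 can be sketched as a simple overwrite of the stored value. The following is a minimal illustration only: the in-memory dictionary `saved_preferences`, the function name, and the user identifier are hypothetical, since the source does not say where or how the value is stored.

```python
from typing import Dict

# Hypothetical in-memory store: user id -> latest total preference value H(x).
# The source does not specify the storage mechanism; a dict is an assumption.
saved_preferences: Dict[str, float] = {}


def update_total_preference(user_id: str, new_value: float) -> float:
    """Replace the saved total preference value with the newest calculation."""
    saved_preferences[user_id] = new_value
    return saved_preferences[user_id]


# Two successive document analyses for the same (hypothetical) user:
update_total_preference("user-1", 40.0)
update_total_preference("user-1", 55.5)  # the newer result overwrites the older
```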
Embodiment 1 of the present invention divides the user document into multiple dimensions according to the preset document analysis model, extracts the information corresponding to each dimension, and calculates the user preference value of each dimension from that information using preset formulas, thereby obtaining and updating the user's total preference value for a particular aspect (for example, document design style). Once the latest total preference value has been obtained by the method of Embodiment 1, it can be used to provide matching personalized service resources (for example, service resources that fit the user's preferred document design style). The method of Embodiment 1 therefore enriches the data sources for user behavior preference analysis, improves the utilization of Internet document resources, fills the gap in user behavior preference analysis in the document field, and provides a new source for personalized recommendation service resources.
To make it easier to understand how the document analysis model in step S2 is built and how the information corresponding to each dimension is extracted from the user document, the following takes document design style as the particular aspect, with document font, document color, and document artistic text as its dimensions, and explains in detail how the user's total preference value for document design style is calculated.
Referring to Fig. 2, calculating the user's total preference value for document design style includes steps S201 to S205, wherein:
S201: Preset a maximum threshold and a minimum threshold of the user's total preference value for document design style. The maximum threshold corresponds to an extremely complex document design style, the minimum threshold corresponds to an extremely simple document design style, and each value between the two thresholds corresponds to a document design style between those two extremes.
Specifically, to represent intuitively how simple or complex a document design style is, refer to Fig. 3, which shows the threshold settings of step S201 in Fig. 2. The minimum threshold of the total preference value is set to 0 and the maximum threshold to 100. The minimum threshold 0 corresponds to the extremely simple style, the maximum threshold 100 corresponds to the extremely complex style, and each value in the interval from 0 to 100 corresponds to a document design style between the two. The specific values of the minimum and maximum thresholds can be set according to the actual situation and requirements (for example, a minimum threshold of 0 and a maximum threshold of 10) and are not specifically limited here.
S202: Extract the document font information from the document, calculate the user's preference value for the document font according to a preset formula, and normalize the calculated user preference value into the threshold range between the maximum threshold and the minimum threshold.
Specifically, referring to Fig. 4, this step can be implemented by steps S2021 to S2023, wherein:
S2021: Obtain the font of each character in the document, count the number of characters that use each font, and calculate the average thickness value of the characters in the document by the formula P = (a1·b1 + a2·b2 + … + ai·bi) / (b1 + b2 + … + bi), where a1, a2, …, ai denote the thickness values of the individual fonts, b1, b2, …, bi denote the number of characters using the corresponding font, P denotes the average thickness value of the characters in the document, and i ≥ 1.
Specifically, every font has a thickness value that is recorded in the font itself and can be read directly. Obtaining the font of each character therefore gives the thickness value of each character; counting the characters that share the same thickness value then allows the average thickness value of the characters in the document to be calculated.
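The weighted average in step S2021 can be sketched as follows. This is a minimal illustration under stated assumptions: the font names and the `font_thickness` lookup table are hypothetical, and the source does not say how the thickness value is read from a font file.

```python
from collections import Counter
from typing import Dict, Iterable


def average_thickness(char_fonts: Iterable[str],
                      font_thickness: Dict[str, float]) -> float:
    """P = (a1*b1 + ... + ai*bi) / (b1 + ... + bi).

    char_fonts lists the font of every character in the document;
    font_thickness maps a font name to its thickness value a_i
    (a hypothetical lookup table, assumed to be available).
    """
    counts = Counter(char_fonts)          # b_i: characters per font
    total_chars = sum(counts.values())
    if total_chars == 0:
        raise ValueError("document contains no characters")
    weighted = sum(font_thickness[font] * n for font, n in counts.items())
    return weighted / total_chars


# Hypothetical document: three characters in a thin font, one in a bold font.
fonts = ["ThinFont", "ThinFont", "ThinFont", "BoldFont"]
thickness = {"ThinFont": 1.5, "BoldFont": 2.0}
p = average_thickness(fonts, thickness)   # (1.5*3 + 2.0*1) / 4
```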
S2022: Calculate the user preference value of the document font by the formula P(f) = P·Q, where P(f) denotes the user preference value of the document font and Q denotes the thickness weight coefficient, Q = |1.5 − P|.
Specifically, the thickness weight coefficient Q expresses how a given average thickness value affects the document style; each thickness value corresponds to one thickness weight coefficient. Simple styles tend toward smaller font thickness values, and an average thickness value of 1.5 is defined as best matching a simple style.
S2023: Normalize the user preference value P(f) into the threshold range between the maximum threshold and the minimum threshold, thereby obtaining the user preference value V1.
This step normalizes the calculated document font preference value P(f) into the preset threshold range of the user's total preference value for document design style, so that the result for the document font dimension lies within the range of the total preference value and both values are measured on the same scale.
Specifically, every font has a thickness value within a fixed range; for example, thickness values may run from 1.5 (minimum) to 2.0 (maximum). By the formula in step S2021, the average thickness value P of the characters in the document then also lies between 1.5 and 2.0, so the user preference value P(f) obtained from P(f) = P·Q lies between 0 and 1. P(f) is then mapped from the range 0 to 1 into the preset threshold range of the user's total preference value for document design style (for example, 0 to 100), which gives the normalized user preference value V1.
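Steps S2022 and S2023 can be sketched as below. The source only says P(f) is normalized "by mapping"; the linear min-max mapping used here is an assumption, as are the function names.

```python
def font_preference(avg_thickness: float) -> float:
    """P(f) = P * Q, with the thickness weight coefficient Q = |1.5 - P|."""
    q = abs(1.5 - avg_thickness)
    return avg_thickness * q


def normalize(value: float, src_min: float, src_max: float,
              dst_min: float = 0.0, dst_max: float = 100.0) -> float:
    """Linearly map value from [src_min, src_max] to [dst_min, dst_max].

    Assumption: the source does not name the mapping; linear scaling is
    one straightforward choice.
    """
    if src_max <= src_min:
        raise ValueError("source range must have positive width")
    ratio = (value - src_min) / (src_max - src_min)
    return dst_min + ratio * (dst_max - dst_min)


# With P in [1.5, 2.0], P(f) runs from 1.5 * 0 = 0 to 2.0 * 0.5 = 1.
p_f = font_preference(1.75)         # 1.75 * 0.25
v1 = normalize(p_f, 0.0, 1.0)       # mapped onto the 0..100 threshold range
```

A thinner average (closer to 1.5) yields a value nearer the minimum threshold, which matches the description of 1.5 as the thickness that best fits a simple style.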
S203: Extract the document color information from the document, calculate the user's preference value for the document color according to a preset formula, and normalize the calculated user preference value into the threshold range between the maximum threshold and the minimum threshold.
Specifically, referring to Fig. 5, this step can be implemented by steps S2031 and S2032, wherein:
S2031: Take a screenshot of each page of the document, obtain the shade value of each pixel of the document by image processing of each screenshot, thereby obtain the shade value of each color in the document, and calculate the user preference value of the document color by the formula P(c) = Y1·A1 + Y2·A2 + Y3·A3 + Y4·A4 + … + Yj·Aj, where P(c) denotes the user preference value of the document color, Y1, Y2, …, Yj denote the shade values of the individual colors, A1, A2, …, Aj denote the proportion of the whole document's area occupied by each shade value, and j ≥ 1.
Specifically, documents in a simple style tend to have a light overall color. A screenshot is taken of each page of the document, and each screenshot is processed to obtain the shade value of each pixel and hence the shade value of each color in the document. The shade value of each color is calculated by the formula Yj = R·m + G·b + B·k, where Yj denotes the shade value of each color in the document, R, G, and B denote the values of the red, green, and blue channels of each pixel, and m, b, and k denote the coefficients of the respective channels, with m = 0.299, b = 0.587, and k = 0.114. Pixels with different RGB values have different shade values, so each shade value corresponds to one color; the shade values of the pixels therefore yield all the colors used in the document together with the shade value of each color. Counting the pixels that share each shade value gives an accumulated count, and the ratio of that count to the total number of pixels in the document is the proportion of the whole document's area occupied by that shade value.
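The calculation in step S2031 can be sketched on a plain list of RGB tuples. This is an illustration only: capturing page screenshots and decoding them into pixels is outside the sketch, and the function names are hypothetical. The coefficients 0.299, 0.587, and 0.114 are those given in the text; they coincide with the ITU-R BT.601 luma weights.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0..255


def shade_value(r: int, g: int, b: int) -> float:
    """Y = R*m + G*b + B*k with m = 0.299, b = 0.587, k = 0.114."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def area_ratios(pixels: Iterable[Pixel]) -> Dict[float, float]:
    """Map each shade value Y_j to its share A_j of all document pixels."""
    pixel_list: List[Pixel] = list(pixels)
    if not pixel_list:
        raise ValueError("no pixels to analyze")
    counts = Counter(shade_value(*px) for px in pixel_list)
    return {y: n / len(pixel_list) for y, n in counts.items()}


def color_preference(pixels: Iterable[Pixel]) -> float:
    """P(c) = Y1*A1 + Y2*A2 + ... + Yj*Aj."""
    return sum(y * a for y, a in area_ratios(pixels).items())


# Hypothetical four-pixel "document": three white pixels and one black pixel.
sample = [(255, 255, 255)] * 3 + [(0, 0, 0)]
p_c = color_preference(sample)   # roughly 255 * 0.75 + 0 * 0.25
```

Because P(c) is an area-weighted mean of shade values, it is equivalent to averaging the shade value over all pixels; grouping by shade value simply mirrors the steps described in the text.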
S2032: Normalize the user preference value P(c) into the threshold range between the maximum threshold and the minimum threshold, thereby obtaining the user preference value V2.
This step normalizes the calculated document color preference value P(c) into the preset threshold range of the user's total preference value for document design style, so that the result for the document color dimension lies within the range of the total preference value and both values are measured on the same scale.
Specifically, the RGB values of black are (0, 0, 0) and those of white are (255, 255, 255). By the formula Yj = R·m + G·b + B·k in step S2031, the shade value of each color lies between 0 and 255, and by the formula P(c) = Y1·A1 + Y2·A2 + Y3·A3 + Y4·A4 + … + Yj·Aj in step S2031, the document color preference value P(c) also lies between 0 and 255. P(c) is then mapped from the range 0 to 255 into the preset threshold range of the user's total preference value for document design style (for example, 0 to 100), which gives the normalized user preference value V2.
S204: Extract the document artistic text information from the document, calculate the user's preference value for the document artistic text according to a preset formula, and normalize the calculated user preference value into the threshold range between the maximum threshold and the minimum threshold.
Specifically, referring to Fig. 6, this step can be implemented by steps S2041 and S2042, wherein:
S2041: Obtain the number of characters of each artistic text style in the document and calculate the user preference value of the document artistic text by the formula P(a) = U1·C1 + U2·C2 + U3·C3 + … + Ut·Ct, where P(a) denotes the user preference value of the document artistic text, U1, U2, …, Ut denote the number of characters of each artistic text style, C1, C2, …, Ct denote the preset preference weight coefficients of the corresponding artistic text styles, and t ≥ 1.
Specifically, the artistic text styles used in the document and the number of characters using each style are determined. Because the visual effect of each artistic text style is fixed, each style has a corresponding style attribute, and the preset preference weight coefficient of each artistic text style is defined empirically.
S2042: Normalize the user preference value P(a) into the threshold range between the maximum threshold and the minimum threshold, thereby obtaining the user preference value V3.
This step normalizes the calculated document artistic text preference value P(a) into the preset threshold range of the user's total preference value for document design style, so that the result for the document artistic text dimension lies within the range of the total preference value and both values are measured on the same scale.
Specifically, because the number of characters in the analyzed document is fixed and finite, the artistic text styles used in the document and the number of characters using each style are also finite, and so is the range of the artistic text preference value calculated by the formula in step S2041. The range of P(a) is set to 0 to 1, where 0 corresponds to the extremely simple end and 1 indicates that all characters in the document use artistic text of the extremely complex style. P(a) is then mapped from the range 0 to 1 into the preset threshold range of the user's total preference value for document design style (for example, 0 to 100), which gives the normalized user preference value V3.
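The artistic text calculation can be sketched as follows. The style names and weight values are hypothetical. The source states that P(a) is treated as lying in 0 to 1 but does not say how the raw sum is brought into that range; dividing by the total character count with weights in [0, 1] is an assumption made here for illustration.

```python
from typing import Dict


def artistic_text_preference(char_counts: Dict[str, int],
                             weights: Dict[str, float]) -> float:
    """Raw P(a) = U1*C1 + U2*C2 + ... + Ut*Ct.

    char_counts maps an artistic text style to its character count U_t;
    weights maps the same style to its preset preference weight C_t.
    """
    return sum(n * weights[style] for style, n in char_counts.items())


def artistic_text_preference_unit(char_counts: Dict[str, int],
                                  weights: Dict[str, float],
                                  total_chars: int) -> float:
    """Scale P(a) into 0..1 by dividing by the document's character count.

    Assumption: weights lie in [0, 1], so the result is 1 only when every
    character uses a style whose weight is 1 (the most complex style).
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return artistic_text_preference(char_counts, weights) / total_chars


# Hypothetical styles and empirically chosen weights.
counts = {"shadow": 10, "outline": 5}
style_weights = {"shadow": 0.4, "outline": 1.0}
raw = artistic_text_preference(counts, style_weights)            # 10*0.4 + 5*1.0
unit = artistic_text_preference_unit(counts, style_weights, 20)  # raw / 20
v3 = unit * 100.0   # mapped onto a 0..100 threshold range
```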
S205: Substitute the user preference values V1, V2, and V3 obtained in steps S202 to S204 into the document analysis model H(x) = V1·S1 + V2·S2 + V3·S3 to calculate the user's total preference value for document design style.
In this step, H(x) denotes the user's total preference value for document design style; V1, V2, and V3 denote the user preference values for the three dimensions of document design style, namely document font, document color, and document artistic text; and S1, S2, and S3 denote the preset weight coefficients of the corresponding dimensions. Because S1, S2, and S3 are weight coefficients, their sum is 1.
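The weighted sum in step S205 can be sketched as below. The example values of V1 to V3 and the weights 0.3, 0.4, and 0.3 are hypothetical; the source only requires at least two dimensions and weights that sum to 1.

```python
from typing import Sequence


def total_preference(values: Sequence[float],
                     weights: Sequence[float]) -> float:
    """H(x) = V1*S1 + V2*S2 + ... + Vn*Sn, with n >= 2 and sum(S) == 1."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if len(values) < 2:
        raise ValueError("the model requires at least two dimensions")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weight coefficients must sum to 1")
    return sum(v * s for v, s in zip(values, weights))


# Hypothetical normalized values for font, color, and artistic text,
# each already on the 0..100 threshold range, with assumed weights.
h = total_preference([43.75, 75.0, 45.0], [0.3, 0.4, 0.3])
```

Because every V lies in the threshold range and the weights sum to 1, H(x) stays within the same range, which is what lets it be read directly against the minimum and maximum thresholds.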
Fig. 7 is a flow chart of the document-based user preference analysis method provided by Embodiment 2 of the present invention. The method includes steps S21~S24:
S21: Receive a user document.
In a specific implementation, the received document may be one that the user creates in real time and that is synchronously saved to the Internet cloud, one that the user creates in real time for the purpose of user preference analysis, or one that the user has already stored locally or in the cloud and that is retrieved for user preference analysis. In this embodiment, to obtain the user's latest preference promptly and automatically, the document that the user creates in real time and synchronously saves to the Internet cloud is preferably received for user preference analysis.
S22: Extract the corresponding information of each dimension in the user document according to a preset document analysis model and perform the calculation to obtain the total user preference value. The document analysis model is $H(x) = V_1 S_1 + V_2 S_2 + V_3 S_3 + V_4 S_4 + \dots + V_n S_n$, where $H(x)$ represents the user's total user preference value for a particular aspect; $V_1, V_2, \dots, V_n$ represent the user preference values of the dimensions of that aspect, each calculated by a preset formula from the corresponding information obtained; $S_1, S_2, \dots, S_n$ represent the preset weight coefficients of the corresponding dimensions; and $n \ge 2$.
Specifically, in this step, the particular aspect to which the total user preference value $H(x)$ relates includes but is not limited to document design style. The dimensions of document design style include but are not limited to document font, document color, and document characters in a fancy style. The corresponding information of the document font dimension includes document font information, that of the document color dimension includes document color information, and that of the characters-in-a-fancy-style dimension includes information on the document's characters in a fancy style.
S23: Update the saved total user preference value with the calculated total user preference value.
Specifically, each time a user's document is analyzed with the document analysis model, the saved total user preference value is updated: the newest calculation result replaces the total user preference value that was previously saved.
S24: When pushing service resources, push service resources that match the user's preference in the particular aspect according to the updated total user preference value.
Specifically, this step applies the total user preference value obtained by document analysis to the field of personalized recommendation services. Based on the total user preference value calculated by the document analysis model provided by Embodiment 2 of the present invention, service resources can be pushed to the user, for example resource recommendations for shopping or viewing.
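Steps S23 and S24 can be illustrated with a small sketch. The patent does not say how a service resource is matched against the preference value, so everything here is an assumption: each resource in the hypothetical `catalog` carries a style score on the same threshold scale, and `pick_resources` simply returns the resources whose score is closest to the saved value.

```python
def pick_resources(total_preference, catalog, k=2):
    """Return the names of the k resources closest to the preference value.

    catalog -- list of (name, style_score) pairs; the style score is a
               hypothetical annotation on the same scale as H(x).
    """
    ranked = sorted(catalog, key=lambda item: abs(item[1] - total_preference))
    return [name for name, _ in ranked[:k]]


# S23: the newest result replaces the saved total user preference value.
saved = {"design_style": 20.0}
saved["design_style"] = 46.0

# S24: push the resources that best match the updated value.
catalog = [
    ("minimalist poster", 10.0),
    ("plain notebook", 30.0),
    ("ornate frame", 90.0),
]
print(pick_resources(saved["design_style"], catalog))
# ['plain notebook', 'minimalist poster']
```

A nearest-score ranking is only one possible matching rule; a real recommendation service would likely combine this value with other signals.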
Steps S21~S23 of the method provided by Embodiment 2 of the present invention are the same as steps S1~S3 of the method provided by Embodiment 1, and their specific implementation is identical, so it is not repeated here.
Compared with Embodiment 1, the method provided by Embodiment 2 of the present invention adds step S24, which applies the total user preference value obtained by document analysis to the field of personalized recommendation services: based on the total user preference value calculated by the document analysis model, service resources can be pushed to the user, for example for shopping or viewing.
Compared with Embodiment 1, the method provided by Embodiment 2 of the present invention applies the total user preference value calculated by the document analysis model established by the present invention to the field of personalized recommendation services. This enriches the source data for personalized recommendation and provides a new source for personalized recommendation service resources.
The above are preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications are also regarded as falling within the protection scope of the present invention.

Claims (10)

1. A document-based user preference analysis method, characterized by comprising:
Receiving a user document;
Extracting the corresponding information of each dimension in the user document according to a preset document analysis model and performing the calculation to obtain a total user preference value; wherein the document analysis model is $H(x) = V_1 S_1 + V_2 S_2 + V_3 S_3 + V_4 S_4 + \dots + V_n S_n$, where $H(x)$ represents the user's total user preference value for a particular aspect; $V_1, V_2, \dots, V_n$ represent the user preference values of the dimensions of the particular aspect, each calculated by a preset formula from the corresponding information obtained; $S_1, S_2, \dots, S_n$ represent the preset weight coefficients of the corresponding dimensions; and $n \ge 2$;
Updating the saved total user preference value with the calculated total user preference value.
2. The document-based user preference analysis method as claimed in claim 1, characterized by further comprising the step of:
When pushing service resources, pushing service resources that match the user's preference in the particular aspect according to the updated total user preference value.
3. The document-based user preference analysis method as claimed in claim 1, characterized in that the particular aspect includes document design style; the document design style includes three dimensions, namely document font, document color, and document characters in a fancy style; and the corresponding information of each dimension in the user document includes document font information, document color information, and information on the document's characters in a fancy style.
4. The document-based user preference analysis method as claimed in claim 3, characterized in that a maximum threshold and a minimum threshold of the total user preference value for the document design style are preset; the document design style corresponding to the maximum threshold is the extremely complicated style; the document design style corresponding to the minimum threshold is the extremely simplified style; and each value between the maximum threshold and the minimum threshold corresponds to a document design style between the extremely complicated style and the extremely simplified style.
5. The document-based user preference analysis method as claimed in claim 4, characterized in that the document font information is extracted and the corresponding user preference value is calculated by the following steps:
Obtaining the font of each character in the document, counting the number of characters that use each font, and calculating the average thickness value of the characters in the document by the formula $P = (a_1 b_1 + a_2 b_2 + \dots + a_i b_i)/(b_1 + b_2 + \dots + b_i)$, where $a_1, a_2, \dots, a_i$ represent the thickness values of the specific fonts; $b_1, b_2, \dots, b_i$ represent the numbers of characters using the corresponding fonts; $P$ represents the average thickness value of the characters in the document; and $i \ge 1$;
Calculating the user preference value of the document font by the formula $P(f) = P \cdot Q$, where $P(f)$ represents the user preference value of the document font and $Q$ represents the thickness weight coefficient, $Q = |1.5 - P|$;
Normalizing the user preference value $P(f)$ into the threshold range between the maximum threshold and the minimum threshold to obtain the user preference value $V_1$.
6. The document-based user preference analysis method as claimed in claim 3, characterized in that the document color information is extracted and the corresponding user preference value is calculated by the following steps:
Taking a screenshot of each page of the document, obtaining the shade value of each pixel of the document by image processing of each screenshot, thereby obtaining the shade value of each color in the document, and calculating the user preference value of the document color by the formula $P(c) = Y_1 A_1 + Y_2 A_2 + Y_3 A_3 + Y_4 A_4 + \dots + Y_j A_j$, where $P(c)$ represents the user preference value of the document color; $Y_1, Y_2, \dots, Y_j$ represent the shade values of the colors; $A_1, A_2, \dots, A_j$ represent the ratios of the area occupied by each shade value to the whole document; and $j \ge 1$;
Normalizing the user preference value $P(c)$ into the threshold range between the maximum threshold and the minimum threshold to obtain the user preference value $V_2$.
7. The document-based user preference analysis method as claimed in claim 6, characterized in that the ratio of the area occupied by each shade value to the whole document is calculated by the following steps:
Taking a screenshot of each page of the document and obtaining the shade value of each pixel of the document by image processing of each screenshot;
Counting the pixels that have the same shade value and taking the ratio of that count to the total number of pixels in the document as the ratio of the area occupied by that shade value to the whole document.
8. The document-based user preference analysis method as claimed in claim 6, characterized in that the shade value of each color is calculated by the formula $Y_j = R m + G b + B k$, where $Y_j$ represents the shade value of each color in the document; $R$, $G$, $B$ represent the values of the RGB channels of each pixel; and $m$, $b$, $k$ represent the coefficients of the respective RGB channels.
9. The document-based user preference analysis method as claimed in claim 8, characterized in that $m = 0.299$, $b = 0.587$, $k = 0.114$.
10. The document-based user preference analysis method as claimed in claim 3, characterized in that the information on the document's characters in a fancy style is extracted and the corresponding user preference value is calculated by the following steps:
Obtaining the number of characters of each kind of characters in a fancy style in the document, and calculating the user preference value of the document's characters in a fancy style by the formula $P(a) = U_1 C_1 + U_2 C_2 + U_3 C_3 + \dots + U_t C_t$, where $P(a)$ represents the user preference value of the document's characters in a fancy style; $U_1, U_2, \dots, U_t$ represent the numbers of characters of each kind of characters in a fancy style; $C_1, C_2, \dots, C_t$ represent the preset preference weight coefficients of the corresponding kinds; and $t \ge 1$;
Normalizing the user preference value $P(a)$ into the threshold range between the maximum threshold and the minimum threshold to obtain the user preference value $V_3$.
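The per-dimension formulas in claims 5 to 10 can be sketched together as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, pixels are assumed to arrive as plain `(R, G, B)` tuples rather than from real page screenshots, and the thickness values and weight coefficients in any call are made-up inputs. The normalization into the threshold range is left out.

```python
from collections import Counter


def font_preference(thickness_values, char_counts):
    """Claim 5: P = sum(a_i * b_i) / sum(b_i), Q = |1.5 - P|, P(f) = P * Q."""
    total_chars = sum(char_counts)
    if total_chars == 0:
        raise ValueError("document has no characters")
    p = sum(a * b for a, b in zip(thickness_values, char_counts)) / total_chars
    q = abs(1.5 - p)
    return p * q


def shade_value(r, g, b):
    """Claims 8 and 9: Y = R*m + G*b + B*k with m=0.299, b=0.587, k=0.114."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def color_preference(pixels):
    """Claims 6 and 7: P(c) = sum(Y_j * A_j).

    A_j is the number of pixels sharing shade value Y_j divided by the
    total number of pixels.
    """
    if not pixels:
        raise ValueError("no pixels to analyze")
    counts = Counter(shade_value(r, g, b) for r, g, b in pixels)
    total = len(pixels)
    return sum(y * n / total for y, n in counts.items())


def fancy_preference(char_counts, weights):
    """Claim 10: P(a) = sum(U_t * C_t)."""
    return sum(u * c for u, c in zip(char_counts, weights))


# Three characters of thickness 1.0 and one of thickness 2.0.
print(font_preference([1.0, 2.0], [3, 1]))  # 0.3125
```

The coefficients 0.299, 0.587 and 0.114 in claim 9 coincide with the widely used luma weights for converting RGB to grayscale, so `shade_value` behaves like a grayscale conversion of each pixel.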
CN201610896081.7A 2016-10-13 2016-10-13 A kind of user preference analysis method based on document Pending CN106503094A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610896081.7A CN106503094A (en) 2016-10-13 2016-10-13 A kind of user preference analysis method based on document

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610896081.7A CN106503094A (en) 2016-10-13 2016-10-13 A kind of user preference analysis method based on document

Publications (1)

Publication Number Publication Date
CN106503094A true CN106503094A (en) 2017-03-15

Family

ID=58294961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610896081.7A Pending CN106503094A (en) 2016-10-13 2016-10-13 A kind of user preference analysis method based on document

Country Status (1)

Country Link
CN (1) CN106503094A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102426583A (en) * 2011-10-10 2012-04-25 北京工业大学 Chinese medicine tongue manifestation retrieval method based on image content analysis
CN103106668A (en) * 2011-11-09 2013-05-15 佳能株式会社 Method and system for describing image region based on color histogram
CN104077344A (en) * 2013-12-31 2014-10-01 河南大学 Self-adaption learning region importance based interactive image retrieval method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170315
