CN105512220B - Image page output method and device - Google Patents
- Publication number
- CN105512220B (application CN201510855907.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- recognition
- recognition result
- keyword
- text information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/43—Querying
- G06F16/435—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The disclosure relates to an image page output method and device, belonging to the field of multimedia technology. The method includes: receiving an acquisition request for a page; when the page includes an image, performing content recognition on the image to obtain a recognition result; generating, according to the recognition result, first text information that describes the content of the image; and outputting the page, the page including the first text information. Because a text description of each image in the currently browsed page is provided without downloading the image, the user can get a preliminary understanding of the image content and decide whether to open or skip the image. This makes browsing more intelligent and saves both network traffic and the time spent waiting for images to download.
Description
Technical field
The present disclosure relates to the field of multimedia technology, and in particular to an image page output method and device.
Background
With the continuous development of information technology, intelligent terminals offer more and more functions. For example, a user can browse the images in a page on an intelligent terminal. Image quality keeps improving, and a single image is now typically several hundred kilobytes or even several megabytes in size. As a result, when the intelligent terminal is connected over a mobile network, viewing the images in a page consumes a large amount of data traffic. For this reason, intelligent terminals also provide a no-image browsing mode, in which only the text content of a page is displayed and the images are not. A text-only page, however, is dull and monotonous for the user. An image page output method is therefore needed to address both the traffic consumption and the lack of vividness.
Summary of the invention
To overcome the problems in the related art, the present disclosure provides an image page output method and device.
According to a first aspect of the embodiments of the present disclosure, an image page output method is provided, the method comprising:
receiving an acquisition request for a page;
when the page includes an image, performing content recognition on the image to obtain a recognition result;
generating, according to the recognition result, first text information that describes the content of the image; and
outputting the page, the page including the first text information.
Optionally, generating, according to the recognition result, the first text information that describes the content of the image comprises:
obtaining second text information in the page that is associated with the image;
verifying the recognition result according to the second text information; and
generating, according to the verification result and the recognition result, the first text information that describes the content of the image.
Optionally, the method further comprises:
obtaining at least one preset shielding keyword; and
if the recognition result includes any shielding keyword, filtering out the recognition result; or
if the proportion of shielding keywords appearing in the recognition result exceeds a first preset threshold, filtering out the recognition result.
Optionally, before performing content recognition on the image, the method further comprises:
annotating the target objects in a plurality of sample images that include preset target objects, to obtain first-type annotated images; and
performing model training according to the first-type annotated images to obtain a first model;
and performing content recognition on the image to obtain a recognition result comprises:
performing image recognition on the image using the first model; and
when the image includes any preset target object, obtaining a first keyword that describes the target object in the image.
Optionally, before performing content recognition on the image, the method further comprises:
annotating the scenes in a plurality of sample images that include preset scenes, to obtain second-type annotated images; and
performing model training according to the second-type annotated images to obtain a second model;
and performing content recognition on the image to obtain a recognition result comprises:
performing image recognition on the image using the second model; and
when the image includes any preset scene, obtaining a second keyword that describes the scene in the image.
Optionally, before performing content recognition on the image, the method further comprises:
annotating the text in a plurality of sample images, to obtain third-type annotated images; and
performing model training according to the third-type annotated images to obtain a third model;
and performing content recognition on the image to obtain a recognition result comprises:
performing image recognition on the image using the third model; and
when the image includes text, obtaining a third keyword that describes the text in the image.
Optionally, each keyword in the recognition result corresponds to a recognition confidence, and verifying the recognition result according to the second text information comprises:
performing word segmentation on the second text information to obtain a plurality of segmented words;
for each keyword in the recognition result, determining whether the plurality of segmented words include the keyword; and
if the plurality of segmented words include the keyword, increasing the recognition confidence of the keyword according to a preset rule;
wherein the recognition confidence characterizes the probability that the keyword was correctly recognized.
Optionally, generating, according to the verification result and the recognition result, the first text information that describes the content of the image comprises:
obtaining the noun keywords in the recognition result whose recognition confidence is greater than a second preset threshold; and
composing the noun keywords into a sentence using an RNN (Recurrent Neural Network) model, and using the sentence as the first text information.
According to a second aspect of the embodiments of the present disclosure, an image page output device is provided, the device comprising:
a receiving module, configured to receive an acquisition request for a page;
an identification module, configured to perform content recognition on an image when the page includes the image, to obtain a recognition result;
a generation module, configured to generate, according to the recognition result, first text information that describes the content of the image; and
an output module, configured to output the page, the page including the first text information.
Optionally, the generation module is configured to obtain second text information in the currently browsed page that is associated with the image; verify the recognition result according to the second text information; and generate, according to the verification result and the recognition result, the first text information that describes the content of the image.
Optionally, the device further comprises:
an obtaining module, configured to obtain at least one preset shielding keyword; and
a filtering module, configured to filter out the recognition result when the recognition result includes any shielding keyword, or to filter out the recognition result when the proportion of shielding keywords appearing in the recognition result exceeds a first preset threshold.
Optionally, the device further comprises:
an annotation module, configured to annotate the target objects in a plurality of sample images that include preset target objects, to obtain first-type annotated images; and
a training module, configured to perform model training according to the first-type annotated images to obtain a first model;
wherein the identification module is configured to perform image recognition on the image using the first model and, when the image includes any preset target object, to obtain a first keyword that describes the target object in the image.
Optionally, the device further comprises:
an annotation module, configured to annotate the scenes in a plurality of sample images that include preset scenes, to obtain second-type annotated images; and
a training module, configured to perform model training according to the second-type annotated images to obtain a second model;
wherein the identification module is configured to perform image recognition on the image using the second model and, when the image includes any preset scene, to obtain a second keyword that describes the scene in the image.
Optionally, the device further comprises:
an annotation module, configured to annotate the text in a plurality of sample images, to obtain third-type annotated images; and
a training module, configured to perform model training according to the third-type annotated images to obtain a third model;
wherein the identification module is configured to perform image recognition on the image using the third model and, when the image includes text, to obtain a third keyword that describes the text in the image.
Optionally, each keyword in the recognition result corresponds to a recognition confidence, and the verification module is configured to perform word segmentation on the second text information to obtain a plurality of segmented words; for each keyword in the recognition result, determine whether the plurality of segmented words include the keyword; and, if the plurality of segmented words include the keyword, increase the recognition confidence of the keyword according to a preset rule;
wherein the recognition confidence characterizes the probability that the keyword was correctly recognized.
Optionally, the generation module is configured to obtain the noun keywords in the recognition result whose recognition confidence is greater than a second preset threshold, compose the noun keywords into a sentence using an RNN model, and use the sentence as the first text information.
According to a third aspect of the embodiments of the present disclosure, an image page output device is provided, comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to: receive an acquisition request for a page; when the page includes an image, perform content recognition on the image to obtain a recognition result; generate, according to the recognition result, first text information that describes the content of the image; and output the page, the page including the first text information.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects:
When an acquisition request for a page is received and the page is determined to include an image, content recognition is performed on the image, first text information describing the content of the image is generated from the resulting recognition result, and the page including the first text information is then output. Because a text description of each image in the currently browsed page is provided without downloading the image, the user can get a preliminary understanding of the image content and decide whether to open or skip the image. This makes browsing more intelligent and saves both network traffic and the time spent waiting for images to download.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the present invention.
Fig. 1 is a flowchart of an image page output method according to an exemplary embodiment.
Fig. 2 is a flowchart of an image page output method according to an exemplary embodiment.
Fig. 3 is a block diagram of an image page output device according to an exemplary embodiment.
Fig. 4 is a block diagram of an image page output device according to an exemplary embodiment.
Fig. 5 is a block diagram of an image page output device according to an exemplary embodiment.
Fig. 6 is a block diagram of an image page output device according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of devices and methods consistent with some aspects of the present invention, as detailed in the appended claims.
Fig. 1 is a flowchart of an image page output method according to an exemplary embodiment. As shown in Fig. 1, the method is used in an image page output device and includes the following steps.
In step 101, an acquisition request for a page is received.
In step 102, when the page includes an image, content recognition is performed on the image to obtain a recognition result.
In step 103, first text information that describes the content of the image is generated according to the recognition result.
In step 104, the page is output, the page including the first text information.
In the method provided by the embodiments of the present disclosure, when an acquisition request for a page is received and the page is determined to include an image, content recognition is performed on the image, first text information describing the content of the image is generated from the resulting recognition result, and the page including the first text information is then output. Because a text description of each image in the currently browsed page is provided without downloading the image, the user can get a preliminary understanding of the image content and decide whether to open or skip the image. This makes browsing more intelligent and saves both network traffic and the time spent waiting for images to download.
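The four steps of Fig. 1 can be sketched as a small pipeline. This is only an illustrative sketch: the `Page` class, the `output_page` function, and the stand-in `fake_recognize` and `fake_describe` callables are hypothetical names that do not appear in the disclosure, and a real recognizer would be one of the trained models described later.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical page model: body text, image URLs, generated descriptions."""
    text: str
    image_urls: List[str] = field(default_factory=list)
    image_descriptions: List[str] = field(default_factory=list)


def output_page(page: Page,
                recognize: Callable[[str], List[str]],
                describe: Callable[[List[str]], str]) -> Page:
    """Steps 102-104: recognize each image, describe it, return the page."""
    for url in page.image_urls:
        keywords = recognize(url)                           # step 102
        page.image_descriptions.append(describe(keywords))  # step 103
    return page                                             # step 104


# Stand-in recognizer and describer, for illustration only.
def fake_recognize(url: str) -> List[str]:
    return ["cat", "grass"]


def fake_describe(keywords: List[str]) -> str:
    return "Image showing: " + ", ".join(keywords)


page = output_page(Page("news text", ["a.jpg"]), fake_recognize, fake_describe)
print(page.image_descriptions)
```

Passing the recognizer and describer as callables keeps the page-handling flow separate from whichever models are actually used.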
Optionally, generating, according to the recognition result, the first text information that describes the content of the image includes:
obtaining second text information in the page that is associated with the image;
verifying the recognition result according to the second text information; and
generating, according to the verification result and the recognition result, the first text information that describes the content of the image.
Optionally, the method further includes:
obtaining at least one preset shielding keyword; and
if the recognition result includes any shielding keyword, filtering out the recognition result; or
if the proportion of shielding keywords appearing in the recognition result exceeds a first preset threshold, filtering out the recognition result.
Optionally, before content recognition is performed on the image, the method further includes:
annotating the target objects in a plurality of sample images that include preset target objects, to obtain first-type annotated images; and
performing model training according to the first-type annotated images to obtain a first model.
Performing content recognition on the image to obtain a recognition result then includes:
performing image recognition on the image using the first model; and
when the image includes any preset target object, obtaining a first keyword that describes the target object in the image.
Optionally, before content recognition is performed on the image, the method further includes:
annotating the scenes in a plurality of sample images that include preset scenes, to obtain second-type annotated images; and
performing model training according to the second-type annotated images to obtain a second model.
Performing content recognition on the image to obtain a recognition result then includes:
performing image recognition on the image using the second model; and
when the image includes any preset scene, obtaining a second keyword that describes the scene in the image.
Optionally, before content recognition is performed on the image, the method further includes:
annotating the text in a plurality of sample images, to obtain third-type annotated images; and
performing model training according to the third-type annotated images to obtain a third model.
Performing content recognition on the image to obtain a recognition result then includes:
performing image recognition on the image using the third model; and
when the image includes text, obtaining a third keyword that describes the text in the image.
Optionally, each keyword in the recognition result corresponds to a recognition confidence, and verifying the recognition result according to the second text information includes:
performing word segmentation on the second text information to obtain a plurality of segmented words;
for each keyword in the recognition result, determining whether the plurality of segmented words include the keyword; and
if the plurality of segmented words include the keyword, increasing the recognition confidence of the keyword according to a preset rule;
wherein the recognition confidence characterizes the probability that the keyword was correctly recognized.
Optionally, generating, according to the verification result and the recognition result, the first text information that describes the content of the image includes:
obtaining the noun keywords in the recognition result whose recognition confidence is greater than a second preset threshold; and
composing the noun keywords into a sentence using an RNN model, and using the sentence as the first text information.
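The keyword selection in the last option can be sketched as follows. This is an illustrative sketch under stated assumptions: each recognized keyword is assumed to arrive already tagged with a part of speech, the function names are hypothetical, and `compose_sentence` is a plain template that stands in for the RNN model, which the disclosure does not specify in detail.

```python
from typing import List, Tuple

# Each entry: (keyword, part of speech, recognition confidence).
# The part-of-speech tags are assumed to be supplied by an earlier step.
Recognition = Tuple[str, str, float]


def select_noun_keywords(results: List[Recognition],
                         second_threshold: float) -> List[str]:
    """Keep noun keywords whose confidence exceeds the second preset threshold."""
    return [kw for kw, pos, conf in results
            if pos == "noun" and conf > second_threshold]


def compose_sentence(keywords: List[str]) -> str:
    """Template stand-in for the RNN model that composes the sentence."""
    if not keywords:
        return ""
    return "The image shows " + ", ".join(keywords) + "."


results = [
    ("so-and-so", "noun", 0.80),
    ("horse", "noun", 0.90),
    ("grassland", "noun", 0.50),
    ("walking", "verb", 0.95),
]
nouns = select_noun_keywords(results, second_threshold=0.6)
first_text = compose_sentence(nouns)
print(first_text)
```

Here "grassland" is dropped for low confidence and "walking" because it is not a noun, so only the remaining two keywords reach the sentence step.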
All of the above optional technical solutions may be combined in any manner to form optional embodiments of the present disclosure, which are not described one by one here.
Fig. 2 is a flowchart of an image page output method according to an exemplary embodiment. As shown in Fig. 2, the method is used in an image page output device and includes the following steps.
In step 201, an acquisition request for a page is received; if the no-image browsing mode is set and the page includes images, content recognition is performed on each image in the page to obtain a recognition result.
The acquisition request may be triggered by a click performed by the user in a browser or application installed on the terminal. For example, a user who wants to visit a portal website to browse web pages can do so by clicking the link set for that portal in the browser. The page may be a web page, a message display page, or the like; the embodiments of the present disclosure do not specifically limit this.
In the embodiments of the present disclosure, before content recognition is performed on an image, that is, before content information is extracted from the image, a model must first be built from sample images; the trained model is then used to extract content information from the image. The content information may be a face; a famous landmark or building; text; a scene such as indoors, outdoors, a beach, grassland, or a snowfield; an animal such as a cat or a horse; and so on. The embodiments of the present disclosure do not specifically limit this. Depending on its type, content information falls into three broad classes. The first is target objects, such as faces, animals such as cats and horses, and famous landmarks or buildings. The second is text, such as the text on road signs or merchants' brand signboards. The third is scenes, such as indoors, outdoors, a beach, grassland, or a snowfield. To recognize these three classes of content well, the embodiments of the present disclosure may train three corresponding models and use them to recognize the target objects, text, and scenes in an image, as detailed below:
First, the target objects in a plurality of sample images that include preset target objects are annotated to obtain first-type annotated images; model training is performed according to the first-type annotated images to obtain a first model; and image recognition is performed on the image using the first model. When the image includes any preset target object, a first keyword that describes the target object in the image is obtained.
A plurality of sample images are collected from the network. During target annotation, the region where the target object is located in each image is manually marked with a bounding box. The sample images can be divided into a training data set and a test data set: the test data set includes a large number of images containing target objects, and the remaining images form the training data set.
During model training, a CNN (Convolutional Neural Network) model may be trained. First, for each sample image, multiple candidate target regions are extracted from the sample image, and a feature is computed for each candidate target region. All candidate target regions are clustered according to these features to obtain a specified number of classes. The parameters of the CNN model are initialized, and the initialized CNN model is used to compute the class response of each candidate region. For each candidate region, the training class to which it belongs is determined from its class response. The target annotation results for the sample images, obtained in advance, are used to determine the actual class to which each candidate region belongs. The parameters of the CNN model are then optimized according to the training classes and the actual classes until the classification error of the CNN model is less than a preset threshold.
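The stopping criterion in this training procedure can be illustrated with a short sketch. It assumes, purely for illustration, that the training class of a candidate region is the class with the largest response and that the classification error is the fraction of candidate regions whose training class differs from the actual class; the disclosure does not spell out either definition, and the function names are hypothetical.

```python
from typing import List


def training_class(responses: List[float]) -> int:
    """Assumed rule: the training class is the index of the largest class response."""
    return max(range(len(responses)), key=lambda i: responses[i])


def classification_error(all_responses: List[List[float]],
                         actual_classes: List[int]) -> float:
    """Fraction of candidate regions whose training class differs from the actual class."""
    wrong = sum(1 for responses, actual in zip(all_responses, actual_classes)
                if training_class(responses) != actual)
    return wrong / len(actual_classes)


# Class responses for three candidate regions over two classes (made-up numbers).
responses = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
actual = [1, 0, 0]
error = classification_error(responses, actual)

# Parameter optimization would continue while this is False.
error_threshold = 0.05
done = error < error_threshold
```

In this made-up data one of the three regions is misclassified, so the error is one third and optimization would continue.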
After the trained CNN model is obtained, it is used to extract the target objects in each image of the page, yielding first keywords that describe the target objects. For example, if an image contains small animals such as a cat and a dog together with a little girl, the first keywords may be "cat, dog, girl".
It should be noted that when image recognition is performed using the CNN model, the model can output, in addition to the first keywords, the recognition confidence of each keyword, that is, the probability that the target object was correctly recognized. For example, if an image contains a cat and the recognition result is "cat", the recognition confidence is high and the recognition is correct; if the recognition result is "person", the recognition confidence is low and the recognition is wrong.
Second, the scenes in a plurality of sample images that include preset scenes are annotated to obtain second-type annotated images; model training is performed according to the second-type annotated images to obtain a second model; and image recognition is performed on the image using the second model. When the image includes any preset scene, a second keyword that describes the scene in the image is obtained.
A plurality of sample images are collected from the network. During scene region annotation, the scene region in each image is manually marked with a bounding box. The sample images can be divided into a training data set and a test data set: the test data set may include a large number of scene images, and the remaining images form the training data set. During model training, a CNN model may be trained; the embodiments of the present disclosure do not specifically limit this. The CNN training process can follow the model training process described above and is not repeated here.
After the trained CNN model is obtained, it is used to extract the scene in each image of the page, yielding second keywords that describe the scene. For example, if the scene in an image is the sea under a blue sky, the second keywords may be "sky, sea". It should be noted that when image recognition is performed using the CNN model, the model can output, in addition to the second keywords, the recognition confidence of each keyword, that is, the probability that the scene was correctly recognized. For example, if the scene in an image is the sea and the recognition result is "sea", the recognition confidence is high and the recognition is correct; if the recognition result is "grassland", the recognition confidence is low and the recognition is wrong.
Third, the text in a plurality of sample images is annotated to obtain third-type annotated images; model training is performed according to the third-type annotated images to obtain a third model; and image recognition is performed on the image using the third model. When the image includes text, a third keyword that describes the text in the image is obtained.
A plurality of sample images are collected from the network. During text region annotation, the text region in each image is manually marked with a bounding box. The sample images can be divided into a training data set and a test data set: the test data set includes a large number of text images, and the remaining images form the training data set. During model training, a CNN model or a support vector machine (SVM) classifier may be trained; the embodiments of the present disclosure do not specifically limit this. The CNN training process can follow the model training process described above and is not repeated here.
The support vector machine classifier may be trained as follows: for each sample image, a training feature vector of the sample image is obtained; among all training feature vectors, those corresponding to text images are determined; and the parameters of the SVM classifier are optimized according to the training feature vectors of the text images. After the trained third model is obtained, it is used to extract the text in each image of the page, yielding third keywords that describe the text. It should be noted that when image recognition is performed using the third model, the model can output, in addition to the third keywords, the recognition confidence of each keyword, that is, the probability that the text was correctly recognized.
In step 202, at least one preset shielding keyword is obtained, and it is determined whether the recognition result includes a shielding keyword. If the recognition result includes any shielding keyword, step 203 below is performed; if the recognition result does not include a shielding keyword, step 204 below is performed.
The shielding keywords may be configured in advance by the user and are used to automatically block, in the page browsing mode, images that the user does not want to see, for example images with unhealthy content, or celebrities or scenes that the user dislikes; the embodiments of the present disclosure do not specifically limit this. In the embodiments of the present disclosure, if an image in the currently browsed page matches a shielding keyword, the recognition result is filtered out and no description is provided for that image.
In step 203, if the recognition result includes any shielding keyword, the recognition result is filtered out.
In another embodiment, besides filtering the recognition result in the above manner, the embodiments of the present disclosure provide a further filtering manner: if the proportion of shielding keywords appearing in the recognition result exceeds a first preset threshold, the recognition result is filtered out. The first preset threshold may be 80%, 90%, or the like; the present disclosure does not specifically limit this. For example, if the recognition result is "So-and-so, grassland, horse" and the shielding keywords include all three of these keywords, the image is filtered out directly and no description is provided for it in the no-image browsing mode, because the user has no interest in this image or even dislikes it.
Filtering out the recognition result means that no description is provided for the corresponding image; the image is simply discarded and is not shown in the no-image browsing mode. A function switch for image display is provided in the browser. When the switch is on, the browser is in the image page output mode; when the switch is off, the browser is in the no-image browsing mode.
In step 204, if the recognition result does not include a shielding keyword, second text information associated with the image is obtained from the page, and the recognition result is verified according to the second text information.
The second text information is the text in the currently browsed page that is related to the image. For example, when a news article is written nowadays, it usually contains not only text content but also illustrations accompanying that text. That text content is then the second text information associated with the illustration.
In the embodiments of the present disclosure, the recognition result may be verified according to the second text information in the following manner:
Word segmentation is performed on the second text information to obtain a plurality of segmented words. For each keyword in the recognition result, it is determined whether the segmented words include the keyword. If the segmented words include the keyword, the recognition confidence of the keyword is increased according to a preset rule.
The above steps are explained below with a simple example. Suppose the second text information is "Today, on a certain beach, So-and-so shot a pictorial for a certain brand of wristwatch, including a picture of So-and-so leading a horse and walking on the beach". Word segmentation may then yield segmented words such as "today, at, certain, beach, on, So-and-so, for, certain, brand wristwatch, shoot, pictorial, among them, include, So-and-so, beach, lead, horse, walk, picture". If the recognition result includes "So-and-so, meadow, horse, seashore", the keywords "So-and-so" and "horse" both appear among the segmented words, so the probability that these two keywords were correctly recognized is much higher, and their recognition confidence is increased. Although "seashore" does not appear exactly among the segmented words, the similar word "beach" does, so its recognition confidence is also increased. The preset rule may be to raise the recognition confidence of an exactly matched keyword by 10% or 15% and that of a partially matched keyword by 5% or 10%, which is not specifically limited in the embodiments of the present disclosure.
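The verification step can be sketched as below. This is only an illustration under stated assumptions: the function name `verify_recognition` and the caller-supplied `similar` mapping (which stands in for whatever partial-match rule an implementation uses) are hypothetical, and the boost is assumed to be additive in percentage points and capped at 1.0. The text does not say whether the increase is additive or multiplicative.

```python
def verify_recognition(confidences, tokens, similar=None,
                       exact_boost=0.10, partial_boost=0.05):
    """Raise the confidence of keywords that are supported by the page text.

    confidences: dict mapping each recognized keyword to its confidence (0..1).
    tokens: segmented words obtained from the second text information.
    similar: hypothetical dict mapping a keyword to words treated as
             partial matches (for example "seashore" -> {"beach"}).
    """
    token_set = set(tokens)
    similar = similar or {}
    verified = {}
    for keyword, confidence in confidences.items():
        if keyword in token_set:
            # Exact match: the keyword itself appears in the page text.
            confidence += exact_boost
        elif token_set & set(similar.get(keyword, ())):
            # Partial match: only a similar word appears in the page text.
            confidence += partial_boost
        # Assumption: confidence is a probability, so cap it at 1.0.
        verified[keyword] = min(confidence, 1.0)
    return verified
```

In the example above, "So-and-so" and "horse" would receive the exact-match boost, "seashore" the smaller partial-match boost, and "meadow" would be left unchanged.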
In step 205, first text information describing the content of the image is generated according to the verification result and the recognition result.
In the embodiments of the present disclosure, the first text information describing the content of the image may be generated according to the verification result and the recognition result in the following manner:
The specified keywords, that is, the keywords in the recognition result whose recognition confidence is greater than a second preset threshold, are obtained. The specified keywords are composed into a sentence using an RNN model, and the sentence is used as the first text information.
The second preset threshold may be 80%, 90%, or the like, which is not specifically limited in the embodiments of the present disclosure. The RNN model is an existing, mature model that can compose the generated keywords into a sentence, for example by adding conjunctions between the keywords so that the result is easy to read; details are not repeated here. After the keywords in the recognition result are sorted by recognition confidence, the specified keywords whose recognition confidence is greater than the second preset threshold are selected. Taking the specified keywords "So-and-so, horse, seashore" as an example, the RNN model can compose them into a sentence such as "So-and-so leads a horse by the sea", which serves as the first text information.
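The keyword selection in step 205 can be sketched as follows. The names `select_keywords`, `describe_image`, and `join_compose` are hypothetical. The `compose` argument is where a trained RNN sentence generator would be plugged in; `join_compose` is only a trivial stand-in that lists the keywords and is not an RNN.

```python
def select_keywords(confidences, threshold=0.8):
    """Sort keywords by recognition confidence (highest first) and keep
    those strictly above the second preset threshold."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    return [keyword for keyword, confidence in ranked if confidence > threshold]


def join_compose(keywords):
    """Stand-in sentence composer (NOT an RNN): simply lists the keywords."""
    return "Image containing: " + ", ".join(keywords)


def describe_image(confidences, compose=join_compose, threshold=0.8):
    """Build the first text information from the verified recognition result."""
    keywords = select_keywords(confidences, threshold)
    # No keyword is confident enough: return an empty description.
    return compose(keywords) if keywords else ""
```

Keeping the composer as a parameter separates the threshold logic, which the text specifies, from the sentence model, which the text treats as an existing component.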
In step 206, the page including the first text information is output.
The first text information corresponding to each image may be displayed at the position where the image would originally have been displayed, which is not specifically limited in the embodiments of the present disclosure. Because the first text information is present, the user can get a rough understanding of the image content from the text description and then choose whether to go on to download the image or to filter it out directly, which is convenient for the user.
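One way to place each description where its image would have appeared is sketched below. This is a simplified illustration: the function name `render_text_page`, the `<span>` wrapper, and the regular-expression handling of `<img>` tags are assumptions, and a production browser would operate on the parsed document rather than on raw HTML strings. Images without a description (for example, those filtered out by shielding keywords) are removed entirely.

```python
import re
from html import escape

# Matches an <img> tag with a double-quoted src attribute (simplified).
IMG_TAG = re.compile(r'<img\b[^>]*?\bsrc="([^"]*)"[^>]*>', re.IGNORECASE)


def render_text_page(page_html, descriptions):
    """Replace each image with its first text information.

    descriptions: dict mapping an image URL (the src value) to its
    description; images missing from the dict are dropped from the page.
    """
    def replace(match):
        text = descriptions.get(match.group(1))
        if text is None:
            return ""  # filtered image: show nothing in its place
        return "<span>[" + escape(text) + "]</span>"

    return IMG_TAG.sub(replace, page_html)
```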
In the method provided by the embodiments of the present disclosure, when an acquisition request for a page is received and the page is determined to include an image, content recognition is performed on the image, first text information describing the content of the image is generated according to the obtained recognition result, and the page including the first text information is then output. Because a text description of each image in the currently browsed page is provided without downloading the image, the user can get a preliminary understanding of the image content and decide whether to open or skip the image. This makes the browsing more intelligent and saves both network traffic and the time spent waiting for images to download. In addition, images that the user does not wish to see can be shielded by setting shielding keywords, which further improves the user experience.
Fig. 3 is a block diagram of an image page output device according to an exemplary embodiment. Referring to Fig. 3, the device includes a receiving module 301, an identification module 302, a generation module 303, and an output module 304.
The receiving module 301 is configured to receive an acquisition request for a page. The identification module 302 is configured to perform content recognition on an image when the page includes the image, to obtain a recognition result. The generation module 303 is configured to generate, according to the recognition result, first text information describing the content of the image. The output module 304 is configured to output the page, the page including the first text information.
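The division of work among the four modules can be sketched as a small pipeline. This is an illustrative arrangement only: the class name `ImagePageOutputDevice`, its field names, and the choice to inject each module as a callable are assumptions, not the structure of the device in Fig. 3.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ImagePageOutputDevice:
    # Receiving side: given a page URL, return the URLs of its images.
    fetch_image_urls: Callable[[str], List[str]]
    # Identification module: image URL -> {keyword: recognition confidence}.
    recognize: Callable[[str], Dict[str, float]]
    # Generation module: recognition result -> first text information.
    generate: Callable[[Dict[str, float]], str]

    def output_descriptions(self, page_url: str) -> Dict[str, str]:
        """Output side: map each image in the page to its text description."""
        return {
            image_url: self.generate(self.recognize(image_url))
            for image_url in self.fetch_image_urls(page_url)
        }
```

Injecting the modules as callables lets the recognition models and the sentence generator be replaced or stubbed independently.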
Optionally, the generation module 303 is configured to obtain second text information associated with the image in the currently browsed page; verify the recognition result according to the second text information; and generate, according to the verification result and the recognition result, the first text information describing the content of the image.
Referring to Fig. 4, the device further includes:
an obtaining module 305, configured to obtain at least one preset shielding keyword; and
a filtering module 306, configured to filter out the recognition result when the recognition result includes any shielding keyword, or to filter out the recognition result when the proportion of shielding keywords appearing in the recognition result exceeds the first preset threshold.
Referring to Fig. 5, the device further includes:
a labeling module 307, configured to perform object labeling on a plurality of sample images that include preset target objects, to obtain first-class labeled images; and
a training module 308, configured to perform model training according to the first-class labeled images, to obtain a first model.
The identification module 302 is configured to perform image recognition on the image using the first model and, when the image includes any preset object, to obtain a first keyword describing the object in the image.
Optionally, the device further includes:
a labeling module 307, configured to perform scene labeling on a plurality of sample images that include preset scenes, to obtain second-class labeled images; and
a training module 308, configured to perform model training according to the second-class labeled images, to obtain a second model.
The identification module 302 is configured to perform image recognition on the image using the second model and, when the image includes any preset scene, to obtain a second keyword describing the scene in the image.
Optionally, the device further includes:
a labeling module 307, configured to perform text labeling on a plurality of sample images, to obtain third-class labeled images; and
a training module 308, configured to perform model training according to the third-class labeled images, to obtain a third model.
The identification module 302 is configured to perform image recognition on the image using the third model and, when the image includes text, to obtain a third keyword describing the text in the image.
Optionally, each keyword in the recognition result corresponds to a recognition confidence, and the verification module 303 is configured to perform word segmentation on the second text information to obtain a plurality of segmented words; for each keyword in the recognition result, determine whether the segmented words include the keyword; and, if the segmented words include the keyword, increase the recognition confidence of the keyword according to a preset rule.
The recognition confidence characterizes the probability that the keyword was correctly recognized.
Optionally, the generation module 303 is configured to obtain the specified keywords whose recognition confidence in the recognition result is greater than the second preset threshold, compose the specified keywords into a sentence using an RNN model, and use the sentence as the first text information.
With the device provided by the embodiments of the present disclosure, when an acquisition request for a page is received and the page is determined to include an image, content recognition is performed on the image, first text information describing the content of the image is generated according to the obtained recognition result, and the page including the first text information is then output. Because a text description of each image in the currently browsed page is provided without downloading the image, the user can get a preliminary understanding of the image content and decide whether to open or skip the image. This makes the browsing more intelligent and saves both network traffic and the time spent waiting for images to download. In addition, images that the user does not wish to see can be shielded by setting shielding keywords, which further improves the user experience.
As for the device in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and is not elaborated here.
Fig. 6 is a block diagram of a device 600 configured to output an image page according to an exemplary embodiment. For example, the device 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 6, the device 600 may include one or more of the following components: a processing component 602, a memory 604, a power supply component 606, a multimedia component 608, an audio component 610, an I/O (input/output) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls the overall operation of the device 600, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions so as to perform all or part of the steps of the methods described above. In addition, the processing component 602 may include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation of the device 600. Examples of such data include instructions for any application or method operated on the device 600, contact data, phone book data, messages, pictures, videos, and so on. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as SRAM (static random access memory), EEPROM (electrically erasable programmable read-only memory), EPROM (erasable programmable read-only memory), PROM (programmable read-only memory), ROM (read-only memory), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power supply component 606 provides power to the various components of the device 600. The power supply component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen may include an LCD (liquid crystal display) and a TP (touch panel). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a MIC (microphone), which is configured to receive external audio signals when the device 600 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 604 or sent via the communication component 616. In some embodiments, the audio component 610 further includes a speaker configured to output audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 614 includes one or more sensors configured to provide status assessments of various aspects of the device 600. For example, the sensor component 614 can detect the open/closed state of the device 600 and the relative positioning of components, for example the display and keypad of the device 600. The sensor component 614 can also detect a change in position of the device 600 or of a component of the device 600, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in temperature of the device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include an optical sensor, such as a CMOS (complementary metal-oxide-semiconductor) or CCD (charge-coupled device) image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the device 600 and other devices. The device 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes an NFC (near field communication) module to facilitate short-range communication. For example, the NFC module may be implemented based on RFID (radio frequency identification) technology, IrDA (Infrared Data Association) technology, UWB (ultra-wideband) technology, BT (Bluetooth) technology, and other technologies.
In an exemplary embodiment, the device 600 may be implemented by one or more ASICs (application-specific integrated circuits), DSPs (digital signal processors), DSPDs (digital signal processing devices), PLDs (programmable logic devices), FPGAs (field-programmable gate arrays), controllers, microcontrollers, microprocessors, or other electronic components, configured to perform the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 604 including instructions, where the instructions can be executed by the processor 620 of the device 600 to perform the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a RAM (random access memory), a CD-ROM (compact disc read-only memory), a magnetic tape, a floppy disk, an optical data storage device, or the like.
With the non-transitory computer-readable storage medium provided by the embodiments of the present disclosure, when an acquisition request for a page is received and the page is determined to include an image, content recognition is performed on the image, first text information describing the content of the image is generated according to the obtained recognition result, and the page including the first text information is then output. Because a text description of each image in the currently browsed page is provided without downloading the image, the user can get a preliminary understanding of the image content and decide whether to open or skip the image. This makes the browsing more intelligent and saves both network traffic and the time spent waiting for images to download. In addition, images that the user does not wish to see can be shielded by setting shielding keywords, which further improves the user experience.
Other embodiments of the invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (14)
1. An image page output method, characterized in that the method comprises:
receiving an acquisition request for a page;
when the page includes an image, performing content recognition on the image to obtain a recognition result;
generating, according to the recognition result, first text information describing the content of the image; and
outputting the page, the page including the first text information;
wherein generating, according to the recognition result, the first text information describing the content of the image comprises: obtaining second text information associated with the image in the page; verifying the recognition result according to the second text information; and generating, according to the verification result and the recognition result, the first text information describing the content of the image;
wherein each keyword in the recognition result corresponds to a recognition confidence, and verifying the recognition result according to the second text information comprises:
performing word segmentation on the second text information to obtain a plurality of segmented words; for each keyword in the recognition result, determining whether the plurality of segmented words include the keyword; and, if the plurality of segmented words include the keyword, increasing the recognition confidence of the keyword according to a preset rule; wherein the recognition confidence characterizes the probability of being correctly recognized.
2. The method according to claim 1, characterized in that the method further comprises:
obtaining at least one preset shielding keyword; and
if the recognition result includes any shielding keyword, filtering out the recognition result; or,
if the proportion of the shielding keywords appearing in the recognition result exceeds a first preset threshold, filtering out the recognition result.
3. The method according to claim 1, characterized in that, before performing content recognition on the image, the method further comprises:
performing object labeling on a plurality of sample images that include preset target objects, to obtain first-class labeled images; and
performing model training according to the first-class labeled images, to obtain a first model;
wherein performing content recognition on the image to obtain the recognition result comprises:
performing image recognition on the image using the first model; and
when the image includes any preset target object, obtaining a first keyword describing the object in the image.
4. The method according to claim 1, characterized in that, before performing content recognition on the image, the method further comprises:
performing scene labeling on a plurality of sample images that include preset scenes, to obtain second-class labeled images; and
performing model training according to the second-class labeled images, to obtain a second model;
wherein performing content recognition on the image to obtain the recognition result comprises:
performing image recognition on the image using the second model; and
when the image includes any preset scene, obtaining a second keyword describing the scene in the image.
5. The method according to claim 1, characterized in that, before performing content recognition on the image, the method further comprises:
performing text labeling on a plurality of sample images, to obtain third-class labeled images; and
performing model training according to the third-class labeled images, to obtain a third model;
wherein performing content recognition on the image to obtain the recognition result comprises:
performing image recognition on the image using the third model; and
when the image includes text, obtaining a third keyword describing the text in the image.
6. The method according to claim 1, characterized in that generating, according to the verification result and the recognition result, the first text information describing the content of the image comprises:
obtaining the specified keywords whose recognition confidence in the recognition result is greater than a second preset threshold; and
composing the specified keywords into a sentence using a multi-layer feedback network (RNN) model, and using the sentence as the first text information.
7. An image page output device, characterized in that the device comprises:
a receiving module, configured to receive an acquisition request for a page;
an identification module, configured to perform content recognition on an image when the page includes the image, to obtain a recognition result;
a generation module, configured to generate, according to the recognition result, first text information describing the content of the image; and
an output module, configured to output the page, the page including the first text information;
wherein the generation module is configured to obtain second text information associated with the image in the currently browsed page; verify the recognition result according to the second text information; and generate, according to the verification result and the recognition result, the first text information describing the content of the image;
wherein each keyword in the recognition result corresponds to a recognition confidence, and a verification module is configured to perform word segmentation on the second text information to obtain a plurality of segmented words; for each keyword in the recognition result, determine whether the plurality of segmented words include the keyword; and, if the plurality of segmented words include the keyword, increase the recognition confidence of the keyword according to a preset rule; wherein the recognition confidence characterizes the probability of being correctly recognized.
8. The device according to claim 7, characterized in that the device further comprises:
an obtaining module, configured to obtain at least one preset shielding keyword; and
a filtering module, configured to filter out the recognition result when the recognition result includes any shielding keyword, or to filter out the recognition result when the proportion of the shielding keywords appearing in the recognition result exceeds a first preset threshold.
9. The device according to claim 7, characterized in that the device further comprises:
a labeling module, configured to perform object labeling on a plurality of sample images that include preset target objects, to obtain first-class labeled images; and
a training module, configured to perform model training according to the first-class labeled images, to obtain a first model;
wherein the identification module is configured to perform image recognition on the image using the first model and, when the image includes any preset target object, to obtain a first keyword describing the object in the image.
10. The device according to claim 7, characterized in that the device further comprises:
a labeling module, configured to perform scene labeling on a plurality of sample images that include preset scenes, to obtain second-class labeled images; and
a training module, configured to perform model training according to the second-class labeled images, to obtain a second model;
wherein the identification module is configured to perform image recognition on the image using the second model and, when the image includes any preset scene, to obtain a second keyword describing the scene in the image.
11. The device according to claim 7, characterized in that the device further comprises:
a labeling module, configured to perform text labeling on a plurality of sample images, to obtain third-class labeled images; and
a training module, configured to perform model training according to the third-class labeled images, to obtain a third model;
wherein the identification module is configured to perform image recognition on the image using the third model and, when the image includes text, to obtain a third keyword describing the text in the image.
12. The device according to claim 7, characterized in that the generation module is configured to obtain the specified keywords whose recognition confidence in the recognition result is greater than a second preset threshold, compose the specified keywords into a sentence using a multi-layer feedback network (RNN) model, and use the sentence as the first text information.
13. An image page output device, characterized by comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to: receive an acquisition request for a page; when the page includes an image, perform content recognition on the image to obtain a recognition result; generate, according to the recognition result, first text information describing the content of the image; and output the page, the page including the first text information;
wherein generating, according to the recognition result, the first text information describing the content of the image comprises: obtaining second text information associated with the image in the page; verifying the recognition result according to the second text information; and generating, according to the verification result and the recognition result, the first text information describing the content of the image;
wherein each keyword in the recognition result corresponds to a recognition confidence, and verifying the recognition result according to the second text information comprises:
performing word segmentation on the second text information to obtain a plurality of segmented words; for each keyword in the recognition result, determining whether the plurality of segmented words include the keyword; and, if the plurality of segmented words include the keyword, increasing the recognition confidence of the keyword according to a preset rule; wherein the recognition confidence characterizes the probability of being correctly recognized.
14. A computer-readable storage medium having instructions stored thereon, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510855907.0A CN105512220B (en) | 2015-11-30 | 2015-11-30 | Image page output method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510855907.0A CN105512220B (en) | 2015-11-30 | 2015-11-30 | Image page output method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105512220A CN105512220A (en) | 2016-04-20 |
CN105512220B true CN105512220B (en) | 2018-12-11 |
Family
ID=55720202
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510855907.0A Active CN105512220B (en) | 2015-11-30 | 2015-11-30 | Image page output method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105512220B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106446782A (en) * | 2016-08-29 | 2017-02-22 | 北京小米移动软件有限公司 | Image identification method and device |
CN107590252A (en) * | 2017-09-19 | 2018-01-16 | 百度在线网络技术(北京)有限公司 | Method and device for information exchange |
US11003856B2 (en) * | 2018-02-22 | 2021-05-11 | Google Llc | Processing text using neural networks |
CN109359257A (en) * | 2018-10-09 | 2019-02-19 | 上海二三四五网络科技有限公司 | A kind of control method and control device realized in browser of mobile terminal without figure browsing |
CN110489674B (en) * | 2019-07-02 | 2020-11-06 | 百度在线网络技术(北京)有限公司 | Page processing method, device and equipment |
CN112149412A (en) * | 2020-10-23 | 2020-12-29 | 北京金和网络股份有限公司 | Catering industry service supervision method, device and system |
CN113239302A (en) * | 2021-04-23 | 2021-08-10 | 维沃移动通信(杭州)有限公司 | Page display method and device and electronic equipment |
CN115134319B (en) * | 2022-06-29 | 2024-06-21 | 维沃移动通信有限公司 | Information display method and device, electronic equipment and readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104536973A (en) * | 2014-12-03 | 2015-04-22 | 北京奇虎科技有限公司 | Picture identification method and browser client |
CN104808979A (en) * | 2014-01-28 | 2015-07-29 | 诺基亚公司 | Method and device for generating or using information associated with image contents |
CN105095498A (en) * | 2015-08-24 | 2015-11-25 | 北京旷视科技有限公司 | Information processing method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110016150A1 (en) * | 2009-07-20 | 2011-01-20 | Engstroem Jimmy | System and method for tagging multiple digital images |
Non-Patent Citations (2)
Title |
---|
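"Getting a computer to describe the content of a picture in one sentence: Google can now do it"; Anonymous; http://www.pingwest.com/google-description-photos/; 2014-11-19; pp. 1-3 * |
"What are the benefits of adding an Alt attribute or a Title attribute to images - SEO"; Anonymous; http://blog.sina.com.cn/s/blog_830edcf30101f3wb.html; 2013-02-25; pp. 1-3 * |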
Also Published As
Publication number | Publication date |
---|---|
CN105512220A (en) | 2016-04-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105512220B (en) | | Image page output method and device |
CN104219785B (en) | | Real-time video providing method, device and server, terminal device |
CN104572905B (en) | | Print reference creation method, photo searching method and device |
RU2659746C2 (en) | | Method and device for image processing |
CN106557768A (en) | | The method and device is identified by word in picture |
CN106295511B (en) | | Face tracking method and device |
CN105069083B (en) | | The determination method and device of association user |
CN104284240B (en) | | Video browsing approach and device |
KR101985955B1 (en) | | Face photo album based music playing method, apparatus and terminal device and storage medium |
CN104506443B (en) | | Router sets interface display method and device |
CN109040605A (en) | | Shoot bootstrap technique, device and mobile terminal and storage medium |
CN106331504A (en) | | Shooting method and device |
CN109871843A (en) | | Character identifying method and device, the device for character recognition |
CN105809174A (en) | | Method and device for identifying image |
CN107766820A (en) | | Image classification method and device |
CN109033991A (en) | | A kind of image-recognizing method and device |
CN110019676A (en) | | A kind of method, apparatus and equipment identifying core word in query information |
CN105933529A (en) | | Shooting picture display method and device |
CN109819288A (en) | | Determination method, apparatus, electronic equipment and the storage medium of advertisement dispensing video |
CN104980719A (en) | | Image processing method, image processing apparatus and electronic equipment |
CN109034150A (en) | | Image processing method and device |
CN111553372A (en) | | Training image recognition network, image recognition searching method and related device |
CN106293961A (en) | | Text message processing method and device |
CN109460556A (en) | | A kind of interpretation method and device |
CN110222706A (en) | | Ensemble classifier method, apparatus and storage medium based on feature reduction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||