CN104199834B - Method and system for interactively acquiring remote resources from an information carrier surface and outputting them - Google Patents
Method and system for interactively acquiring remote resources from an information carrier surface and outputting them
- Publication number
- CN104199834B CN201410377980.7A CN201410377980A
- Authority
- CN
- China
- Prior art keywords
- indication
- finger
- instruction
- user
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9554—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL] by using bar codes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/23—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on positionally close patterns or neighbourhood relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
Abstract
The present invention relates to video tracking and image recognition technology and provides a method and system for interactively acquiring remote resources from the surface of an information carrier and outputting them. The method includes the following steps: Step A, the camera of a user terminal performs target recognition on an indication body, tracks the motion trajectory and pauses of its indication-end endpoint, and derives the user's indication intent and indication region; Step B, the image-block bitmap data of the indication region are extracted according to the indication intent, and the content information contained in the bitmap data is recognized; Step C, a search expression is generated from the content information and sent to a remote server; Step D, the remote server retrieves matching multimedia resources from a specialized knowledge base by means of the search expression and sends them to the user terminal; Step E, the user terminal outputs the multimedia resources and prepares for the next user interaction. The method can not only recognize the content of an indicated region on a target information carrier surface, but can also perform associated retrieval and output.
Description
Technical field
The present invention relates to video and image acquisition and recognition technology, and more particularly to a method and system for interactively acquiring remote resources from an information carrier surface and outputting them.
Background art
Point readers and, more recently, talking pens are popular electronic learning products for children. A point reader, also known as an intelligent book-reading machine, interactive English electronic textbook, synchronous reading machine or electronic textbook, is an audio interactive learning product: using electromagnetic induction positioning technology, it turns a printed textbook into an audio teaching material that can speak wherever the learner needs it. In 1999 the U.S. company LeapFrog, drawing on how children acquire language, developed the early point-reader products, which quickly became an indispensable tool for American children at the beginning stage of language learning and later became popular in Japan, Singapore and Southeast Asian countries. In 2001 the concept and technology of the point reader were introduced into China, and a large number of companies set foot in this industry one after another. Through their presence and efforts, the point reader evolved from single-board to double-board designs, from wired to wireless and back to wired, from small-capacity storage to mass storage, from no download capability through RS232 serial download to USB download, from proprietary voice-compression chips to general-purpose MP3 compression, and from drawer-type to double-opening panels with printed content added, forming a continuous line of development.
The basic principle of a point reader is as follows: when the pronunciation files are produced, the corresponding text content is assigned in advance to coordinate positions of the pronunciation file. Not every textbook can be point-read; only textbooks whose associated audio file resources have been produced in advance and stored in the point reader's memory can be used with it. In use, the textbook must be placed at the correct position on the machine's tablet. The user first selects the book and the page number with a special pen and then points with the pen at the text content at position (X, Y) on the textbook page; the tablet senses that the pen tip has touched point (X, Y), receives the instruction, reads out the audio file corresponding to that point and plays it through the machine. In the point-reading learning mode, touching a text or picture on the teaching material with the talking pen starts the human-computer interaction: the learner can listen to the explanation of the current page content, read along, repeat, or record and compare.
In recent years an improved product, the talking pen, has appeared. Its principle is to turn the "coordinate tablet" into a transparent "coordinate plastic film". This improvement moves the coordinate-recognition work from the tablet to the talking pen itself; in use, the film only needs to be laid over a page of the book and the corresponding page number clicked, after which point-reading with sound is possible.
The working principle of the latest talking-pen learning systems is as follows: two-dimensional codes are printed on the books in advance, or an invisible magnetic-induction material is added to an inner layer of the paper; at the same time, the audio files matching the books are stored in a memory chip inside the talking pen. The pen tip can identify the two-dimensional code printed on the page or the information carried by the magnetic-induction material. In use, the user selects a page to be read and clicks the content at a specific picture, text or number on that page; the camera assembled in the pen tip recognizes the two-dimensional code on the book, or the electromagnetic induction head assembled in the pen tip reads the index information contained in the magnetic-induction material in the paper, so that the corresponding audio file is located in the memory chip and used for learning.
Both point readers and talking pens share common features: first, they need a special device or film containing positioning coordinates, or the books must be printed in advance with a special process; second, the learning resources are fixed to particular books and stored in the memory chip of the point reader or talking pen. These features make conventional point-reading products costly and inconvenient to carry, tie the learning content stored in the memory to fixed books so that it is hard to expand, and restrict the range of application. For the vast number of ordinary books and materials, point-reading products cannot point-read the graphic and text information on them.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method and system for interactively acquiring remote resources from an information carrier surface and outputting them, with the aim of solving the problems of existing point-reading products and technologies: high cost and a narrow application space.
The invention is realized in this way: a method for interactively acquiring remote resources from an information carrier surface and outputting them includes the following steps:
Step A: the user terminal performs target recognition on an indication body entering the field of view of its camera, tracks the motion trajectory and pause positions of the indication-end endpoint of the indication body, and derives the user's indication intent and indication region from the motion trajectory and pause positions; the camera of the user terminal is located above the target information carrier surface and shoots the target information carrier;
Step B: the user terminal extracts the image-block bitmap data in the indication region according to the user's indication intent and recognizes the content information contained in the image-block bitmap data;
Step C: the user terminal converts the content information into a current classification code or current search terms according to the user's indication intent; if the current search terms have been updated, it generates a search expression and sends the search expression to a remote server, otherwise it returns to step A;
Step D: the remote server retrieves the multimedia resources meeting the conditions from a specialized knowledge base by means of the search expression and sends them to the user terminal;
Step E: the user terminal outputs the received multimedia resources, then returns to step A and prepares for the next user interaction (an illustrative sketch of this interaction loop is given below).
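As a non-limiting illustration, the following Python sketch shows one possible structure of the step A-E loop on the user terminal; every callable passed in is a hypothetical stand-in for a component described above, not an interface defined by the invention.

```python
# A minimal sketch of the step A-E loop, assuming hypothetical callables for
# tracking (step A), recognition (step B), expression building (step C),
# remote retrieval (step D) and output (step E).

def interaction_loop(track_indication, recognize_content,
                     build_expression, retrieve_remote, output):
    state = {"class_code": None, "terms": []}
    while True:
        intent, region_bitmap = track_indication()              # step A
        content = recognize_content(region_bitmap, intent)      # step B
        expression = build_expression(intent, content, state)   # step C
        if expression is None:
            continue                      # no updated term -> back to step A
        resources = retrieve_remote(expression)                 # step D
        output(resources)                                        # step E
```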
Further, step A specifically includes the following steps:
Step A1: the camera of the user terminal acquires the current video image frame;
Step A2: the user terminal reads, one by one, the RGB color data of each pixel of the current video image frame acquired in step A1, calculates from the pre-established indication-body appearance-color Gaussian mixture model the probability that the pixel belongs to that model, and judges whether the pixel matches the appearance color of the indication body;
Step A3: step A2 is repeated until every pixel of the current video image frame has been processed; all pixels matching the indication-body appearance color are then selected to obtain the foreground image of the current video image frame, the foreground image is filtered to remove noise, contour detection based on mathematical morphology is performed, the largest connected region among all detected contours is chosen as the contour of the indication body, and the indication-end endpoint position of the indication body is searched for within that contour;
Step A4: according to the indication-end endpoint position of the indication body extracted from the current video image frame in step A3, motion-trajectory analysis and indication-intent interpretation are performed on the indication-end endpoint; if the user's indication intent and indication region have not yet been determined, the method returns to step A1 and continues (an illustrative per-frame sketch of steps A1-A4 is given below).
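A non-limiting per-frame sketch of steps A1-A4 follows, assuming OpenCV for capture and contour handling; match_mask, find_endpoint and analyse_trajectory are hypothetical helpers standing for the color-model test, the endpoint search and the trajectory analysis detailed in the following paragraphs.

```python
import cv2
import numpy as np

def step_a(capture, match_mask, find_endpoint, analyse_trajectory):
    trajectory = []
    while True:
        ok, frame = capture.read()                         # A1: grab a frame
        if not ok:
            continue
        mask = match_mask(frame)                           # A2: foreground pixels
        mask = cv2.medianBlur(mask, 5)                     # A3: denoise ...
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,      # ... and clean up
                                np.ones((5, 5), np.uint8))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            continue
        body = max(contours, key=cv2.contourArea)          # largest connected region
        trajectory.append(find_endpoint(body))             # indication-end endpoint
        intent, region = analyse_trajectory(trajectory)    # A4: intent and region
        if intent is not None:
            return intent, region, frame                   # otherwise back to A1
```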
Further, in step A2 the indication-body appearance-color Gaussian mixture model is divided into a finger skin-color model and a pen-shaped (lip-pencil type) indicator color model;
The indication-body appearance-color Gaussian mixture model is built on the CrCgCb color space using Gaussian-mixture modeling: several single Gaussian distributions are mixed, and the probability density G(c) of the model is computed as the weighted mixture
G(c) = Σ_{j=1..M} α_j · P_j(c, μ_j, Σ_j),
where M is the number of single Gaussian distributions contained in the model, α_j is the mixture weight of the probability density function of the j-th single Gaussian distribution, and P_j(c, μ_j, Σ_j) is defined as
P_j(c, μ_j, Σ_j) = (2π)^(-3/2) |Σ_j|^(-1/2) · exp( -(1/2) (c - μ_j)^T Σ_j^(-1) (c - μ_j) ),
where T denotes matrix transposition, c = [cr, cg, cb]^T is the three-component CrCgCb color column vector of the pixel to be evaluated, μ is the model expectation and Σ is the model variance; μ and Σ are obtained from the CrCgCb feature column vectors c_i of a number of training sample pixels as
μ = (1/n) Σ_{i=1..n} c_i (mean vector) and Σ = (1/n) Σ_{i=1..n} (c_i - μ)(c_i - μ)^T (covariance matrix),
where n is the number of training samples (a numerical sketch of evaluating this density is given below).
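As a non-limiting illustration, the following Python sketch evaluates the mixture density G(c) for one pixel; the component parameters are assumed to come from a pre-trained indication-body color model, and the foreground threshold is an illustrative value rather than a figure from the invention.

```python
import numpy as np

def gmm_density(c, weights, means, covs):
    # c: 3-component CrCgCb color vector; weights/means/covs: mixture parameters.
    c = np.asarray(c, dtype=float)
    total = 0.0
    for a_j, mu_j, sigma_j in zip(weights, means, covs):
        d = c - mu_j
        norm = (2 * np.pi) ** 1.5 * np.sqrt(np.linalg.det(sigma_j))
        expo = -0.5 * d @ np.linalg.solve(sigma_j, d)
        total += a_j * np.exp(expo) / norm
    return total

def matches_indication_body(c, model, threshold=1e-4):
    # A pixel is treated as indication-body foreground when its density exceeds
    # a threshold; the threshold value here is purely illustrative.
    return gmm_density(c, model["weights"], model["means"], model["covs"]) > threshold
```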
Each calculating parameter of the finger skin-color model is obtained by training through the following steps G01-G02:
Step G01: finger skin pixel RGB values collected in advance from test persons of different genders and ages, under a variety of illumination conditions and with cameras of different models, are used as observation sample values; maximum-likelihood estimation is carried out with the expectation-maximization algorithm to determine each calculating parameter of the Gaussian-mixture probability density function of the finger skin-color model;
Step G02: before the user applies step A, RGB data of the user's own finger skin color are collected under the user's current working environment; these values are used as new observation sample values, the expectation-maximization algorithm is applied again for maximum-likelihood estimation, each calculating parameter of the Gaussian-mixture probability density function of the finger skin-color model determined in step G01 is re-estimated, and the calculating parameters of the finger skin-color model are updated according to the result of this re-estimation training;
Each calculating parameter of the pen-shaped indicator color model is obtained by training through the following steps G11-G12:
Step G11: pen-shaped indicator appearance-color pixel RGB values collected in advance under a variety of illumination conditions and with cameras of different models are used as observation sample values; maximum-likelihood estimation is carried out with the expectation-maximization algorithm to determine each calculating parameter of the Gaussian-mixture probability density function of the pen-shaped indicator color model;
Step G12: before the user applies step A, appearance-color RGB data of the pen-shaped indicator are collected under the user's current working environment; these values are used as new observation sample values, the expectation-maximization algorithm is applied again for maximum-likelihood estimation, each calculating parameter of the Gaussian-mixture probability density function of the pen-shaped indicator color model determined in step G11 is re-estimated, and the calculating parameters of the pen-shaped indicator color model are updated according to the result of this re-estimation training (a two-stage training sketch follows).
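The two-stage training (offline estimation in G01/G11, on-site re-estimation in G02/G12) can be sketched with scikit-learn's GaussianMixture as the EM implementation; the library choice and the component count are assumptions, not specified by the invention.

```python
from sklearn.mixture import GaussianMixture

def train_offline(samples, n_components=4):
    # G01/G11: fit the mixture on an N x 3 array of CrCgCb observation samples
    # collected under varied illumination and cameras.
    gmm = GaussianMixture(n_components=n_components, covariance_type="full")
    gmm.fit(samples)
    return gmm

def adapt_on_site(gmm, user_samples):
    # G02/G12: re-estimate the parameters with samples captured in the user's
    # current environment, starting from the offline parameters.
    refined = GaussianMixture(
        n_components=gmm.n_components,
        covariance_type="full",
        weights_init=gmm.weights_,
        means_init=gmm.means_,
        precisions_init=gmm.precisions_,
    )
    refined.fit(user_samples)
    return refined
```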
Further, in step A3 the processing of the indication-body contour is divided into the processing of a finger contour and the processing of a pen-shaped indicator contour;
When the finger contour is processed and the finger fingertip indication position is determined, the following steps A301-A304 are carried out:
Step A301: a template sub-image is defined and an erosion operation is applied to the binarized finger contour image;
Step A302: the processed finger contour is projected onto the horizontal and vertical coordinate axes; searching from top to bottom and from left to right, the place where the projection value changes significantly is taken as the rough position of the fingertip, and a rough search window is constructed centered on this position;
Step A303: the fingertip position is predicted with a single-layer FL neural network to determine an accurate search window, computed as Y = [X | f(XW_h + β_h)]W, where X is the input vector, β_h is the bias matrix, W_h is the weight matrix from the input layer to the hidden layer, and W is the pre-trained output weight matrix; in the training strategy the weight matrix is trained on left-to-right and right-to-left horizontal lines, clockwise and counter-clockwise circular shapes, and clockwise and counter-clockwise rectangular shapes;
Step A304: fingertip detection is performed by template matching: several fingertip templates are defined, the absolute-value distance between the sub-image to be matched and each fingertip template is computed inside the accurate search window, and the exact fingertip position is obtained (a coarse-to-fine sketch follows);
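A non-limiting sketch of the coarse-to-fine fingertip search follows; the projection-jump threshold, the tanh activation assumed for f, and the template format are illustrative choices, not values given by the invention.

```python
import numpy as np

def coarse_fingertip(binary_mask, jump=10):
    # A302: project the eroded finger mask onto rows and columns and take the
    # first position where the projection changes sharply as the rough position.
    rows = binary_mask.sum(axis=1)
    cols = binary_mask.sum(axis=0)
    y = int(np.argmax(np.abs(np.diff(rows)) > jump))
    x = int(np.argmax(np.abs(np.diff(cols)) > jump))
    return x, y                              # center of the rough search window

def fl_predict(x_vec, w_h, beta_h, w_out):
    # A303: single-layer FL network, Y = [X | f(X*Wh + beta_h)] * W,
    # with tanh as the assumed activation f.
    hidden = np.tanh(x_vec @ w_h + beta_h)
    return np.concatenate([x_vec, hidden]) @ w_out

def refine_by_templates(window, templates):
    # A304: choose the fingertip template with the smallest absolute-value distance.
    scores = [np.abs(window - t).sum() for t in templates]
    return int(np.argmin(scores))
```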
When the pen-shaped indicator contour is processed and the indication-end endpoint position of the pen-shaped indicator is determined, the following steps A311-A312 are carried out:
Step A311: connectivity processing is applied to the binarized pen-shaped indicator contour image and the center of gravity of the connected graph is computed;
Step A312: with the center of gravity of the connected graph as the search center point, the search proceeds from this center in the order upper-left, directly above, upper-right; the Euclidean distance from the search center point to each pixel on the contour image is computed in turn, the largest Euclidean distance value is taken, and the corresponding point on the pen-shaped indicator contour image found along this maximum-distance path is taken as the final position of the pen-shaped indicator's indication-end endpoint (a sketch of this centroid-based search follows).
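A non-limiting sketch of the centroid-based endpoint search for a pen-shaped indicator follows; restricting the search to contour points above the centroid is an illustrative reading of the upper-left/top/upper-right search order.

```python
import numpy as np

def pen_tip(contour_points):
    # A311: center of gravity of the connected contour region (N x 2 points).
    pts = np.asarray(contour_points, dtype=float)
    centroid = pts.mean(axis=0)
    # A312: consider points above the centroid (smaller y in image coordinates)
    # and keep the one farthest from the centroid as the indication-end endpoint.
    above = pts[pts[:, 1] <= centroid[1]]
    if len(above) == 0:
        above = pts
    dists = np.linalg.norm(above - centroid, axis=1)
    return above[int(np.argmax(dists))]
```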
Further, step A4 specifically includes the following steps:
Step A401: the indication-end endpoint position of the indication body in the current video image frame extracted in step A3 is compared with the indication-end endpoint positions of the indication body in several preceding frames, and the movement speed and direction are calculated;
Step A402: the indication-end endpoint movement speed in each frame is analyzed in chronological order; when an obvious pause of the indication-end endpoint is detected for the first time, the coordinates of the pause location are taken as the starting indication position;
Step A403: after the starting indication position has been recognized, when another obvious pause in the movement of the indication-end endpoint is detected, the coordinates of that pause location are taken as the ending indication position; if no ending indication position has yet been detected, the method returns to step A1;
Step A404: the motion trajectory of the indication-end endpoint between the starting indication position and the ending indication position is analyzed: if the indication-end endpoint is detected to move along an approximately straight line from left to right or from right to left, the user's indication intent is a text indication, and the indication region is the image-block region occupied by the line of text above the motion trajectory of the indication-end endpoint; if the indication-end endpoint is detected to perform a closed or nearly closed movement of approximately circular shape, the user's indication intent is an icon indication, and the indication region is the image-block region contained in the rectangle inscribed in the circular motion trajectory of the indication-end endpoint; if the indication-end endpoint is detected to perform a closed or nearly closed movement of approximately rectangular shape, the user's indication intent is a graphic-code indication, and the indication region is the image-block region contained within the rectangular motion trajectory of the indication-end endpoint (a pause- and gesture-classification sketch follows).
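A non-limiting sketch of the pause detection and trajectory classification follows; the speed threshold, pause length, closure tolerance and the area-based circle/rectangle test are illustrative heuristics, not values given by the invention.

```python
import numpy as np

def find_pause(track, speed_thresh=2.0, min_frames=5):
    # A pause: the endpoint speed stays below the threshold for several frames.
    speeds = np.linalg.norm(np.diff(np.asarray(track, float), axis=0), axis=1)
    for i in range(len(speeds) - min_frames + 1):
        if np.all(speeds[i:i + min_frames] < speed_thresh):
            return i + min_frames            # frame index just after the pause
    return None

def classify_gesture(segment, closure_tol=20.0):
    seg = np.asarray(segment, dtype=float)
    if np.linalg.norm(seg[0] - seg[-1]) > closure_tol:
        return "text"                        # open, near-straight sweep
    box = np.ptp(seg[:, 0]) * np.ptp(seg[:, 1])           # bounding-box area
    x, y = seg[:, 0], seg[:, 1]
    loop = 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
    # A circular loop fills roughly pi/4 of its bounding box, a rectangular
    # loop nearly all of it; 0.9 is an illustrative cut-off.
    return "icon" if loop < 0.9 * box else "graphic_code"
```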
Further, the image-content recognition method in step B specifically includes:
if the user's indication intent determined in step A is a text indication, OCR text recognition is selected to perform content recognition on the image-block bitmap data and obtain the text and character-string information contained in it;
if the user's indication intent determined in step A is an icon indication, the image-block bitmap data are recognized with an image-recognition method, using each icon feature template of a preset icon library, and the icon index character string corresponding to the icon in the icon library is obtained;
if the user's indication intent determined in step A is a graphic-code indication, two-dimensional-code and barcode recognition methods are applied to the image-block bitmap data to obtain the character-string information contained in it (an illustrative dispatch sketch follows).
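A non-limiting dispatch sketch for step B follows, assuming pytesseract for OCR and pyzbar for one- and two-dimensional codes; these library choices and the icon matcher are illustrative, the invention does not name any particular implementation.

```python
import pytesseract                      # assumed OCR backend
from pyzbar import pyzbar               # assumed 1D/2D code decoder
from PIL import Image

def recognize(bitmap, intent, icon_matcher=None):
    # bitmap: numpy array holding the extracted image-block bitmap data.
    image = Image.fromarray(bitmap)
    if intent == "text":
        return pytesseract.image_to_string(image)
    if intent == "graphic_code":
        symbols = pyzbar.decode(image)  # handles QR codes and barcodes alike
        return symbols[0].data.decode("utf-8") if symbols else ""
    if intent == "icon":
        # icon_matcher: hypothetical callable that compares the bitmap with
        # each icon feature template and returns the icon index string.
        return icon_matcher(bitmap)
    raise ValueError("unknown indication intent")
```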
Further, the process of generating the search expression in step C specifically includes the following steps:
Step C1: according to the user's indication intent determined in step A, the content information recognized in step B is converted into search condition items; specifically:
if the user's indication intent is an icon indication or a graphic-code indication, the content information recognized in step B is used as the current classification code and all current search terms are cleared;
if the user's indication intent is a text indication, word segmentation is applied to the content information recognized in step B and the words or phrases extracted from it are used as the current search terms;
Step C2: if the current search terms have been updated, the current classification code and each current search term are combined logically to generate the search expression (a sketch of this conversion follows).
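A non-limiting sketch of steps C1-C2 follows; the whitespace segmentation and the AND-based expression syntax are illustrative assumptions.

```python
def build_expression(intent, content, state):
    if intent in ("icon", "graphic_code"):
        state["class_code"] = content      # C1: content becomes the class code
        state["terms"] = []                # ... and all current terms are cleared
        return None                        # no updated term -> no query yet
    if intent == "text":
        terms = content.split()            # crude segmentation; a real system
        if terms == state["terms"]:        # would use a proper word segmenter
            return None                    # nothing updated -> back to step A
        state["terms"] = terms
    parts = [f'term:"{t}"' for t in state["terms"]]     # C2: logical combination
    if state.get("class_code"):
        parts.insert(0, f'class:{state["class_code"]}')
    return " AND ".join(parts)
```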
The present invention also provides a system for interactively acquiring remote resources from an information carrier surface and outputting them, including: an indication-body type setting module, a video-capture and target-recognition module, an indication-body appearance-color Gaussian-mixture-model library, a motion-trajectory analysis module, an image-block extraction and content-recognition module, a search-condition generation module, a network transmission module, a remote resource retrieval module, a specialized knowledge base, and an information display or playback module;
The indication-body type setting module is connected with the video-capture and target-recognition module and is used to select the indication-body type currently used by the user; the indication-body types include fingers of different ethnic groups and pen-shaped indicators of several colors; the video-capture and target-recognition module enables the corresponding indication-body appearance-color Gaussian mixture model and indication-end endpoint search method according to the indication-body type currently selected by the indication-body type setting module;
The indication-body appearance-color Gaussian-mixture-model library contains the finger skin-color models of different ethnic groups and the pen-shaped indicator color models of several colors; each model in the library is selected by the video-capture and target-recognition module according to the indication-body type currently selected by the indication-body type setting module;
The video-capture and target-recognition module includes a camera, a foreground-image extraction unit and an indication-end endpoint locating unit; the camera is located above the target information carrier surface, captures video and performs target recognition on indication bodies entering its field of view;
The foreground-image extraction unit reads, one by one, the RGB color data of each pixel of every acquired current video image frame, calculates from the pre-established indication-body appearance-color Gaussian mixture model the probability that the pixel belongs to that model, and judges whether the pixel matches the appearance color of the indication body; after every pixel of the current video image frame has been processed, the foreground image of the current video image frame is obtained;
The indication-end endpoint locating unit filters the foreground image to remove noise, then performs contour detection based on mathematical morphology, chooses the largest connected region among all detected contours as the contour of the indication body, and searches within that contour for the indication-end endpoint position of the indication body;
The motion-trajectory analysis module is connected with the video-capture and target-recognition module and is used to track the motion trajectory and pauses of the indication-end endpoint on the indication body and to derive the user's indication intent and indication region from the motion trajectory and pauses;
The image-block extraction and content-recognition module extracts the image-block bitmap data in the indication region according to the user's indication intent and recognizes the content information contained in the image-block bitmap data with the image-content recognition method;
The search-condition generation module is connected with the image-block extraction and content-recognition module and with the network transmission module; according to the user's indication intent it either uses the content information recognized by the image-block extraction and content-recognition module as the current classification code, or applies word segmentation to it and extracts the words or phrases as the current search terms; if the current search terms have been updated, the current classification code and each current search term are combined logically to generate the search expression, and the search expression is handed to the network transmission module;
The network transmission module is connected with the remote resource retrieval module and with the information display or playback module; it sends the search expression to the remote resource retrieval module over a wired or wireless network, and also forwards the multimedia resources retrieved by the remote resource retrieval module to the information display or playback module for output;
The remote resource retrieval module is connected with the specialized knowledge base; it retrieves the multimedia resources meeting the conditions from the specialized knowledge base by means of the search expression and sends them back to the network transmission module;
The specialized knowledge base contains at least text, hypertext, audio, video, animation and three-dimensional simulation resources, and every resource is labeled at least with the keywords used for retrieval, a classification code and title information (a retrieval sketch over such a resource store follows).
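A non-limiting sketch of the retrieval step over such a labeled resource store follows; the dictionary-based store and the keyword/title matching rule are illustrative assumptions.

```python
def retrieve(class_code, terms, knowledge_base):
    # Each resource is assumed to be a dict such as
    # {"keywords": [...], "class_code": "...", "title": "...", "media": ...}.
    hits = []
    for res in knowledge_base:
        if class_code and res["class_code"] != class_code:
            continue
        if all(t in res["keywords"] or t in res["title"] for t in terms):
            hits.append(res)
    return hits
```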
Further, the indication-body appearance-color Gaussian mixture models are divided into finger skin-color models and pen-shaped indicator color models;
The indication-body appearance-color Gaussian mixture model is built on the CrCgCb color space using Gaussian-mixture modeling: several single Gaussian distributions are mixed, and the probability density G(c) of the model is computed as the weighted mixture G(c) = Σ_{j=1..M} α_j · P_j(c, μ_j, Σ_j), where M is the number of single Gaussian distributions contained in the model, α_j is the mixture weight of the probability density function of the j-th single Gaussian distribution, and P_j(c, μ_j, Σ_j) = (2π)^(-3/2) |Σ_j|^(-1/2) · exp(-(1/2)(c - μ_j)^T Σ_j^(-1) (c - μ_j)); T denotes matrix transposition, c = [cr, cg, cb]^T is the three-component CrCgCb color column vector of the pixel to be evaluated, μ is the model expectation and Σ is the model variance; μ = (1/n) Σ_{i=1..n} c_i is the mean vector and Σ = (1/n) Σ_{i=1..n} (c_i - μ)(c_i - μ)^T is the covariance matrix, obtained from the CrCgCb feature column vectors c_i of a number of training sample pixels, where n is the number of training samples;
The finger skin-color model and the pen-shaped indicator color model obtain their calculating parameters by model training on observation samples collected in advance;
When the finger skin-color model is trained, finger skin pixel RGB values collected in advance from test persons of different genders and ages, under a variety of illumination conditions and with cameras of different models, are used as observation sample values, and maximum-likelihood estimation with the expectation-maximization algorithm determines each calculating parameter of the Gaussian-mixture probability density function of the finger skin-color model;
Before the video-capture and target-recognition module uses the finger skin-color model, RGB data of the user's own finger skin color may also be collected under the user's current working environment; these values serve as new observation sample values, the expectation-maximization algorithm is applied again for maximum-likelihood estimation, and each calculating parameter of the Gaussian-mixture probability density function of the finger skin-color model is re-estimated;
When the pen-shaped indicator color model is trained, pen-shaped indicator appearance-color pixel RGB values collected in advance under a variety of illumination conditions and with cameras of different models are used as observation sample values, and maximum-likelihood estimation with the expectation-maximization algorithm determines each calculating parameter of the Gaussian-mixture probability density function of the pen-shaped indicator color model;
Before the video-capture and target-recognition module uses the pen-shaped indicator color model, appearance-color RGB data of the pen-shaped indicator may also be collected under the user's current working environment; these values serve as new observation sample values, the expectation-maximization algorithm is applied again for maximum-likelihood estimation, and each calculating parameter of the Gaussian-mixture probability density function of the pen-shaped indicator color model is re-estimated.
Further, the indication-end endpoint locating unit includes a finger indication-end endpoint locating unit and a pen-shaped indicator indication-end endpoint locating unit;
The finger indication-end endpoint locating unit first defines a template sub-image and applies an erosion operation to the binarized finger contour image; it then projects the processed finger contour onto the horizontal and vertical coordinate axes and, searching from top to bottom and from left to right, takes the place where the projection value changes significantly as the rough fingertip position and constructs a rough search window centered on this position; it then predicts the fingertip position with a single-layer FL neural network to determine an accurate search window, computed as Y = [X | f(XW_h + β_h)]W, where X is the input vector, β_h is the bias matrix, W_h is the weight matrix from the input layer to the hidden layer, and W is the pre-trained output weight matrix trained on left-to-right and right-to-left horizontal lines, clockwise and counter-clockwise circular shapes, and clockwise and counter-clockwise rectangular shapes; finally, fingertip detection is performed by template matching: several fingertip templates are defined, the absolute-value distance between the sub-image to be matched and each fingertip template is computed inside the accurate search window, and the exact fingertip position is obtained;
The pen-shaped indicator indication-end endpoint locating unit first applies connectivity processing to the binarized pen-shaped indicator contour image and computes the center of gravity of the connected graph; then, with the center of gravity as the search center point, it searches from this center in the order upper-left, directly above, upper-right, computes in turn the Euclidean distance from the search center point to each pixel on the contour image, takes the largest Euclidean distance value and, along this maximum-distance path, finds the corresponding point on the pen-shaped indicator contour image, which is taken as the final position of the pen-shaped indicator's indication-end endpoint.
Further, the motion-trajectory analysis module includes an endpoint movement speed and direction calculating unit, a starting-point locating unit, an ending-point locating unit and an indication-intent interpretation unit;
The endpoint movement speed and direction calculating unit compares the indication-end endpoint position of the indication body in the current video image frame with the indication-end endpoint positions of the indication body in several preceding frames and calculates the movement speed and moving direction;
The starting-point locating unit analyzes the movement speed of the indication-end endpoint in each frame in chronological order; when an obvious pause of the indication end is detected for the first time, the coordinates of the pause location are taken as the starting indication position;
The ending-point locating unit, after the starting indication position has been recognized, takes the coordinates of the pause location as the ending indication position when another obvious pause in the movement of the indication end is detected; if no ending indication position has yet been detected, the movement speed and direction calculating unit continues to calculate the movement speed and moving direction of the indication end;
The indication-intent interpretation unit analyzes the motion trajectory of the indication end between the starting indication position and the ending indication position: if the indication end is detected to move along an approximately straight line from left to right or from right to left, the user's indication intent is a text indication, and the indication region is the image-block region occupied by the line of text above the motion trajectory of the indication-end endpoint; if the indication end is detected to perform a closed or nearly closed movement of approximately circular shape, the user's indication intent is an icon indication, and the indication region is the image-block region contained in the rectangle inscribed in the circular motion trajectory of the indication-end endpoint; if the indication end is detected to perform a closed or nearly closed movement of approximately rectangular shape, the user's indication intent is a graphic-code indication, and the indication region is the image-block region contained within the rectangular motion trajectory of the indication-end endpoint.
Further, the image-block extraction and content-recognition module includes an image-block extraction unit, a text recognition unit, an icon recognition unit and a graphic-code recognition unit;
The image-block extraction unit extracts the image-block bitmap data in the user indication region interpreted by the motion-trajectory analysis module and, according to the user indication intent interpreted by the motion-trajectory analysis module, calls the corresponding unit among the text recognition unit, the icon recognition unit and the graphic-code recognition unit to recognize the content information contained in the image-block bitmap data;
The text recognition unit performs content recognition on the image-block bitmap data with OCR text recognition and obtains the text and character-string information contained in it;
The icon recognition unit performs content recognition on the image-block bitmap data with an image-recognition method, using each icon feature template of a preset icon library, and obtains the icon index character string corresponding to the icon in the icon library;
The graphic-code recognition unit applies two-dimensional-code and barcode recognition methods to the image-block bitmap data and obtains the character-string information contained in it.
Compared with the prior art, the beneficial effect of the present invention is as follows: the method for interactively acquiring remote resources from an information carrier surface and outputting them is applicable to the recognition of information in a specified region on the surface of various media, such as print media and electronic display devices; the recognized content information is converted into a search expression, associated multimedia resources are searched for in a specialized knowledge base, and the associated multimedia resources are fed back. The method can not only recognize the content of an indicated region on the surface of all kinds of communication media, but can also retrieve associated multimedia resources from a large-capacity specialized knowledge base and output them as feedback, so its range of application is wide.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention for interactively acquiring remote resources from an information carrier surface and outputting them;
Fig. 2 is a structural schematic diagram of the system for interactively acquiring remote resources from an information carrier surface and outputting them;
Fig. 3 is a structural schematic diagram of the image-block extraction and content-recognition module.
Detailed description of the embodiments
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described here merely illustrate the present invention and are not intended to limit it.
The present invention retrieves network resources for the information on an information carrier and outputs them; the output may take the form of voice playback, display of text or graphic images, hypertext web browsing, video and animation playback, and so on. The information carrier includes paper media and digital media: paper media include books, publications, newspapers, magazines, printed matter and the like, while digital media include web pages, text, hypertext, figures, images, video, animation and the like shown on flat-panel display devices, which may be mobile phones, tablet computers, laptops, liquid-crystal displays, televisions and so on. The network resources include single-media resources such as text, hypertext, graphic images, audio, video, planar animation and three-dimensional simulation scenes, as well as hypermedia composite structures built from these single-media resources around a certain concept or theme.
As shown in Fig. 1, a preferred embodiment of the invention is a method for interactively acquiring remote resources from an information carrier surface and outputting them, including the following steps. Step A: the user terminal shoots and performs target recognition on an indication body entering the field of view of its camera, tracks the motion trajectory and pause positions of the indication-end endpoint on the indication body, and derives the user's indication intent and indication region from the motion trajectory and pause positions; the camera of the user terminal is located above the target information carrier surface and shoots the target information carrier. Step B: the user terminal extracts the image-block bitmap data in the indication region according to the user's indication intent and recognizes the content information contained in the image-block bitmap data with the image-content recognition method. Step C: the user terminal converts the content information into a current classification code or current search terms according to the user's indication intent; if the current search terms have been updated, it generates a search expression and sends the search expression to a remote server, otherwise it returns to step A. Step D: the remote server retrieves the multimedia resources meeting the conditions from a specialized knowledge base by means of the search expression and sends them to the user terminal. Step E: the user terminal outputs the received multimedia resources, then returns to step A and prepares for the next user interaction.
Step A specifically includes the following steps.
Step A1: the user terminal acquires the current video image frame with the camera located above the target information carrier surface.
Step A2: the user terminal reads, one by one, the RGB color data of each pixel of the current video image frame acquired in step A1, calculates from the pre-established indication-body appearance-color Gaussian mixture model the probability that the pixel belongs to that model, and judges whether the pixel matches the appearance color of the indication body.
The indication-body appearance-color Gaussian mixture models are divided into a finger skin-color model and a pen-shaped indicator color model.
The indication-body appearance-color Gaussian mixture model is built on the CrCgCb color space using Gaussian-mixture modeling: several single Gaussian distributions are mixed, and the probability density G(c) of the model is computed as the weighted mixture G(c) = Σ_{j=1..M} α_j · P_j(c, μ_j, Σ_j), where M is the number of single Gaussian distributions contained in the model, α_j is the mixture weight of the probability density function of the j-th single Gaussian distribution, and P_j(c, μ_j, Σ_j) = (2π)^(-3/2) |Σ_j|^(-1/2) · exp(-(1/2)(c - μ_j)^T Σ_j^(-1) (c - μ_j)); T denotes matrix transposition, c = [cr, cg, cb]^T is the three-component CrCgCb color column vector of the pixel to be evaluated, μ is the model expectation and Σ is the model variance; μ = (1/n) Σ_{i=1..n} c_i is the mean vector and Σ = (1/n) Σ_{i=1..n} (c_i - μ)(c_i - μ)^T is the covariance matrix, obtained from the CrCgCb feature column vectors c_i of a number of training sample pixels, where n is the number of training samples.
Each calculating parameter of the finger skin-color model is obtained by training through steps G01-G02. Step G01: finger skin pixel RGB values collected in advance under a variety of illumination conditions, with cameras of different models and from test persons of different genders and ages are used as observation sample values, and maximum-likelihood estimation with the expectation-maximization algorithm determines each calculating parameter of the Gaussian-mixture probability density function of the finger skin-color model. Step G02: before the user applies step A, RGB data of the user's own finger skin color are collected under the user's current working environment; these values serve as new observation sample values, maximum-likelihood estimation is performed again with the expectation-maximization algorithm, each calculating parameter of the Gaussian-mixture probability density function of the finger skin-color model determined in step G01 is re-estimated, and the calculating parameters of the finger skin-color model are updated according to the result of this re-estimation training.
Each calculating parameter of the pen-shaped indicator color model is obtained by training through steps G11-G12. Step G11: pen-shaped indicator appearance-color pixel RGB values collected in advance under a variety of illumination conditions and with cameras of different models are used as observation sample values, and maximum-likelihood estimation with the expectation-maximization algorithm determines each calculating parameter of the Gaussian-mixture probability density function of the pen-shaped indicator color model. Step G12: before the user applies step A, appearance-color RGB data of the pen-shaped indicator are collected under the user's current working environment; these values serve as new observation sample values, maximum-likelihood estimation is performed again with the expectation-maximization algorithm, each calculating parameter of the Gaussian-mixture probability density function of the pen-shaped indicator color model determined in step G11 is re-estimated, and the calculating parameters of the pen-shaped indicator color model are updated according to the result of this re-estimation training.
Step A3: step A2 is repeated until every pixel of the current video image frame has been processed; all pixels matching the indication-body appearance color are then selected to obtain the foreground image of the current video image frame, the foreground image is filtered to remove noise, contour detection based on mathematical morphology is performed, the largest connected region among all detected contours is chosen as the contour of the indication body, and the indication-end endpoint position of the indication body is searched for within that contour.
In step A3 the processing of the indication-body contour is divided into the processing of a finger contour and the processing of a pen-shaped indicator contour.
When the finger contour is processed and the finger fingertip indication position is determined, steps A301-A304 are carried out. Step A301: a template sub-image is defined and an erosion operation is applied to the binarized finger contour image. Step A302: the processed finger contour is projected onto the horizontal and vertical coordinate axes; searching from top to bottom and from left to right, the place where the projection value changes significantly is taken as the rough fingertip position, and a rough search window is constructed centered on this position. Step A303: the fingertip position is predicted with a single-layer FL neural network to determine an accurate search window, computed as Y = [X | f(XW_h + β_h)]W, where X is the input vector, β_h is the bias matrix, W_h is the weight matrix from the input layer to the hidden layer, and W is the pre-trained output weight matrix; in the training strategy the weight matrix is trained on left-to-right and right-to-left horizontal lines, clockwise and counter-clockwise circular shapes, and clockwise and counter-clockwise rectangular shapes. Step A304: fingertip detection is performed by template matching: several fingertip templates are defined, the absolute-value distance between the sub-image to be matched and each fingertip template is computed inside the accurate search window, and the exact fingertip position is obtained.
When the pen-shaped indicator contour is processed and the indication-end endpoint position of the pen-shaped indicator is determined, steps A311-A312 are carried out. Step A311: connectivity processing is applied to the binarized pen-shaped indicator contour image and the center of gravity of the connected graph is computed. Step A312: with the center of gravity of the connected graph as the search center point, the search proceeds from this center in the order upper-left, directly above, upper-right; the Euclidean distance from the search center point to each pixel on the contour image is computed in turn, the largest Euclidean distance value is taken, and the corresponding point on the pen-shaped indicator contour image found along this maximum-distance path is taken as the final position of the pen-shaped indicator's indication-end endpoint.
Step A4: according to the indication-end endpoint position of the indication body in the current video image frame extracted in step A3, motion-trajectory analysis and indication-intent interpretation are performed on the indication-end endpoint; if the user's indication intent and indication region have not yet been determined, the method returns to step A1 and continues.
Step A4 specifically includes the following steps. Step A401: the indication-end endpoint position of the indication body in the current video image frame extracted in step A3 is compared with the indication-end endpoint positions of the indication body in several preceding frames, and the movement speed and direction are calculated. Step A402: the indication-end endpoint movement speed in each frame is analyzed in chronological order; when an obvious pause of the indication-end endpoint is detected for the first time, the coordinates of the pause location are taken as the starting indication position. Step A403: after the starting indication position has been recognized, when another obvious pause in the movement of the indication-end endpoint is detected, the coordinates of that pause location are taken as the ending indication position; if no ending indication position has yet been detected, the method returns to step A1. Step A404: the motion trajectory of the indication-end endpoint between the starting indication position and the ending indication position is analyzed: if the indication-end endpoint is detected to move along an approximately straight line from left to right or from right to left, the user's indication intent is a text indication, and the indication region is the image-block region occupied by the line of text above the motion trajectory of the indication-end endpoint; if the indication-end endpoint is detected to perform a closed or nearly closed movement of approximately circular shape, the user's indication intent is an icon indication, and the indication region is the image-block region contained in the rectangle inscribed in the circular motion trajectory of the indication-end endpoint; if the indication-end endpoint is detected to perform a closed or nearly closed movement of approximately rectangular shape, the user's indication intent is a graphic-code indication, and the indication region is the image-block region contained within the rectangular motion trajectory of the indication-end endpoint.
The user's indication intent determined in step A is therefore a text indication, an icon indication or a graphic-code indication. The image-content recognition method in step B specifically includes:
if the user's indication intent determined in step A is a text indication, OCR text recognition is selected to perform content recognition on the image-block bitmap data and obtain the text and character-string information contained in it;
if the user's indication intent determined in step A is an icon indication, the image-block bitmap data are recognized with an image-recognition method, using each icon feature template of a preset icon library, and the icon index character string corresponding to the icon in the icon library is obtained;
if the user's indication intent determined in step A is a graphic-code indication, two-dimensional-code and barcode recognition methods are applied to the image-block bitmap data to obtain the character-string information contained in it.
The process of generating the search expression in step C specifically includes the following steps.
Step C1: according to the user's indication intent determined in step A, the content information recognized in step B is converted into search condition items; specifically, if the user's indication intent is an icon indication or a graphic-code indication, the content information recognized in step B is used as the current classification code and all current search terms are cleared; if the user's indication intent is a text indication, word segmentation is applied to the content information recognized in step B and the words or phrases extracted from it are used as the current search terms.
Step C2: if the current search terms have been updated, the current classification code and each current search term are combined logically to generate the search expression.
As shown in Fig. 2, the system for obtaining remote resource from information carrier surface interactive mode and exporting, including:Indication body class
Type setup module 210, video capture and target identification module 202, indication body appearance color gauss hybrid models library 211, movement
Trajectory analysis module 203, image block extracts and content identifier module 204, search condition generation module 205, transmission of network module
206, remote resource retrieval module 207, specialized knowledge base 209 and information are shown or playing module 208.
Indication body type setup module 210 is connected with video capture and target identification module 202, for selecting user to work as
Indication body type used in preceding, indication body type include the finger of different ethnic groups and the lip pencil indicant of multiple color, difference
Ethnic group includes white people, yellow, black race and brown race etc..Video capture and target identification module 202 are according to indication body
The currently selected indication body type of type setup module 210 enable corresponding indication body appearance color gauss hybrid models and
Indication end endpoint searching method.
Indication body appearance color gauss hybrid models library 211 includes the finger complexion model and multiple color of different ethnic groups
Lip pencil indicant color model, each model is by video capture and mesh in indication body appearance color gauss hybrid models library
The mark current indication body type according to selected by indication body type setup module 210 of identification module 202 is selected.
Video capture and target identification module 202 include camera, foreground image extraction unit and indication end endpoint location
Unit.The camera, which is located above communications media surface 201, carries out video capture, carries out target to the indication body for entering the visual field
Identification.Each collected current video image frame is read the RGB of each pixel by foreground image extraction unit one by one
Color data calculates whether the pixel belongs to the instruction according to the indication body appearance color gauss hybrid models pre-established
The probability value of external table color gauss hybrid models, and differentiate whether the pixel matches with the appearance color of indication body, institute
It states after each of current video image frame pixel is all disposed, obtains the foreground image of the current video image frame.
Indication end endpoint location unit is filtered the method for being then based on mathematical morphology except making an uproar to the foreground image and carries out profile
Detection chooses profile of the largest connected region in detected whole profiles as indication body, in the wheel of the indication body
The indication end endpoint location of indication body is searched out in exterior feature.
Motion trail analysis module 203 is connected with video capture and target identification module 202, for tracking out indication body
The motion profile of upper indication end endpoint and pause are intended to according to the instruction that the motion profile and pause obtain user and indicate area
Domain.
The image block extraction and content identification module 204 extracts the image block bitmap data within the indicated region according to the user's indication intent, and identifies the content information contained in that image block bitmap data using an image content recognition method.
The search condition generation module 205 is connected to the image block extraction and content identification module 204 and to the network transmission module 206. According to the user's indication intent, it either takes the content information identified by module 204 as the current classification code, or performs word segmentation on it and extracts the words or phrases it contains as the current search terms. If any current search term has been updated, it logically combines the current classification code with each current search term to generate a search expression and passes the search expression to the network transmission module.
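A minimal sketch of how the classification code and search terms might be combined into a search expression is given below; the AND/OR combination rule and the field names are assumptions, since the patent only states that the items are combined logically.

```python
# Assumed combination rule: class code AND (term1 OR term2 OR ...); field names are illustrative.
def build_search_expression(class_code: str, terms: list[str]) -> str:
    """Combine the current classification code with the current search terms."""
    term_clause = " OR ".join(f'"{t}"' for t in terms if t.strip())
    if class_code and term_clause:
        return f"class:{class_code} AND ({term_clause})"
    return f"class:{class_code}" if class_code else term_clause
```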
The network transmission module 206 is connected to the remote resource retrieval module 207 and to the information display or playback module 208. It sends the search expression to the remote resource retrieval module 207 over a wired or wireless network, and it also forwards the multimedia resources retrieved by the remote resource retrieval module 207 to the information display or playback module 208 for output. The remote resource retrieval module 207 is connected to the specialized knowledge base 209; it retrieves the multimedia resources that satisfy the search expression from the specialized knowledge base 209 and sends them back to the network transmission module.
The specialized knowledge base 209 contains resources such as text, hypertext, audio, video, animation, and three-dimensional simulation; every resource is annotated at least with the keywords used for retrieval, a classification code, and title information.
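As an illustration only, a knowledge-base entry and a matching predicate could look like the following; the field names and matching rule are assumptions, the patent only requiring that each resource carries retrieval keywords, a classification code, and title information.

```python
# Hypothetical resource record and matching rule for the specialized knowledge base.
from dataclasses import dataclass, field

@dataclass
class Resource:
    title: str
    media_type: str                       # "text", "audio", "video", "animation", "3d", ...
    keywords: set[str] = field(default_factory=set)
    class_code: str = ""
    uri: str = ""

def matches(resource: Resource, class_code: str, terms: list[str]) -> bool:
    """A resource matches when its class code agrees and any search term hits its keywords."""
    if class_code and resource.class_code != class_code:
        return False
    return not terms or any(t in resource.keywords for t in terms)
```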
The indication body appearance color Gaussian mixture models are divided into finger complexion models and lip pencil indicant color models; in actual use, a different color model is selected for target identification and trajectory analysis according to whether the user is using a finger or a lip pencil indicant.
The indication body appearance color Gaussian mixture model is built in the CrCgCb color space with Gaussian mixture model techniques, mixing several single Gaussian distributions; the probability density function G(x) of the model is computed as the weighted mixture

G(x) = Σ_{j=1}^{M} α_j · P_j(c, μ_j, Σ_j)

where M is the number of single Gaussian distributions in the model, α_j is the mixing weight of the probability density function of each single Gaussian distribution, and P_j(c, μ_j, Σ_j) is defined as

P_j(c, μ_j, Σ_j) = (2π)^(-3/2) |Σ_j|^(-1/2) · exp( -(1/2) (c - μ_j)^T Σ_j^(-1) (c - μ_j) )

where T denotes the matrix transpose, c = [c_r, c_g, c_b]^T is the three-component CrCgCb color column vector of the pixel to be evaluated, μ is the model mean, and Σ is the model covariance. μ and Σ are obtained from the CrCgCb feature column vectors c_i of the training sample pixels: μ = (1/n) Σ_{i=1}^{n} c_i is the mean vector, Σ = (1/n) Σ_{i=1}^{n} (c_i − μ)(c_i − μ)^T is the covariance matrix, and n is the number of training samples.
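As a concreteness check, the weighted mixture above can be evaluated directly with a few lines of NumPy; the way the trained parameters are stored (arrays of weights, means, and covariances) is an assumption of this sketch, not something prescribed by the patent.

```python
# Direct evaluation of the weighted Gaussian mixture density G(x) defined above.
import numpy as np

def gmm_density(c: np.ndarray, weights: np.ndarray, means: np.ndarray, covs: np.ndarray) -> float:
    """c: (3,) CrCgCb vector; weights: (M,); means: (M, 3); covs: (M, 3, 3)."""
    total = 0.0
    for a, mu, sigma in zip(weights, means, covs):
        diff = c - mu
        norm = 1.0 / (np.power(2 * np.pi, 1.5) * np.sqrt(np.linalg.det(sigma)))
        total += a * norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)
    return total
```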
The finger complexion model and the lip pencil indicant color model obtain their model parameters by collecting observation samples in advance and training the models on them.
When training the finger complexion model, finger skin pixel RGB values collected in advance under a variety of lighting conditions, with different camera models, and from test subjects of different genders and ages are used as observation sample values, and maximum likelihood estimation is performed with the expectation-maximization (EM) algorithm to determine the parameters of the Gaussian mixture probability density function of the finger complexion model. Before the video capture and target identification module uses the finger complexion model, the user's own finger skin RGB data may also be collected under the current operating environment; these values are used as new observation samples, maximum likelihood estimation is run again with the EM algorithm, and the parameters of the Gaussian mixture probability density function of the finger complexion model are re-estimated.
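A rough sketch of this two-stage training (offline EM estimation followed by per-user re-estimation) is shown below, using scikit-learn's GaussianMixture as the EM implementation; the warm-start parameters and component count are assumptions, not values given in the patent.

```python
# Assumed workflow: offline EM fit, then per-user re-estimation warm-started from it.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_skin_model(offline_samples: np.ndarray, n_components: int = 3) -> GaussianMixture:
    """offline_samples: (n, 3) array of skin-pixel color vectors."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="full", max_iter=200)
    gmm.fit(offline_samples)          # EM maximum-likelihood estimation
    return gmm

def reestimate_for_user(gmm: GaussianMixture, user_samples: np.ndarray) -> GaussianMixture:
    """Re-run EM on the user's own samples, starting from the offline parameters."""
    refined = GaussianMixture(
        n_components=gmm.n_components,
        covariance_type="full",
        weights_init=gmm.weights_,
        means_init=gmm.means_,
        precisions_init=gmm.precisions_,   # warm start from the offline model
        max_iter=100,
    )
    refined.fit(user_samples)
    return refined
```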
When training the lip pencil indicant color model, lip pencil indicant appearance color pixel RGB values collected in advance under a variety of lighting conditions and camera models are used as observation sample values, and maximum likelihood estimation is performed with the EM algorithm to determine the parameters of the Gaussian mixture probability density function of the lip pencil indicant color model. Before the video capture and target identification module uses the lip pencil indicant color model, the indicant's appearance color RGB data may also be collected under the user's current operating environment; these values are used as new observation samples, maximum likelihood estimation is run again with the EM algorithm, and the parameters of the Gaussian mixture probability density function of the lip pencil indicant color model are re-estimated.
The indication end endpoint location unit comprises a finger indication end endpoint location unit and a lip pencil indicant indication end endpoint location unit.
The finger indication end endpoint location unit first defines a template sub-image and applies an erosion operation to the binarized finger contour image. It then projects the processed finger contour onto the horizontal and vertical axes and searches, from top to bottom and from left to right, for the place where the projection value changes significantly; that place is taken as the rough position of the finger fingertip, and a rough search window is built around it. The fingertip position is then predicted with a single-layer FL neural network to determine an accurate search window; the computation is Y = [X | f(XW_h + β_h)]W, where X is the input vector, β_h is the bias matrix, W_h is the weight matrix from the input layer to the hidden layer, and W is a pre-trained weight matrix. The training strategy for the weight matrix uses horizontal strokes from left to right, horizontal strokes from right to left, clockwise circles, counterclockwise circles, clockwise rectangles, and counterclockwise rectangles. Finally, fingertip detection is performed by template matching: several fingertip templates are defined, and the absolute-value distance between the sub-image to be matched and each fingertip template is computed within the accurate search window to obtain the exact fingertip position.
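A much-simplified sketch of this fingertip search is given below: it keeps the projection-based rough localization and the template-matching refinement but omits the FL neural network step; the window size, the "finger points upward" assumption, and the template set are all assumptions of the sketch.

```python
# Simplified fingertip search: erosion, projection-based rough position, template refinement.
import cv2
import numpy as np

def locate_fingertip(finger_mask: np.ndarray, templates: list[np.ndarray], win: int = 40):
    """finger_mask: uint8 {0,255} binary finger image; templates: small uint8 fingertip patches."""
    eroded = cv2.erode(finger_mask, np.ones((3, 3), np.uint8))
    row_proj = eroded.sum(axis=1)
    y0 = int(np.argmax(row_proj > 0))                 # topmost row containing the finger
    xs = np.where(eroded[y0] > 0)[0]
    x0 = int(xs.mean()) if len(xs) else eroded.shape[1] // 2   # column of the topmost pixels
    # Rough search window around the candidate, refined by template matching
    y1, y2 = max(0, y0 - win), min(eroded.shape[0], y0 + win)
    x1, x2 = max(0, x0 - win), min(eroded.shape[1], x0 + win)
    roi = eroded[y1:y2, x1:x2]
    best, best_score = (x0, y0), float("inf")
    for tpl in templates:
        if roi.shape[0] < tpl.shape[0] or roi.shape[1] < tpl.shape[1]:
            continue                                   # templates must fit inside the window
        res = cv2.matchTemplate(roi, tpl, cv2.TM_SQDIFF)   # squared-difference distance
        min_val, _, min_loc, _ = cv2.minMaxLoc(res)
        if min_val < best_score:
            best_score = min_val
            best = (x1 + min_loc[0] + tpl.shape[1] // 2, y1 + min_loc[1] + tpl.shape[0] // 2)
    return best
```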
The lip pencil indicant indication end endpoint location unit first applies connected-component processing to the binarized lip pencil indicant contour image and computes the centroid of the connected region. Taking that centroid as the search center, it scans in the order upper-left, directly above, upper-right, computing in turn the Euclidean distance from the search center to each pixel on the contour; it takes the largest of these Euclidean distances and, following the path of that maximum distance, finds the corresponding point on the lip pencil indicant contour, which is taken as the final position of the lip pencil indicant indication endpoint.
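A minimal sketch of this endpoint search (contour centroid, then the farthest contour point in the upward directions) follows; restricting the scan to points above the centroid is an approximation of the upper-left / directly-above / upper-right search order described in the text.

```python
# Simplified pen-tip search: centroid of the connected region, then farthest contour point above it.
import cv2
import numpy as np

def locate_pen_tip(binary_mask: np.ndarray):
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    contour = max(contours, key=cv2.contourArea)
    m = cv2.moments(contour)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]       # centroid of the connected region
    pts = contour.reshape(-1, 2).astype(np.float64)
    above = pts[pts[:, 1] <= cy]                             # contour points above the centroid
    if len(above) == 0:
        above = pts
    dists = np.linalg.norm(above - np.array([cx, cy]), axis=1)   # Euclidean distances to centroid
    tip = above[int(np.argmax(dists))]                       # farthest point = indication endpoint
    return int(tip[0]), int(tip[1])
```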
The motion trajectory analysis module 203 comprises an endpoint movement speed and direction calculation unit, a starting point positioning unit, an end point positioning unit, and an indication intent understanding unit.
The endpoint movement speed and direction calculation unit compares the indication end endpoint position of the indication body in the current video image frame with its positions in several preceding frames and computes the movement speed and direction.
The starting point positioning unit analyzes the movement speed of the indication end endpoint in each frame in chronological order; when an obvious pause of the indication end is detected for the first time, the coordinates of that pause location are taken as the starting indication position.
After the starting indication position has been recognized, the end point positioning unit takes the coordinates of the next obvious pause of the indication end as the end indication position; if no end indication position has been detected yet, the movement speed and direction calculation unit continues to compute the movement speed and direction of the indication end.
The indication intent understanding unit analyzes the motion trajectory of the indication end between the starting indication position and the end indication position: if the indication end is found to move in an approximately straight line from left to right or from right to left, the user's indication intent is a text indication, and the indicated region is the image block region of the text line directly above the indication end endpoint trajectory; if the indication end is found to make a roughly circular closed (or nearly closed) motion, the user's indication intent is an icon indication, and the indicated region is the image block region enclosed by the rectangle inscribed in the circular trajectory of the indication end endpoint; if the indication end is found to make a roughly rectangular closed (or nearly closed) motion, the user's indication intent is a graphic code indication, and the indicated region is the image block region enclosed by the rectangular trajectory of the indication end endpoint.
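The pause detection and trajectory classification could be sketched as follows; the speed threshold, minimum pause length, and the area-ratio test used to tell a circle from a rectangle are assumptions chosen for illustration.

```python
# Assumed thresholds throughout: pause = sustained low endpoint speed; shape classification
# uses stroke aspect ratio and how much of the bounding box the closed loop fills.
import numpy as np

def find_pauses(points, fps: float = 30.0, speed_thresh: float = 15.0, min_frames: int = 8):
    """points: list of (x, y) endpoint positions per frame. Returns indices where a pause ends."""
    pts = np.asarray(points, dtype=np.float64)
    speed = np.linalg.norm(np.diff(pts, axis=0), axis=1) * fps   # pixels per second
    pauses, run = [], 0
    for i, slow in enumerate(speed < speed_thresh):
        run = run + 1 if slow else 0
        if run == min_frames:
            pauses.append(i)
    return pauses

def classify_segment(points) -> str:
    """Classify the trajectory between start and end pauses as 'text', 'icon', or 'code'."""
    pts = np.asarray(points, dtype=np.float64)
    width, height = np.ptp(pts[:, 0]), np.ptp(pts[:, 1])
    closed = np.linalg.norm(pts[0] - pts[-1]) < 0.2 * max(width, height, 1.0)
    if not closed and width > 3 * height:
        return "text"                                  # near-horizontal straight stroke
    # Shoelace area vs. bounding-box area: a rectangle fills ~1.0, a circle only ~pi/4
    area = 0.5 * abs(np.dot(pts[:, 0], np.roll(pts[:, 1], 1)) -
                     np.dot(pts[:, 1], np.roll(pts[:, 0], 1)))
    fill = area / max(width * height, 1.0)
    return "code" if fill > 0.85 else "icon"
```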
As shown in Fig. 3, the image block extraction and content identification module 204 comprises an image block extraction unit 301, a text recognition unit 302, an icon recognition unit 303, and a graphic code recognition unit 304.
The image block extraction unit 301 extracts the image block bitmap data inside the indicated region determined by the motion trajectory analysis module 203 and, according to the user's indication intent determined by that module, calls the corresponding one of the text recognition unit 302, the icon recognition unit 303, and the graphic code recognition unit 304 to identify the content information contained in the image block bitmap data. When the user's indication intent is a text indication, the text recognition unit 302 is called; when it is an icon indication, the icon recognition unit 303 is called; when it is a graphic code indication, the graphic code recognition unit 304 is called.
The text recognition unit 302 performs content recognition on the image block bitmap data with an OCR text recognition method and obtains the text and character string information it contains. The icon recognition unit 303 uses the icon feature templates in a preset icon library and an image recognition method to perform content recognition on the image block bitmap data, obtaining the icon index character string of the corresponding entry in the icon library. The graphic code recognition unit 304 applies two-dimensional code and barcode recognition methods to the image block bitmap data and obtains the character string information it contains.
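An illustrative dispatch of the three recognition paths is sketched below; pytesseract and pyzbar are assumed stand-ins for the OCR and graphic-code recognizers, and the icon library is assumed to map index strings to small grayscale templates no larger than the image block.

```python
# Hypothetical dispatch: OCR for text, template matching for icons, QR/barcode decoding for codes.
import cv2
import numpy as np
import pytesseract           # assumed OCR backend
from pyzbar import pyzbar    # assumed QR/barcode decoder

def recognize_block(block_bgr: np.ndarray, intent: str,
                    icon_library: dict[str, np.ndarray]) -> str:
    if intent == "text":
        return pytesseract.image_to_string(block_bgr).strip()
    if intent == "icon":
        gray = cv2.cvtColor(block_bgr, cv2.COLOR_BGR2GRAY)
        scores = {name: cv2.matchTemplate(gray, tpl, cv2.TM_CCOEFF_NORMED).max()
                  for name, tpl in icon_library.items()}      # templates must fit in the block
        return max(scores, key=scores.get)                    # index string of best-matching icon
    if intent == "code":
        decoded = pyzbar.decode(block_bgr)                    # handles QR codes and 1-D barcodes
        return decoded[0].data.decode("utf-8") if decoded else ""
    raise ValueError(f"unknown indication intent: {intent}")
```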
A point-and-read product built with this system for interactively obtaining and outputting remote resources from an information carrier surface is easy to carry, has learning content that is easy to expand, has a wide application space, and supports autonomous learning anytime and anywhere.
Concrete applications of the method for interactively obtaining and outputting remote resources from an information carrier surface are now illustrated.
Embodiment 1: a smartphone-assisted system for finding network resources associated with books and for autonomous learning
The system consists of smartphone client software and a remote resource management server. The phone and the server hosting the specialized knowledge base communicate over a Wi-Fi or 3G/4G mobile network: the information category and text obtained through the user's finger indications are uploaded, and the educational resources retrieved from the specialized knowledge base are downloaded.
The autonomous learning client software is designed for smartphone operating systems such as Android and iOS. The phone camera shoots the current page of a printed medium such as a book while the finger on the page is tracked and its trajectory analyzed; the finger movement intent is understood, the type of the indicated media information is judged (text indication, icon indication, or graphic code indication), the region indicated by the finger is located, and the image block bitmap data of that region is extracted. A lip pencil indicant can of course be used instead of the finger. According to the information type of the indicated region: for icon and graphic code indications, the information sequence string contained in the extracted image block bitmap data is identified and used as the current classification code; for text indications, text recognition is performed on the extracted image block bitmap data, the contained text is identified, and the words in it are extracted as the current search terms. Using the phone's Wi-Fi or 3G/4G communication network, the obtained current classification code and current search terms are composed into a search expression and sent to the remote server, where digital resources of various formats are retrieved from the specialized knowledge base. The retrieved digital resources are sent back over the wireless network and organized, displayed, or played on the phone's LCD screen.
In actual use, slight shaking is unavoidable when the user holds the phone and shoots the media surface, so the picture area of the video frames captured by the phone camera moves slightly and the background is not stable. Therefore, when tracking the finger and extracting its trajectory, skin color modeling and recognition are used to extract a foreground image containing the hand: the Gaussian mixture probability density function of the finger complexion model decides whether each pixel of the frame matches the finger skin color, a contour detection method based on mathematical morphology extracts the finger contour shape and removes the influence of ordinary lighting and shadows, a template sub-image is defined and connected-component processing is applied to the obtained finger contour, the rough finger position is predicted by projection analysis, a search window is determined with the prediction model, and finally the exact fingertip position is detected inside the search window by template matching. Once the exact fingertip position has been extracted, gesture understanding and trajectory tracking are performed, the information region indicated by the user is located, the text or classification information contained in that region is identified, and a search expression is formed and transferred through the wireless network module to the specialized knowledge base, where the associated educational resources are retrieved; finally the matching educational resources are downloaded to the phone and organized and displayed on its screen.
Embodiment 2: a learning system for primary and secondary school students based on an intelligent learning LED desk lamp
The system consists of an intelligent learning LED desk lamp and a remote teaching resource server, which communicate over a Wi-Fi or 3G/4G mobile network: the information category and text obtained through the user's finger indications are uploaded, and the educational resources retrieved from the network knowledge base are downloaded. The physical structure of the intelligent learning LED desk lamp is based on a common LED desk lamp: a camera is mounted on the upper panel of the stand near the LED module, a liquid crystal display is embedded in the base surface, and embedded CPU, memory, flash storage, wireless network, and video capture modules are integrated inside. The LED lamp provides a light source at night. In use, the student places a book below the lamp and can adjust the relative position of the book and the camera with the keys by the liquid crystal display, so that the page currently of interest lies within the camera's field of view.
In the actual design, an ARM11 is selected as the embedded microprocessor and Eclipse IDE for C/C++ Developers is used as the development tool. Embedded software is designed to support video capture, target tracking, gesture recognition, trajectory extraction, image-based text recognition, image-based icon recognition, image-based graphic code recognition, wireless network transmission, multimedia information display, and other functions, realizing the network-based autonomous learning system for primary and secondary school students based on the intelligent learning LED desk lamp.
Embodiment 3: a professional knowledge query system based on smart glasses and finger interaction
The system consists of smart glasses and a remote knowledge server, which communicate over a wireless mobile network such as Wi-Fi or 3G/4G: the information category and text obtained through the user's finger indications are uploaded, and the associated resources retrieved from the network specialized knowledge base are downloaded. The physical structure of the smart glasses builds on common glasses: a miniature camera is mounted on the frame facing outward, a flexible ultra-thin liquid crystal display is embedded on the inner surface of the lens, and embedded CPU, memory, flash storage, wireless network, and video capture modules are integrated inside. In use, printed material such as a book or a flat display device is placed in front of the smart glasses, and the head position is adjusted so that the written information the user is currently interested in lies within the camera's field of view.
The embedded software on the smart glasses is designed with the Google Glass Mirror API development interface under the Android operating system. It calls the camera on the frame to shoot the current page of the printed medium, or the surface of the flat display device, entering the field of view, while tracking the finger and analyzing its trajectory; it understands the finger movement intent, judges the type of the indicated media information, locates the region indicated by the finger, extracts the image block bitmap data of that region, and identifies the text information or label classification information in it. Using the Wi-Fi network communication module embedded in the smart glasses, the identified classification information sequence string or text is sent to the remote knowledge server, where digital resources of various formats are retrieved from the specialized knowledge base. The retrieved digital resources are sent back over the wireless network and organized, displayed, or played on the liquid crystal display of the smart glasses.
The method of interactively obtaining and outputting remote resources from an information carrier surface can serve as a revolutionary technology that upgrades the point-and-read machines and reading pens currently in wide use in early language education. The method is simple to use, low in cost, large in information scale, convenient to expand, and widely applicable. Beyond language learning, the technical method proposed by the present invention can also be applied to learning activities in other subjects: after the corresponding network knowledge bases are expanded, students of any grade, and even university students, can browse textbooks or materials in mathematics, physics, chemistry, history, geography, and other subjects while obtaining, through natural interaction and network transmission, the learning resources related to the difficult problems they currently encounter, thus learning autonomously. The method can also be applied in people's daily work and life, allowing difficult problems encountered in printed materials or on flat display devices to be solved promptly through internet retrieval.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (10)
1. A method for interactively obtaining and outputting remote resources from an information carrier surface, characterized by comprising the following steps:
Step A: the user terminal performs target identification on an indication body entering its camera field of view, tracks the motion trajectory and pause positions of the indication end endpoint on the indication body, and obtains the user's indication intent and the indicated region from the motion trajectory and pause positions; the camera of the user terminal is located above the target information carrier surface and shoots the target information carrier;
Step B: the user terminal, according to the user's indication intent, extracts the image block bitmap data within the indicated region and identifies the content information contained in the image block bitmap data;
Step C: the user terminal, according to the user's indication intent, converts the content information into a current classification code or current search terms; if the current search terms have been updated, it generates a search expression and sends the search expression to a remote server, otherwise it returns to step A;
Step D: the remote server retrieves, with the search expression, the multimedia resources that satisfy the conditions from a specialized knowledge base and sends them to the user terminal;
Step E: the user terminal outputs the received multimedia resources and then returns to step A to prepare for the next user interaction;
wherein the image content recognition method in step B specifically comprises:
according to the user's indication intent determined in step A, if it is a text indication, selecting an OCR text recognition method to perform content recognition on the image block bitmap data and obtain the text and character string information contained in it;
according to the user's indication intent determined in step A, if it is an icon indication, then, according to a preset icon library, using the icon feature templates in the icon library and an image recognition method to perform content recognition on the image block bitmap data and obtain the icon index character string of the corresponding entry in the icon library;
according to the user's indication intent determined in step A, if it is a graphic code indication, selecting two-dimensional code and barcode recognition methods respectively to perform content recognition on the image block bitmap data and obtain the character string information contained in it;
wherein the process of generating the search expression in step C specifically comprises the following steps:
Step C1: according to the user's indication intent determined in step A, converting the content information recognized in step B into search condition items, specifically:
if the user's indication intent is an icon indication or a graphic code indication, taking the content information recognized in step B as the current classification code and removing all current search terms;
if the user's indication intent is a text indication, performing word segmentation on the content information recognized in step B and extracting the words or phrases in it as the current search terms;
Step C2: if the current search terms have been updated, logically combining the current classification code with each current search term to generate the search expression.
2. The method according to claim 1, wherein step A specifically comprises the following steps:
Step A1: the camera of the user terminal captures the current video image frame;
Step A2: the user terminal reads, pixel by pixel, the RGB color data of the current video image frame captured in step A1, computes, with the pre-established indication body appearance color Gaussian mixture model, the probability that each pixel belongs to the indication body appearance color Gaussian mixture model, and judges whether the pixel matches the appearance color of the indication body;
Step A3: repeating step A2 until every pixel of the current video image frame has been processed; then selecting all pixels that match the appearance color of the indication body to obtain the foreground image of the current video image frame, filtering and denoising the foreground image, performing contour detection based on mathematical morphology, choosing the largest connected region among all detected contours as the contour of the indication body, and searching within the contour of the indication body for the position of the indication end endpoint;
Step A4: according to the indication end endpoint position of the indication body extracted in step A3 from the current video image frame, performing motion trajectory analysis and indication intent understanding on the indication end endpoint; if the user's indication intent and the indicated region have not yet been determined, returning to step A1 and continuing.
3. The method according to claim 2, characterized in that, in step A2, the indication body appearance color Gaussian mixture model is divided into a finger complexion model and a lip pencil indicant color model;
the indication body appearance color Gaussian mixture model is built in the CrCgCb color space with Gaussian mixture model techniques, mixing several single Gaussian distributions, and the probability density function G(x) of the model is computed as the weighted mixture

G(x) = Σ_{j=1}^{M} α_j · P_j(c, μ_j, Σ_j)

where M is the number of single Gaussian distributions in the model, α_j is the mixing weight of the probability density function of each single Gaussian distribution, and P_j(c, μ_j, Σ_j) is defined as

P_j(c, μ_j, Σ_j) = (2π)^(-3/2) |Σ_j|^(-1/2) · exp( -(1/2) (c - μ_j)^T Σ_j^(-1) (c - μ_j) )

where T denotes the matrix transpose, c = [c_r, c_g, c_b]^T is the three-component CrCgCb color column vector of the pixel to be evaluated, μ is the model mean, Σ is the model covariance, and μ and Σ are obtained from the CrCgCb feature column vectors c_i of the training sample pixels: μ = (1/n) Σ_{i=1}^{n} c_i is the mean vector, Σ = (1/n) Σ_{i=1}^{n} (c_i − μ)(c_i − μ)^T is the covariance matrix, and n is the number of training samples;
the finger complexion model is trained to obtain its model parameters through the following steps G01-G02:
Step G01: using finger skin pixel RGB values collected in advance under a variety of lighting conditions, with different camera models, and from test subjects of different genders and ages as observation sample values, performing maximum likelihood estimation with the expectation-maximization algorithm, and determining the parameters of the Gaussian mixture probability density function of the finger complexion model;
Step G02: before the user uses step A, collecting the user's own finger skin color RGB data under the current operating environment, using these values as new observation sample values, performing maximum likelihood estimation with the expectation-maximization algorithm again, re-estimating the parameters of the Gaussian mixture probability density function of the finger complexion model determined in step G01, and updating the parameters of the finger complexion model according to the result of the re-estimation training;
the lip pencil indicant color model is trained to obtain its model parameters through the following steps G11-G12:
Step G11: using lip pencil indicant appearance color pixel RGB values collected in advance under a variety of lighting conditions and with different camera models as observation sample values, performing maximum likelihood estimation with the expectation-maximization algorithm, and determining the parameters of the Gaussian mixture probability density function of the lip pencil indicant color model;
Step G12: before the user uses step A, collecting the lip pencil indicant appearance color RGB data under the user's current operating environment, using these values as new observation sample values, performing maximum likelihood estimation with the expectation-maximization algorithm again, re-estimating the parameters of the Gaussian mixture probability density function of the lip pencil indicant color model determined in step G11, and updating the parameters of the lip pencil indicant color model according to the result of the re-estimation training.
4. The method according to claim 2, characterized in that the processing of the contour of the indication body in step A3 is divided into finger contour processing and lip pencil indicant contour processing;
when the finger contour is processed and the finger fingertip indication position is determined, the following steps A301-A304 are specifically included:
Step A301: defining a template sub-image and applying an erosion operation to the binarized finger contour image;
Step A302: projecting the processed finger contour onto the horizontal and vertical axes, searching from top to bottom and from left to right for the place where the projection value changes significantly, taking that place as the rough position of the finger fingertip, and building a rough search window centered on it;
Step A303: predicting the finger fingertip position with a single-layer FL neural network to determine an accurate search window, specifically computed as Y = [X | f(XW_h + β_h)]W, where X is the input vector, β_h is the bias matrix, W_h is the weight matrix from the input layer to the hidden layer, and W is a pre-trained weight matrix; the training strategy for the weight matrix uses horizontal strokes from left to right, horizontal strokes from right to left, clockwise circles, counterclockwise circles, clockwise rectangles, and counterclockwise rectangles;
Step A304: performing finger fingertip detection based on template matching, defining several fingertip templates, and computing the absolute-value distance between the sub-image to be matched and each fingertip template within the accurate search window to obtain the exact position of the finger fingertip;
when the lip pencil indicant contour is processed and the lip pencil indicant indication end endpoint position is determined, the following steps A311-A312 are specifically included:
Step A311: applying connected-component processing to the binarized lip pencil indicant contour image and computing the centroid of the connected region;
Step A312: taking the centroid of the connected region as the search center and scanning in the order upper-left, directly above, upper-right, computing in turn the Euclidean distance from the search center to each pixel on the contour, taking the largest of these Euclidean distances, and finding, along the path of that maximum Euclidean distance, the corresponding point on the lip pencil indicant contour as the final position of the lip pencil indicant indication end endpoint.
5. The method according to claim 2, characterized in that step A4 specifically comprises the following steps:
Step A401: obtaining the indication end endpoint position of the indication body extracted by step A3 from the current video image frame, comparing it with the indication end endpoint positions of the indication body in several preceding frames, and computing the movement speed and direction;
Step A402: analyzing the indication end endpoint movement speed in each frame in chronological order; when an obvious pause of the indication end endpoint is detected for the first time, taking the coordinates of the pause location as the starting indication position;
Step A403: after the starting indication position has been recognized, when another obvious pause of the indication end endpoint is detected, taking the coordinates of the pause location as the end indication position; if no end indication position has been detected yet, returning to step A1;
Step A404: analyzing the motion trajectory of the indication end endpoint between the starting indication position and the end indication position: if the indication end endpoint is found to move in an approximately straight line from left to right or from right to left, the user's indication intent is a text indication and the indicated region is the image block region of the text line directly above the indication end endpoint trajectory; if the indication end endpoint is found to make a roughly circular closed or nearly closed motion, the user's indication intent is an icon indication and the indicated region is the image block region enclosed by the rectangle inscribed in the circular trajectory of the indication end endpoint; if the indication end endpoint is found to make a roughly rectangular closed or nearly closed motion, the user's indication intent is a graphic code indication and the indicated region is the image block region enclosed by the rectangular trajectory of the indication end endpoint.
6. A system for interactively obtaining and outputting remote resources from an information carrier surface, characterized by comprising: an indication body type setup module, a video capture and target identification module, an indication body appearance color Gaussian mixture model library, a motion trajectory analysis module, an image block extraction and content identification module, a search condition generation module, a network transmission module, a remote resource retrieval module, a specialized knowledge base, and an information display or playback module;
the indication body type setup module is connected to the video capture and target identification module and is used to select the indication body type currently used by the user, the indication body types including fingers of different ethnic groups and lip pencil indicants of various colors; the video capture and target identification module enables the corresponding indication body appearance color Gaussian mixture model and indication end endpoint search method according to the indication body type currently selected in the indication body type setup module;
the indication body appearance color Gaussian mixture model library contains finger complexion models for different ethnic groups and lip pencil indicant color models for various colors, and each model in the library is selected by the video capture and target identification module according to the indication body type currently selected in the indication body type setup module;
the video capture and target identification module comprises a camera, a foreground image extraction unit, and an indication end endpoint location unit; the camera is located above the target information carrier surface to capture video and performs target identification on any indication body entering its field of view;
for each captured current video image frame, the foreground image extraction unit reads the RGB color data of every pixel, computes, with the pre-established indication body appearance color Gaussian mixture model, the probability that the pixel belongs to the indication body appearance color Gaussian mixture model, and decides whether the pixel matches the appearance color of the indication body; once every pixel of the current video image frame has been processed, the foreground image of the current video image frame is obtained;
the indication end endpoint location unit filters and denoises the foreground image, then performs contour detection based on mathematical morphology, chooses the largest connected region among all detected contours as the contour of the indication body, and searches within the contour of the indication body for the position of the indication end endpoint;
the motion trajectory analysis module is connected to the video capture and target identification module and is used to track the motion trajectory and pauses of the indication end endpoint on the indication body and to obtain the user's indication intent and the indicated region from the motion trajectory and pauses;
the image block extraction and content identification module extracts the image block bitmap data within the indicated region according to the user's indication intent and identifies the content information contained in the image block bitmap data with an image content recognition method;
the search condition generation module is connected to the image block extraction and content identification module and to the network transmission module; according to the user's indication intent, it either takes the content information identified by the image block extraction and content identification module as the current classification code, or performs word segmentation on it and extracts the words or phrases in it as the current search terms; if the current search terms have been updated, it logically combines the current classification code with each current search term to generate a search expression and passes the search expression to the network transmission module;
the network transmission module is connected to the remote resource retrieval module and to the information display or playback module; it sends the search expression to the remote resource retrieval module over a wired or wireless network and is also used to forward the received multimedia resources retrieved by the remote resource retrieval module to the information display or playback module for output;
the remote resource retrieval module is connected to the specialized knowledge base, retrieves the multimedia resources that satisfy the search expression from the specialized knowledge base, and sends them back to the network transmission module;
the specialized knowledge base contains at least text, hypertext, audio, video, animation, and three-dimensional simulation resources, and every resource is annotated at least with the keywords used for retrieval, a classification code, and title information.
7. The system according to claim 6, characterized in that the indication body appearance color Gaussian mixture models are divided into finger complexion models and lip pencil indicant color models;
the indication body appearance color Gaussian mixture model is built in the CrCgCb color space with Gaussian mixture model techniques, mixing several single Gaussian distributions, and the probability density function G(x) of the model is computed as the weighted mixture

G(x) = Σ_{j=1}^{M} α_j · P_j(c, μ_j, Σ_j)

where M is the number of single Gaussian distributions in the model, α_j is the mixing weight of the probability density function of each single Gaussian distribution, and P_j(c, μ_j, Σ_j) is defined as

P_j(c, μ_j, Σ_j) = (2π)^(-3/2) |Σ_j|^(-1/2) · exp( -(1/2) (c - μ_j)^T Σ_j^(-1) (c - μ_j) )

where T denotes the matrix transpose, c = [c_r, c_g, c_b]^T is the three-component CrCgCb color column vector of the pixel to be evaluated, μ is the model mean, Σ is the model covariance, and μ and Σ are obtained from the CrCgCb feature column vectors c_i of the training sample pixels: μ = (1/n) Σ_{i=1}^{n} c_i is the mean vector, Σ = (1/n) Σ_{i=1}^{n} (c_i − μ)(c_i − μ)^T is the covariance matrix, and n is the number of training samples;
the finger complexion model and the lip pencil indicant color model obtain their model parameters by collecting observation samples in advance and training the models on them;
when the finger complexion model is trained, finger skin pixel RGB values collected in advance under a variety of lighting conditions, with different camera models, and from test subjects of different genders and ages are used as observation sample values, and maximum likelihood estimation is performed with the expectation-maximization algorithm to determine the parameters of the Gaussian mixture probability density function of the finger complexion model;
before the video capture and target identification module uses the finger complexion model, the user's own finger skin color RGB data may also be collected under the user's current operating environment; these values are used as new observation sample values, maximum likelihood estimation is performed again with the expectation-maximization algorithm, and the parameters of the Gaussian mixture probability density function of the finger complexion model are re-estimated;
when the lip pencil indicant color model is trained, lip pencil indicant appearance color pixel RGB values collected in advance under a variety of lighting conditions and with different camera models are used as observation sample values, and maximum likelihood estimation is performed with the expectation-maximization algorithm to determine the parameters of the Gaussian mixture probability density function of the lip pencil indicant color model;
before the video capture and target identification module uses the lip pencil indicant color model, the lip pencil indicant appearance color RGB data may also be collected under the user's current operating environment; these values are used as new observation sample values, maximum likelihood estimation is performed again with the expectation-maximization algorithm, and the parameters of the Gaussian mixture probability density function of the lip pencil indicant color model are re-estimated.
8. The system according to claim 6, characterized in that the indication end endpoint location unit comprises a finger indication end endpoint location unit and a lip pencil indicant indication end endpoint location unit;
the finger indication end endpoint location unit first defines a template sub-image and applies an erosion operation to the binarized finger contour image; it then projects the processed finger contour onto the horizontal and vertical axes, searches from top to bottom and from left to right for the place where the projection value changes significantly, takes that place as the rough position of the finger fingertip, and builds a rough search window centered on it; it then predicts the finger fingertip position with a single-layer FL neural network to determine an accurate search window, specifically computed as Y = [X | f(XW_h + β_h)]W, where X is the input vector, β_h is the bias matrix, W_h is the weight matrix from the input layer to the hidden layer, and W is a pre-trained weight matrix; the training strategy for the weight matrix uses horizontal strokes from left to right, horizontal strokes from right to left, clockwise circles, counterclockwise circles, clockwise rectangles, and counterclockwise rectangles; finally, finger fingertip detection is performed based on template matching: several fingertip templates are defined, and the absolute-value distance between the sub-image to be matched and each fingertip template is computed within the accurate search window to obtain the exact position of the finger fingertip;
the lip pencil indicant indication end endpoint location unit first applies connected-component processing to the binarized lip pencil indicant contour image and computes the centroid of the connected region; it then takes the centroid of the connected region as the search center, scans in the order upper-left, directly above, upper-right, computes in turn the Euclidean distance from the search center to each pixel on the contour, takes the largest of these Euclidean distances, and finds, along the path of that maximum Euclidean distance, the corresponding point on the lip pencil indicant contour as the final position of the lip pencil indicant indication endpoint.
9. The system according to claim 6, characterized in that the motion trajectory analysis module comprises an endpoint movement speed and direction calculation unit, a starting point positioning unit, an end point positioning unit, and an indication intent understanding unit;
the endpoint movement speed and direction calculation unit compares the indication end endpoint position of the indication body in the current video image frame with the indication end endpoint positions of the indication body in several preceding frames and computes the movement speed and movement direction;
the starting point positioning unit analyzes the movement speed of the indication end endpoint in each frame in chronological order; when an obvious pause of the indication end is detected for the first time, the coordinates of the pause location are taken as the starting indication position;
after the starting indication position has been recognized, when another obvious pause of the indication end movement is detected, the end point positioning unit takes the coordinates of the pause location as the end indication position; if no end indication position has been detected yet, the movement speed and direction calculation unit continues to compute the movement speed and movement direction of the indication end;
the indication intent understanding unit analyzes the motion trajectory of the indication end between the starting indication position and the end indication position: if the indication end is found to move in an approximately straight line from left to right or from right to left, the user's indication intent is a text indication and the indicated region is the image block region of the text line directly above the indication end endpoint trajectory; if the indication end is found to make a roughly circular closed or nearly closed motion, the user's indication intent is an icon indication and the indicated region is the image block region enclosed by the rectangle inscribed in the circular trajectory of the indication end endpoint; if the indication end is found to make a roughly rectangular closed or nearly closed motion, the user's indication intent is a graphic code indication and the indicated region is the image block region enclosed by the rectangular trajectory of the indication end endpoint.
10. The system according to claim 6, characterized in that the image block extraction and content identification module comprises an image block extraction unit, a text recognition unit, an icon recognition unit, and a graphic code recognition unit;
the image block extraction unit extracts the image block bitmap data within the indicated region determined by the motion trajectory analysis module and, according to the user's indication intent determined by the motion trajectory analysis module, calls the corresponding one of the text recognition unit, the icon recognition unit, and the graphic code recognition unit to identify the content information contained in the image block bitmap data;
the text recognition unit performs content recognition on the image block bitmap data with an OCR text recognition method and obtains the text and character string information contained in it;
the icon recognition unit uses, according to a preset icon library, the icon feature templates in the icon library and an image recognition method to perform content recognition on the image block bitmap data and obtains the icon index character string of the corresponding entry in the icon library;
the graphic code recognition unit selects two-dimensional code and barcode recognition methods respectively to perform content recognition on the image block bitmap data and obtains the character string information contained in it.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410377980.7A CN104199834B (en) | 2014-08-04 | 2014-08-04 | The method and system for obtaining remote resource from information carrier surface interactive mode and exporting |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410377980.7A CN104199834B (en) | 2014-08-04 | 2014-08-04 | The method and system for obtaining remote resource from information carrier surface interactive mode and exporting |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104199834A CN104199834A (en) | 2014-12-10 |
CN104199834B true CN104199834B (en) | 2018-11-27 |
Family
ID=52085127
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410377980.7A Active CN104199834B (en) | 2014-08-04 | 2014-08-04 | The method and system for obtaining remote resource from information carrier surface interactive mode and exporting |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104199834B (en) |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105989344B (en) * | 2015-02-26 | 2019-09-17 | 阿里巴巴集团控股有限公司 | Bar code recognition and device |
CN104618741A (en) * | 2015-03-02 | 2015-05-13 | 浪潮软件集团有限公司 | Information pushing system and method based on video content |
CN104850484B (en) * | 2015-05-22 | 2017-11-24 | 成都千牛信息技术有限公司 | A kind of character terminal interaction mode automatic judging method based on bitmap analysis |
CN105631051A (en) * | 2016-02-29 | 2016-06-01 | 华南理工大学 | Character recognition based mobile augmented reality reading method and reading system thereof |
CN105956092B (en) * | 2016-04-29 | 2019-11-26 | 广东小天才科技有限公司 | A kind of examination question searching method and device applied to electric terminal |
CN108733687A (en) * | 2017-04-18 | 2018-11-02 | 陈伯妤 | A kind of information retrieval method and system based on Text region |
CN107273895B (en) * | 2017-06-15 | 2020-07-14 | 幻视互动(北京)科技有限公司 | Method for recognizing and translating real-time text of video stream of head-mounted intelligent device |
KR102048674B1 (en) * | 2017-07-31 | 2019-11-26 | 코닉오토메이션 주식회사 | Lighting stand type multimedia device |
CN107465910A (en) * | 2017-08-17 | 2017-12-12 | 康佳集团股份有限公司 | A kind of combination AR glasses carry out the method and system of AR information real time propelling movements |
CN107748744B (en) * | 2017-10-31 | 2021-01-26 | 广东小天才科技有限公司 | Method and device for establishing drawing box knowledge base |
CN107844552A (en) * | 2017-10-31 | 2018-03-27 | 广东小天才科技有限公司 | One kind sketches the contours frame knowledge base content providing and device |
CN107831896B (en) * | 2017-11-07 | 2021-06-25 | Oppo广东移动通信有限公司 | Audio information playing method and device, storage medium and electronic equipment |
CN108052581A (en) * | 2017-12-08 | 2018-05-18 | 四川金英科技有限责任公司 | A kind of case video studies and judges device |
CN108536287B (en) * | 2018-03-26 | 2021-03-02 | 深圳市同维通信技术有限公司 | Method and device for reading according to user instruction |
CN108897778B (en) * | 2018-06-04 | 2021-12-31 | 创意信息技术股份有限公司 | Image annotation method based on multi-source big data analysis |
CN109033455A (en) * | 2018-08-27 | 2018-12-18 | 深圳艺达文化传媒有限公司 | The camera lens labeling method and Related product of promotion video |
CN109886266A (en) * | 2019-01-25 | 2019-06-14 | 邹玉平 | Method, relevant apparatus and the system of image procossing |
CN109947273B (en) * | 2019-03-25 | 2022-04-05 | 广东小天才科技有限公司 | Point reading positioning method and device |
CN112307867A (en) * | 2020-03-03 | 2021-02-02 | 北京字节跳动网络技术有限公司 | Method and apparatus for outputting information |
CN112001380B (en) * | 2020-07-13 | 2024-03-26 | 上海翎腾智能科技有限公司 | Recognition method and system for Chinese meaning phrase based on artificial intelligence reality scene |
CN114187605B (en) * | 2021-12-13 | 2023-02-28 | 苏州方兴信息技术有限公司 | Data integration method and device and readable storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101324922A (en) * | 2008-07-30 | 2008-12-17 | 北京中星微电子有限公司 | Method and apparatus for acquiring fingertip track |
CN202600994U (en) * | 2012-05-31 | 2012-12-12 | 刘建生 | Camera type point reading machine and pen head module thereof |
CN103236195A (en) * | 2013-04-22 | 2013-08-07 | 中山大学 | On-line touch-and-talk pen system and touch reading method thereof |
CN103763453A (en) * | 2013-01-25 | 2014-04-30 | 陈旭 | Image and text collection and recognition device |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101324922A (en) * | 2008-07-30 | 2008-12-17 | 北京中星微电子有限公司 | Method and apparatus for acquiring fingertip track |
CN202600994U (en) * | 2012-05-31 | 2012-12-12 | 刘建生 | Camera type point reading machine and pen head module thereof |
CN103763453A (en) * | 2013-01-25 | 2014-04-30 | 陈旭 | Image and text collection and recognition device |
CN103236195A (en) * | 2013-04-22 | 2013-08-07 | 中山大学 | On-line touch-and-talk pen system and touch reading method thereof |
Also Published As
Publication number | Publication date |
---|---|
CN104199834A (en) | 2014-12-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104199834B (en) | The method and system for obtaining remote resource from information carrier surface interactive mode and exporting | |
CN110750959B (en) | Text information processing method, model training method and related device | |
US8793118B2 (en) | Adaptive multimodal communication assist system | |
CN111582241B (en) | Video subtitle recognition method, device, equipment and storage medium | |
CN104463101B (en) | Answer recognition methods and system for character property examination question | |
CN105718878A (en) | Egocentric vision in-the-air hand-writing and in-the-air interaction method based on cascade convolution nerve network | |
CN107076567A (en) | Multilingual image question and answer | |
CN109614944A (en) | A kind of method for identifying mathematical formula, device, equipment and readable storage medium storing program for executing | |
CN109885595A (en) | Course recommended method, device, equipment and storage medium based on artificial intelligence | |
CN111767883B (en) | Question correction method and device | |
CN104111733B (en) | A kind of gesture recognition system and method | |
EP3172681A1 (en) | Identifying presentation styles of educational videos | |
CN110275987A (en) | Intelligent tutoring consultant generation method, system, equipment and storage medium | |
CN108509567B (en) | Method and device for building digital culture content library | |
CN106537387A (en) | Retrieving/storing images associated with events | |
CN104008088B (en) | A kind of reading method and device helped based on screen display | |
CN111738177B (en) | Student classroom behavior identification method based on attitude information extraction | |
Ouali et al. | Real-time application for recognition and visualization of arabic words with vowels based dl and ar | |
CN109710751A (en) | Intelligent recommendation method, apparatus, equipment and the storage medium of legal document | |
CN110781734B (en) | Child cognitive game system based on paper-pen interaction | |
CN113822137A (en) | Data annotation method, device and equipment and computer readable storage medium | |
Rai et al. | MyOcrTool: visualization system for generating associative images of Chinese characters in smart devices | |
EP4336379A1 (en) | Tracking concepts within content in content management systems and adaptive learning systems | |
Nurzyćska et al. | Interactive system for Polish signed language learning | |
Alfarisy et al. | Kanji Flashcards and Apps Based on Augmented Reality |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |