CN110020668B - Canteen self-service pricing method based on bag-of-words model and adaboost - Google Patents


Info

Publication number
CN110020668B
CN110020668B (application CN201910155376.2A)
Authority
CN
China
Prior art keywords
image
tray
images
training
visual
Prior art date
Legal status: Active
Application number
CN201910155376.2A
Other languages
Chinese (zh)
Other versions
CN110020668A (en)
Inventor
盛庆华
郭晨洁
李竹
王韵涛
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201910155376.2A
Publication of CN110020668A
Application granted
Publication of CN110020668B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0283 Price estimation or determination

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Finance (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Accounting & Taxation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a canteen self-service pricing method based on a bag-of-words model and adaboost, comprising the following steps. Step S1: the image acquisition device 9 acquires an image of the settlement area every second. Step S2: the PC 8 preprocesses the acquired image, extracts feature points, constructs a visual dictionary and performs adaboost recognition. Step S3: the settlement terminal device 10 calculates the total price of the dishes and displays the payment information for the diner. With the technical scheme of the invention, a bag-of-words model is established, the SIFT algorithm extracts key features from the block images, and the final visual dictionary is constructed by weight-layered k-means clustering. Thanks to the strong distinctiveness of the extracted features, objects can be distinguished from one another to the greatest extent and can still be detected and identified even under very complicated conditions. The images are trained with an adaboost-based classifier to obtain a preset training library, which markedly improves the learning accuracy.

Description

Canteen self-service pricing method based on bag-of-words model and adaboost
Technical Field
The invention relates to the technical field of image recognition, in particular to a canteen self-service pricing method based on a bag-of-words model and adaboost.
Background
Factories, companies, schools and similar institutions generally solve the daily dining problem of their employees and students with a self-service canteen. These canteens basically adopt a self-selection, card-swiping settlement mode, which saves some labour cost. However, as people's quality of life improves, the variety of dishes in a canteen gradually increases, which places great strain on the settlement staff. Particularly at peak hours, long queues often form, and settlement staff frequently make mistakes owing to the heavy flow of people and the many kinds of dishes, causing unnecessary economic disputes and some economic loss to the canteen.
Image recognition is one of the hot topics in computer vision: computers process, analyse and understand images to identify targets in various situations. With the development of science and technology, the construction of intelligent canteens has become particularly important, and dish recognition, a key part of building an intelligent canteen, is receiving more and more attention. In practical applications, however, there are complex scenes that general image recognition cannot handle effectively, such as under-exposure, strong light-dark contrast, weak or small targets, or occluded targets. This product therefore processes the image with a bag-of-words model: thanks to the strong distinctiveness of the extracted features, objects can be distinguished from one another to the greatest extent and can still be detected and identified even under very complicated conditions. An adaboost classifier that combines weak classifiers into a strong classifier is adopted; compared with common plate-recognition methods such as pattern recognition, TCS230 colour recognition, HSV-space detection and shape detection, it is more accurate and gives good classification and recognition results.
Therefore, aiming at the above technical problems in the prior art, the invention seeks to solve the problems of long queuing times and high settlement error rates in canteens.
Disclosure of Invention
Aiming at the shortcomings of traditional image recognition technology, the invention provides a canteen self-service pricing method based on a bag-of-words model and adaboost, which automatically photographs the tray loaded with the dishes selected by the diner and transmits the captured images to a PC for connected-domain blocking. The SIFT algorithm extracts key features from the segmented images to build a local feature library of the image library; weight-layered k-means clustering completes the mapping between features and visual words, establishes the description of the images, and constructs the final visual dictionary. For a given image, the distances between local features and visual words are computed, and the occurrence frequency of the visual word closest to each local feature is counted, so that a bag-of-words histogram of visual-word frequencies can represent the image. The images are then trained with an adaboost-based classifier to obtain a preset training library, and finally the learned model classifies test images to effectively complete the identification of the dishes on the tray. The total price of the dishes chosen by the diner is calculated from the identified dish information, and settlement can be completed by IC card, WeChat Pay, Alipay and other payment methods, realising a fully self-service settlement process for the diner.
In order to achieve the purpose, the invention provides a canteen self-service pricing method based on a bag-of-words model and adaboost.
The image acquisition device acquires an image of the settlement area every second and transmits it to the PC for storage.

The PC further comprises a settlement judging device, a connected-domain marking model, a bag-of-words model and a dish identification model; the images undergo dish identification and pricing and the result is then transmitted to the settlement terminal device.
Preferably, the settlement judging device identifies whether a tray has entered the settlement area to wait for settlement and, if so, performs the processing and identification operations on the image.
The connected domain marking model marks the dinner plate on the tray, and divides the tray image collected by the image collecting device into blocks, so that the introduction of redundant (background) information is greatly reduced, and the value of useful information provided by the image is improved.
The bag-of-words model is used for extracting the characteristics of the images, constructing visual dictionaries of all the images and finishing the adaboost classifier training.
The dish identification model identifies the type and the quantity of the dishes by using an adaboost algorithm-based classifier, and transmits data to the settlement terminal device.
The settlement terminal device calculates the total price of the dishes taken on the current tray according to the correspondence between the dishes identified by the PC and their prices, and finally presents the payment information on a display screen.
Preferably, in the settlement judging device, the PC computes the background difference between the currently received picture and the background picture originally stored on the PC to obtain a difference image. Considering the noise of the external environment, a reasonable difference threshold is set; when the difference exceeds the threshold, a tray has entered the settlement area.
Preferably, after background differencing indicates that a tray has entered the settlement area, the settlement judging device determines by the optical flow method whether the tray is stationary; if it is, the subsequent operations such as connected-domain marking and bag-of-words model construction are carried out.
Preferably, the connected-domain marking model takes the tray image transmitted from the image acquisition device to the PC, converts it into a difference image and then into a binary image, and uses a contour search algorithm to mark the white pixels in the binary image so that each individual connected domain forms an identified block, namely the plates of dishes on the tray are marked; the marks are then made on the original image to obtain the connected-domain block model.
Preferably, the bag-of-words model extracts visual words from the connected-domain block images using the scale-invariant feature transform (SIFT). All the extracted visual words are then gathered together, and a visual dictionary is constructed by weight-layered k-means clustering: Laplacian spectral-structure features and SIFT local features are respectively extracted from the N block images of the training set and clustered, the mapping between features and visual words is completed, and visual dictionaries $C_{LK}$ and $C_{SK}$ with a more complete description of the image information are obtained. The centre of each cluster is defined as a visual word, i.e. a "word" of the image, and the collection of all visual words is the visual vocabulary. A visual dictionary is built from the visual words, the mapping between features and visual words is completed, the description of the image is established, and the final visual dictionary is constructed.
Preferably, the bag-of-words model extracts visual words from the connected-domain block images using the scale-invariant feature transform (SIFT). All the extracted visual words are then gathered together to obtain a large corpus of local features of the image library. A visual dictionary is constructed by weight-layered k-means clustering: Laplacian spectral-structure features and SIFT local features are respectively extracted from the N block images of the training set and clustered; the centre of each cluster is defined as a visual word, i.e. a "word" of the image, and the set of all visual words is defined as the visual vocabulary. The mapping between features and visual words is completed to obtain the visual dictionaries $C_{LK}$ and $C_{SK}$ with a more complete description of the image information. Finally, weight-value assignment is performed on the two parent visual dictionaries to balance the roles of the two image features in the image classification process, yielding the total visual dictionary.
Preferably, after feature extraction and visual dictionary construction, the bag-of-words model computes, for a given image, the distance between each local feature and the visual words and counts the occurrence frequency of the visual word closest to each local feature. An image containing a large amount of high-dimensional local-feature data is thereby converted into a vector of visual-word counts, and the resulting bag-of-words histogram of visual-word frequencies represents the image. Once the visual information of the image is described by this local-feature distribution histogram, a classifier is constructed and trained; the adaboost classifier is used for training to complete image classification. A sketch of the histogram step is given below.
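As an illustration of this frequency-statistics step, here is a minimal NumPy sketch (the function name and the normalisation choice are ours, not from the patent) that assigns each local feature to its nearest visual word and accumulates the bag-of-words histogram:

```python
import numpy as np

def bow_histogram(descriptors, dictionary):
    """Quantise an image's local features against the visual dictionary and
    count how often each visual word is the nearest one; the histogram of
    these counts is the bag-of-words representation of the image.

    descriptors: (n_features, d) array of local features of one image
    dictionary:  (n_words, d) array of visual-word centres
    """
    # squared Euclidean distance from every feature to every visual word
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # index of the closest visual word per feature
    hist = np.bincount(nearest, minlength=len(dictionary)).astype(float)
    return hist / max(hist.sum(), 1.0)  # normalised visual-word frequencies
```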
Preferably, the dish identification model performs classification and identification by using a trained adaboost classifier to obtain the type and quantity of the dishes, and transmits data to the settlement terminal device.
Preferably, the settlement terminal device first obtains the number of dishes from the number of connected domains in the identified tray image, calculates the total price of the dishes on the plates from the unit prices and quantities according to the correspondence between the dishes identified by the PC and their prices, and finally displays the quantity, unit prices and total price of the dishes on the tray together with the available payment methods for the diner to pay.
Compared with the prior art, the invention has the following beneficial effects:
aiming at the complex scene of a dining hall, images of the tray carrying the diner's selected dishes are captured automatically, and the plate image is partitioned by connected-domain marking, which greatly reduces the introduction of redundant (background) information and increases the value of the useful information the image provides;
in the bag-of-words model, the classic SIFT algorithm is used to extract feature points; the feature descriptors are invariant to scaling and rotation, and stable feature points are detected in scale space, so the influence of illumination, viewing angle, scale and affine transformations can be resisted to a certain extent, with good noise robustness. A weight-layered k-means method is used to construct the visual dictionary: different weights are assigned according to the distances between image features and visual words, and the weighted sums serve as the histogram representation of the image over the visual-word library, which effectively improves classification performance. Compared with the instability, inaccuracy and high computational overhead of plain k-means clustering, the computational complexity is reduced and efficiency improved;
a classifier based on the adaboost algorithm is adopted; its training samples are generated by weighted random sampling with replacement, each sample maintains a weight, and the larger the weight, the higher the probability that it is drawn as a training sample. A model that meets the requirements can therefore be trained by changing the composition of the sample set, and the learning accuracy is improved markedly.
Drawings
FIG. 1 is a structural block diagram of the method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the hardware structure according to an embodiment of the present invention.
FIG. 3 is a diagram of connected-domain marking according to an embodiment of the present invention.
FIG. 4 is a basic flowchart of the bag-of-words model according to an embodiment of the present invention.
FIG. 5 is a construction diagram of the weight-layered k-means clustering visual dictionary according to an embodiment of the present invention.
FIG. 6 is a basic flowchart of the adaboost classifier algorithm according to an embodiment of the present invention.
FIG. 7 is a flowchart of the image acquisition process according to an embodiment of the present invention.
The following specific embodiments will further illustrate the invention in conjunction with the above-described figures.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit it in any way. It should be noted that persons skilled in the art can make variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the present invention.
As shown in fig. 1, a canteen self-service pricing method based on a bag-of-words model and adaboost involves: an image acquisition device 9, a PC 8 and a settlement terminal device 10, where the PC 8 further comprises a settlement judging device 11, a connected-domain marking model 12, a bag-of-words model 13 and a dish identification model 14. The image acquisition device 9 acquires the image of the settlement area at the current moment and transmits the acquired plate image to the PC 8. The settlement judging device 11 identifies whether a tray has entered the settlement area to wait for settlement; if so, the image is processed and identified. The connected-domain marking model 12 marks the plates of dishes on the tray and then marks the original image, i.e. it divides the collected tray image into blocks. The bag-of-words model 13 extracts and integrates the visual dictionaries of all images, and constructs and trains an adaboost classifier to complete image classification. The dish identification model 14 identifies the type and number of dishes with the adaboost-based classifier and transmits the data to the settlement terminal device 10.
In a preferred embodiment, as shown in fig. 2, the hardware setup of the canteen self-service pricing method based on the bag-of-words model and adaboost includes: a workbench 1, a waiting area 2, a settlement area 3, a display screen 4, a card-swiping area 5, a camera device 6, a camera support 7 and a PC 8. The workbench 1 serves as the carrier of the other components; the card-swiping area 5, the settlement area 3 and the waiting area 2 are arranged on the workbench 1 from left to right in sequence; the camera device 6, mounted on the camera support 7, is arranged directly above the settlement area 3; and the display screen 4 is arranged directly in front of the card-swiping area 5. The camera device 6 transmits the collected tray images to the PC 8, and the PC 8 transmits the information to be displayed to the display screen 4.
As shown in fig. 7, the present invention includes 3 major steps. Step S1: the image acquisition device 9 acquires images of the settlement area. Step S2: the PC 8 preprocesses the acquired image, extracts feature points, constructs a visual dictionary and performs adaboost recognition. Step S3: the settlement terminal device 10 calculates the total price of the dishes and displays it for the diner to pay. The dish identification process is explained in detail as follows:
step S1: the image acquisition device acquires the current settlement area image every 1 second and transmits the acquired settlement area image to the PC.
Step S2: the PC 8 preprocesses the acquired image, extracts feature points, constructs a visual dictionary and performs adaboost recognition. This step further includes:
step S21: and carrying out differential operation on the input image received at the current time t and a background image which is stored in a settlement area of a PC in advance, judging whether a tray enters the settlement area or not, and setting a reasonable differential threshold value in consideration of the noise of the external environment. Dividing the acquired tray image into n x n pixels fk(x, y), performing Gaussian distribution modeling on each pixel, calculating the Gaussian scale space L and the scale space factor sigma of each pixel, presetting a reasonable difference threshold Th for each pixel, and presetting the pixel value of the current image and the pixel value B of the background imagekWhen the difference value of (x, y) exceeds the threshold value, | fk(x,y)-Bk(x,y)|>Th, judging that a tray enters the settlement area, and determining that the pixel exceeding the threshold value is the tray entering the settlement area.
Step S22: when the tray is detected to have entered the settlement area, whether it is stationary is judged by the optical flow method. Assuming the flow (dx, dy, dz) is constant within a small window of size m × m × m (m < 1), then for the pixels 1, …, n (n = m × m × m) the intensity derivatives $I_{x_i}, I_{y_i}, I_{z_i}$ satisfy:

$$\begin{cases}I_{x_1}\,dx + I_{y_1}\,dy + I_{z_1}\,dz = -I_{t_1}\\\quad\vdots\\I_{x_n}\,dx + I_{y_n}\,dy + I_{z_n}\,dz = -I_{t_n}\end{cases}$$

Because this system of equations is overdetermined, i.e. it has redundancy, it can be expressed as

$$A\begin{bmatrix}dx\\dy\\dz\end{bmatrix}=b$$

where $I_{t_1}, I_{t_2}, \ldots, I_{t_n}$ are the light intensities (temporal derivatives) of the current pixels. Solving by the least-squares method gives

$$\begin{bmatrix}dx\\dy\\dz\end{bmatrix}=(A^{T}A)^{-1}A^{T}b$$

where A represents the matrix

$$A=\begin{bmatrix}I_{x_1}&I_{y_1}&I_{z_1}\\\vdots&\vdots&\vdots\\I_{x_n}&I_{y_n}&I_{z_n}\end{bmatrix}$$

and b represents the column matrix

$$b=-\begin{bmatrix}I_{t_1}\\\vdots\\I_{t_n}\end{bmatrix}$$

After dx, dy and dz are obtained, the tray is judged to be in a stationary state when dx = dy = dz = 0.
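The least-squares solve above fits in a few lines of NumPy; in this sketch the derivative arrays are assumed to be precomputed, and the tolerance eps stands in for the exact dx = dy = dz = 0 test:

```python
import numpy as np

def tray_is_static(Ix, Iy, Iz, It, eps=1e-3):
    """Solve the overdetermined system A d = b for d = (dx, dy, dz) by least
    squares, d = (A^T A)^{-1} A^T b, and report whether the flow is (near) zero.
    Ix, Iy, Iz, It: 1-D arrays of intensity derivatives over the window pixels."""
    A = np.stack([Ix, Iy, Iz], axis=1)  # n x 3 matrix of spatial derivatives
    b = -It                             # right-hand-side column matrix
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return bool(np.all(np.abs(d) < eps))
```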
Step S23: the background difference between the input image received at the current moment and the empty-tray image prestored in the PC is computed; considering the influence of noise, a reasonable threshold Th1 is set, and the difference image is converted into a binary image:

$$D_k(x,y)=|f_k(x,y)-P_k(x,y)|$$

$$M_k(x,y)=\begin{cases}1,&D_k(x,y)>Th1\\0,&\text{otherwise}\end{cases}$$

where $f_k(x, y)$ is the pixel value of the current input image, $P_k(x, y)$ is the pixel value of the empty-tray image prestored in the PC, $D_k(x, y)$ represents the difference result, and $M_k(x, y)$ represents the binarization result.
Step S24: using a contour search algorithm, the white pixels in the binary image are marked so that each individual connected region forms an identified block; that is, each plate of dishes on the tray is marked out, as shown in fig. 3, and the marks are then made on the original image to obtain the connected-domain block model.
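Steps S23 and S24 together can be sketched as follows with OpenCV; the threshold th1 and the minimum block area used to reject noise specks are illustrative values:

```python
import cv2

def segment_plates(frame_gray, empty_tray_gray, th1=40, min_area=500):
    """Difference against the empty-tray image, binarise with threshold Th1,
    and mark each white connected domain as one identified block (one plate)."""
    diff = cv2.absdiff(frame_gray, empty_tray_gray)               # D_k(x, y)
    _, binary = cv2.threshold(diff, th1, 255, cv2.THRESH_BINARY)  # M_k(x, y)
    # contour search marks each individual connected domain
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    blocks = []
    for c in contours:
        if cv2.contourArea(c) >= min_area:               # reject noise regions
            x, y, w, h = cv2.boundingRect(c)
            blocks.append(frame_gray[y:y + h, x:x + w])  # one plate sub-image
    return blocks
```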
Step S25: a bag-of-words model is established, the visual dictionaries of all images are extracted and integrated, and an adaboost classifier is constructed and trained to complete image classification, as shown in fig. 4.
Step S251: visual words are extracted from the connected-domain block images with the SIFT algorithm. The image is convolved with Gaussian kernels to obtain the difference-of-Gaussian scale space. Extremum detection then preliminarily determines the positions and scales of the keypoints, the positions of the candidate feature points are detected, and the keypoints are localised precisely to obtain their scale and orientation information.

A Taylor expansion of the scale space D(x, y, σ) is performed at each candidate feature point:

$$D(X)=D+\frac{\partial D}{\partial X}^{T}X+\frac{1}{2}X^{T}\frac{\partial^{2}D}{\partial X^{2}}X$$

where X = (x, y, σ)^T is the offset from the sample point. Differentiating this expression and setting the derivative equal to 0 gives the position of the extremum:

$$\hat{X}=-\left(\frac{\partial^{2}D}{\partial X^{2}}\right)^{-1}\frac{\partial D}{\partial X}$$

Combining the two formulas and keeping only the first two terms:

$$D(\hat{X})=D+\frac{1}{2}\frac{\partial D}{\partial X}^{T}\hat{X}$$

When $|D(\hat{X})|$ is smaller than the set threshold, the point is regarded as a low-contrast point and is eliminated.
Thus each feature has four parameters: the horizontal coordinate of the centre point, its vertical coordinate, the scale, and the orientation. Writing the scale-space image L(x, y, σ) as L(x, y), the gradient magnitude m(x, y) and orientation θ(x, y) at a feature point (x, y) can be computed as

$$m(x,y)=\sqrt{(L(x+1,y)-L(x-1,y))^{2}+(L(x,y+1)-L(x,y-1))^{2}}$$

$$\theta(x,y)=\arctan\frac{L(x,y+1)-L(x,y-1)}{L(x+1,y)-L(x-1,y)}$$

where L(x+1, y), L(x-1, y), L(x, y+1), L(x, y-1) are the values of the Gaussian scale space at the corresponding coordinates.
Finally, the features are described: a 16 × 16 neighbourhood window centred on the keypoint is taken and divided into 4 × 4 sub-regions; in each sub-region the gradient accumulation values in 8 directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°) are computed, so each feature can be represented by a vector of 4 × 4 × 8 = 128 dimensions. Each feature point is then assigned an orientation and a scale to ensure that the SIFT description is invariant to image rotation.
Describing the features in this way avoids the influence of scale and rotation changes. Meanwhile, to eliminate the influence of illumination changes on the feature vector, the feature vector is normalised. Let the 128-dimensional feature vector be $D=(d_1,d_2,\ldots,d_{128})$, where $d_1, d_2, \ldots, d_{128}$ are the gradients of the sub-regions; after normalisation we obtain:

$$\hat{d}_i=\frac{d_i}{\sqrt{\sum_{j=1}^{128}d_j^{2}}},\qquad i=1,2,\ldots,128$$
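In practice, all of step S251 is available off the shelf; a sketch using OpenCV's SIFT implementation (cv2.SIFT_create in recent OpenCV builds), which internally performs the difference-of-Gaussian extremum detection, low-contrast rejection, orientation assignment and descriptor normalisation described above:

```python
import cv2

def extract_sift_descriptors(block_gray):
    """Extract 128-dimensional SIFT descriptors from one connected-domain block."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(block_gray, None)
    return descriptors  # shape (n_keypoints, 128), or None if nothing was found
```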
step S252: collecting all visual words extracted from the block images by SIFI algorithm, adopting weight layering-based k-means clustering to construct a visual dictionary, respectively extracting Laplace spectral structure characteristics and SIFT local characteristics from N block images in a training set, clustering to obtain a visual dictionary C with more complete image information descriptionLKAnd CSKAs shown in fig. 5 below.
Firstly, the images in the image library are clustered in a layered way, namely, the images of each category are clustered respectively to obtain the images based on each categoryVisual dictionary of category image, namely sub-visual dictionary CLKaAnd CSKa。(CLKaIs the Laplace spectral structure feature clustering center of the class a image, CSKaClustering centers for SIFT local features of class a images, where kaTraining the number of image clusters for class a, where a is l, 2, … M, and M is the number of image classes)
Secondly, clustering the set of the child visual dictionaries again to obtain a parent visual dictionary CLKAnd CSK
Figure GDA0002679804030000111
Finally, weight-value assignment is performed on the two parent visual dictionaries to balance the roles of the two image features in the image classification process, giving the total visual dictionary C. The combination formula and the cluster weight coefficient appear only as images in the original patent document.
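A hedged sketch of the layered clustering for one feature type follows; in the patent this is carried out separately for the Laplacian spectral-structure features and the SIFT features, and the two parent dictionaries are then weight-combined, a step whose exact formula survives only as an image in the original, so it is omitted here. Cluster counts are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_layered_dictionary(features_by_class, k_sub=50, k_parent=200):
    """Two-level k-means: cluster each class's local features into a
    sub-visual-dictionary, then re-cluster the pooled sub-dictionary
    centres into the parent visual dictionary."""
    sub_centres = []
    for feats in features_by_class:        # one (n_i, d) array per image class
        km = KMeans(n_clusters=min(k_sub, len(feats)), n_init=10).fit(feats)
        sub_centres.append(km.cluster_centers_)     # sub-dictionary for class a
    pooled = np.vstack(sub_centres)
    parent = KMeans(n_clusters=min(k_parent, len(pooled)), n_init=10).fit(pooled)
    return parent.cluster_centers_                  # parent visual dictionary
```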
Step S253: after feature extraction and visual dictionary construction, the visual information of an image can be depicted by the local-feature distribution histogram. To complete image classification, a classifier is constructed and trained; here the adaboost classifier is used for training, as shown in fig. 6. The algorithm is as follows:
(1) Given a series of training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, where $y_i = 0$ denotes a negative sample and $y_i = 1$ a positive sample; N is the total number of training samples.

(2) Initialise the weights $w_i = D(i)$.

(3) For t = 1, …, T, where t denotes the t-th training round and T the total number of rounds:

(4) Normalise the weights:

$$w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{N} w_{t,j}}$$

where $w_{t,i}$ is the weight of the i-th training sample in the t-th round and $w_{t,j}$ that of the j-th. A weak classifier h(x, f, P, θ) is trained for each feature f, and the weighted error rate of the weak classifier corresponding to each feature is computed as $\varepsilon_f = \sum_i q_i\,|h(x_i, f, P, \theta) - y_i|$, where $q_i$ denotes the normalised weight. The best weak classifier $h_t(x)$, the one with the minimum error $\varepsilon_t$, is then selected:

$$\varepsilon_t = \min_{f,P,\theta} \sum_i q_i\,|h(x_i, f, P, \theta) - y_i|, \qquad h_t(x) = h(x, f_t, P_t, \theta_t)$$

The weights are then adjusted according to this best weak classifier:

$$w_{t+1,i} = w_{t,i}\,\beta_t^{1 - e_i}$$

where $e_i = 0$ if $x_i$ is correctly classified and $e_i = 1$ if it is misclassified, and

$$\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}$$

(5) The final strong classifier is:

$$H(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \dfrac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases} \qquad \text{with } \alpha_t = \log\frac{1}{\beta_t}$$

where $h_t(x)$ denotes the t-th weak classifier.
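A minimal sketch of this training loop, with single-feature threshold stumps as the weak classifiers h(x, f, p, θ); the inputs are assumed to be the bag-of-words histograms, and the brute-force stump search is kept deliberately simple:

```python
import numpy as np

def train_adaboost(X, y, T=50):
    """Discrete adaboost following the update rules above:
    beta_t = eps_t / (1 - eps_t), w_{t+1,i} = w_{t,i} * beta_t^(1 - e_i),
    alpha_t = log(1 / beta_t).  X: (N, F) feature matrix; y: labels in {0, 1}."""
    N, F = X.shape
    w = np.full(N, 1.0 / N)      # initial weights D(i)
    stumps = []                  # (feature f, polarity p, threshold theta, alpha)
    for _ in range(T):
        w = w / w.sum()          # normalised weights q_i
        best = None
        for f in range(F):       # brute-force search over candidate stumps
            for theta in np.unique(X[:, f]):
                for p in (1, -1):
                    pred = (p * X[:, f] < p * theta).astype(float)
                    eps = np.sum(w * np.abs(pred - y))   # weighted error rate
                    if best is None or eps < best[0]:
                        best = (eps, f, p, theta, pred)
        eps, f, p, theta, pred = best
        eps = min(max(eps, 1e-10), 1 - 1e-10)  # guard against degenerate errors
        beta = eps / (1.0 - eps)
        e = np.abs(pred - y)     # e_i = 0 if correctly classified, 1 if not
        w = w * beta ** (1.0 - e)  # down-weight correctly classified samples
        stumps.append((f, p, theta, np.log(1.0 / beta)))
    return stumps

def predict_adaboost(stumps, x):
    """Strong classifier: H(x) = 1 iff sum_t alpha_t h_t(x) >= 0.5 * sum_t alpha_t."""
    score = sum(a * float(p * x[f] < p * theta) for f, p, theta, a in stumps)
    return int(score >= 0.5 * sum(a for _, _, _, a in stumps))
```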
step S27: and classifying and identifying the test set by using the trained adaboost classifier to obtain the type and the number of the dishes, and transmitting data to the settlement terminal device.
Step S31: the settlement terminal device first obtains the number of dishes from the number of connected domains in the identified tray image, and calculates the total price of the dishes on the plates from the unit prices and quantities according to the correspondence between the dishes identified by the PC and their prices.

Step S32: the quantity, unit prices and total price of the dishes on the tray are displayed, together with the available payment methods, for the diner to pay.
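The settlement step itself reduces to a lookup and a sum; a short sketch with an invented price table (dish names and prices are purely illustrative):

```python
def settle(dish_counts, price_table):
    """Compute the bill from the recognised dishes: dish_counts maps dish
    name -> number of plates, price_table maps dish name -> unit price."""
    lines = [(dish, n, price_table[dish], n * price_table[dish])
             for dish, n in dish_counts.items()]
    total = sum(subtotal for _, _, _, subtotal in lines)
    return lines, total

# illustrative usage
items, total = settle({"mapo tofu": 1, "rice": 2},
                      {"mapo tofu": 6.0, "rice": 1.5})
print(items, total)  # total -> 9.0
```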
In conclusion, the invention automatically photographs the tray of dishes taken by the diner, establishes a bag-of-words model for the tray image and trains an adaboost classifier, effectively completes the identification of the dishes on the tray, automatically calculates the total price of the dishes on the tray and displays it on the display device, effectively solving the problems of long queuing times and high settlement error rates in canteens.
The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core idea. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (1)

1. A canteen self-service pricing method based on a bag-of-words model and adaboost is characterized by comprising the following steps:
step S1: the image acquisition device acquires images of the current settlement area at intervals and transmits the acquired images of the settlement area to the PC;
step S2: the PC processes the acquired image; the processing comprises: preprocessing, extracting feature points, constructing a visual dictionary and performing adaboost recognition;
step S3: the settlement terminal device calculates the total price of the dishes and displays the total price for the diner to pay;
wherein the step S2 further includes:
step S21: a difference operation is performed between the input image received at the current time t and the background image of the settlement area prestored in the PC to judge whether a tray has currently entered the settlement area; the acquired tray image is divided into n × n pixels $f_k(x, y)$, Gaussian distribution modelling is performed on each pixel, the Gaussian scale space L and the scale-space factor σ are computed, and a reasonable difference threshold Th is preset for each pixel; when the difference between the pixel value of the current image and the pixel value $B_k(x, y)$ of the background image exceeds the threshold, i.e. $|f_k(x, y) - B_k(x, y)| > Th$, it is judged that a tray has entered the settlement area, the pixels exceeding the threshold being those of the tray;
step S22: when the tray enters the settlement area, whether the tray is stationary is judged by the optical flow method; if the flow (dx, dy, dz) is constant in a small window of size m × m × m, m < 1, then for the pixels 1, …, n, n = m × m × m, the intensity derivatives $I_{x_i}, I_{y_i}, I_{z_i}$ satisfy:

$$\begin{cases}I_{x_1}\,dx + I_{y_1}\,dy + I_{z_1}\,dz = -I_{t_1}\\\quad\vdots\\I_{x_n}\,dx + I_{y_n}\,dy + I_{z_n}\,dz = -I_{t_n}\end{cases}$$

and the system of equations can be further expressed as:

$$A\begin{bmatrix}dx\\dy\\dz\end{bmatrix}=b,\qquad A=\begin{bmatrix}I_{x_1}&I_{y_1}&I_{z_1}\\\vdots&\vdots&\vdots\\I_{x_n}&I_{y_n}&I_{z_n}\end{bmatrix},\qquad b=-\begin{bmatrix}I_{t_1}\\\vdots\\I_{t_n}\end{bmatrix}$$

wherein $I_{t_1}, I_{t_2}, \ldots, I_{t_n}$ are the light intensities of the current pixels; dx, dy and dz are computed by the least-squares method, and when dx = dy = dz = 0 the tray is judged to be in a stationary state;
step S23: background difference is performed between the input image received at the current moment and the empty-tray image prestored in the PC, and the difference image is converted into a binary image:

$$D_k(x,y)=|f_k(x,y)-P_k(x,y)|$$

$$M_k(x,y)=\begin{cases}1,&D_k(x,y)>Th1\\0,&\text{otherwise}\end{cases}$$

wherein $f_k(x, y)$ is the pixel value of the current input image, $P_k(x, y)$ is the pixel value of the empty-tray image prestored in the PC, $D_k(x, y)$ represents the difference result, $M_k(x, y)$ represents the binarization result, and Th1 is a set threshold;
step S24: using a contour search algorithm, the white pixels in the binary image are marked so that each independent connected region forms a marked block, namely each plate of dishes on the tray is marked out, and the marks are then made on the original image to obtain the connected-domain block model;
step S25: establishing a bag-of-words model, extracting and integrating visual dictionaries of all images, and constructing and training an adaboost classifier to finish image classification;
step S251: extracting visual words from the connected-domain block images with the SIFT algorithm; the image is convolved with Gaussian kernels to obtain the difference-of-Gaussian scale space; extremum detection then preliminarily determines the positions and scales of the keypoints, and the candidate feature points are localised precisely to obtain their scale and orientation information;

a Taylor expansion of the scale space D(x, y, σ) is performed at each candidate feature point:

$$D(X)=D+\frac{\partial D}{\partial X}^{T}X+\frac{1}{2}X^{T}\frac{\partial^{2}D}{\partial X^{2}}X$$

wherein X = (x, y, σ)^T is the offset from the sample point; differentiating this expression and setting the derivative equal to 0 gives the position of the extremum:

$$\hat{X}=-\left(\frac{\partial^{2}D}{\partial X^{2}}\right)^{-1}\frac{\partial D}{\partial X}$$

combining the two formulas and keeping only the first two terms:

$$D(\hat{X})=D+\frac{1}{2}\frac{\partial D}{\partial X}^{T}\hat{X}$$

when $|D(\hat{X})|$ is smaller than the set threshold, the point is regarded as a low-contrast point and is removed;

therefore, each feature has four parameters, namely the horizontal coordinate of the centre point, its vertical coordinate, the scale and the orientation; writing L(x, y, σ) as L(x, y), the gradient magnitude m(x, y) and orientation θ(x, y) at a feature point (x, y) can be computed as:

$$m(x,y)=\sqrt{(L(x+1,y)-L(x-1,y))^{2}+(L(x,y+1)-L(x,y-1))^{2}}$$

$$\theta(x,y)=\arctan\frac{L(x,y+1)-L(x,y-1)}{L(x+1,y)-L(x-1,y)}$$

wherein L(x+1, y), L(x-1, y), L(x, y+1), L(x, y-1) are the values of the Gaussian scale space at the corresponding coordinates;

finally, the features are described: a 16 × 16 neighbourhood window centred on the keypoint is taken and divided into 4 × 4 sub-regions; in each sub-region the gradient accumulation values in the 8 directions 0°, 45°, 90°, 135°, 180°, 225°, 270°, 315° are computed, and each feature is represented by a vector of 4 × 4 × 8 = 128 dimensions; an orientation and a scale are assigned to each feature point to ensure that the SIFT description is invariant to image rotation;

the feature vector is normalised: let the 128-dimensional feature vector be $D=(d_1,d_2,\ldots,d_{128})$, wherein $d_1, d_2, \ldots, d_{128}$ are the gradients of the sub-regions; after normalisation we obtain:

$$\hat{d}_i=\frac{d_i}{\sqrt{\sum_{j=1}^{128}d_j^{2}}},\qquad i=1,2,\ldots,128$$
step S252: collecting all the visual words extracted from the block images by the SIFT algorithm, and constructing a visual dictionary by weight-layered k-means clustering; Laplacian spectral-structure features and SIFT local features are respectively extracted from the N block images of the training set and clustered to obtain visual dictionaries $C_{LK}$ and $C_{SK}$ with a more complete description of the image information;

firstly, the images in the image library are clustered hierarchically to obtain a visual dictionary for each image class, namely the sub-visual dictionaries $C_{LK_a}$ and $C_{SK_a}$, wherein $C_{LK_a}$ is the Laplacian spectral-structure feature cluster centre of the class-a images, $C_{SK_a}$ is the SIFT local-feature cluster centre of the class-a images, $k_a$ is the number of clusters of the class-a training images, a = 1, 2, …, M, and M is the number of image classes;

secondly, the set of sub-visual dictionaries is clustered again to obtain the parent visual dictionaries $C_{LK}$ and $C_{SK}$ (the clustering formula appears only as an image in the original patent document);

and finally, weight-value assignment is performed on the two parent visual dictionaries to balance the roles of the two image features in the image classification process, giving the total visual dictionary C (the combination formula and the cluster weight coefficient appear only as images in the original patent document);
step S253: training with an adaboost classifier, the algorithm being as follows:

(1) given a series of training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, wherein $y_i = 0$ denotes a negative sample and $y_i = 1$ a positive sample; N is the total number of training samples;

(2) initialise the weights $w_i = D(i)$;

(3) for t = 1, …, T, wherein t denotes the t-th training round and T the total number of rounds;

(4) normalise the weights:

$$w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{N} w_{t,j}}$$

wherein $w_{t,i}$ is the weight of the i-th training sample in the t-th round and $w_{t,j}$ that of the j-th; a weak classifier h(x, f, P, θ) is trained for each feature f; the weighted error rate of the weak classifier corresponding to each feature is computed as $\varepsilon_f = \sum_i q_i\,|h(x_i, f, P, \theta) - y_i|$, wherein $q_i$ denotes the normalised weight; the best weak classifier $h_t(x)$, the one with the minimum error $\varepsilon_t$, is then selected:

$$\varepsilon_t = \min_{f,P,\theta} \sum_i q_i\,|h(x_i, f, P, \theta) - y_i|, \qquad h_t(x) = h(x, f_t, P_t, \theta_t)$$

and the weights are then adjusted according to the best weak classifier:

$$w_{t+1,i} = w_{t,i}\,\beta_t^{1 - e_i}$$

wherein $e_i = 0$ indicates that $x_i$ is correctly classified and $e_i = 1$ that it is misclassified; let

$$\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t};$$

(5) the final strong classifier is:

$$H(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \dfrac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases} \qquad \text{with } \alpha_t = \log\frac{1}{\beta_t}$$

wherein $h_t(x)$ denotes the t-th weak classifier;
step S27: classifying and identifying the test set by using a trained adaboost classifier to obtain the type and the number of dishes, and transmitting data to the settlement terminal device;
wherein, the step S3 further includes:
step S31: the settlement terminal device first obtains the number of dishes from the number of connected domains in the identified tray image, and calculates the total price of the dishes on the plates from the unit prices and quantities according to the correspondence between the dishes identified by the PC and their prices;
step S32: the quantity, unit price and total price of the dishes in the tray are displayed, and the selectable payment modes are displayed for the diners to pay.
CN201910155376.2A 2019-03-01 2019-03-01 Canteen self-service pricing method based on bag-of-words model and adaboost Active CN110020668B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910155376.2A CN110020668B (en) 2019-03-01 2019-03-01 Canteen self-service pricing method based on bag-of-words model and adaboost

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910155376.2A CN110020668B (en) 2019-03-01 2019-03-01 Canteen self-service pricing method based on bag-of-words model and adaboost

Publications (2)

Publication Number Publication Date
CN110020668A (en) 2019-07-16
CN110020668B (en) 2020-12-29

Family

ID=67189126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910155376.2A Active CN110020668B (en) 2019-03-01 2019-03-01 Canteen self-service pricing method based on bag-of-words model and adaboost

Country Status (1)

Country Link
CN (1) CN110020668B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005043416A2 (en) * 2003-11-03 2005-05-12 Cloudmark, Inc. Methods and apparatuses for determining and designating classifications of electronic documents
CN104915673A (en) * 2014-03-11 2015-09-16 株式会社理光 Object classification method and system based on bag of visual word model
CN107908715A (en) * 2017-11-10 2018-04-13 中国民航大学 Microblog emotional polarity discriminating method based on Adaboost and grader Weighted Fusion

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100067799A1 (en) * 2008-09-17 2010-03-18 Microsoft Corporation Globally invariant radon feature transforms for texture classification

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005043416A2 (en) * 2003-11-03 2005-05-12 Cloudmark, Inc. Methods and apparatuses for determining and designating classifications of electronic documents
CN104915673A (en) * 2014-03-11 2015-09-16 株式会社理光 Object classification method and system based on bag of visual word model
CN107908715A (en) * 2017-11-10 2018-04-13 中国民航大学 Microblog emotional polarity discriminating method based on Adaboost and grader Weighted Fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Xunshi Yan et al., "Making Full Use of Spatial-Temporal Interest Points: An Adaboost Approach for Action Recognition", IEEE, 2010-12-03, pp. 4677-4680 *
Guo Dongfeng, "Face detection algorithm constructing a strong classifier based on principal component analysis", Bulletin of Science and Technology, 2013-10-31, Vol. 29, No. 10, pp. 220-224 *

Also Published As

Publication number Publication date
CN110020668A (en) 2019-07-16

Similar Documents

Publication Publication Date Title
CN107506703B (en) Pedestrian re-identification method based on unsupervised local metric learning and reordering
CN105023008B (en) The pedestrian of view-based access control model conspicuousness and multiple features recognition methods again
US20190197466A1 (en) Inventory control for liquid containers
Liu et al. Attribute-restricted latent topic model for person re-identification
CN108171184A (en) Method for distinguishing is known based on Siamese networks again for pedestrian
CN104915673B (en) A kind of objective classification method and system of view-based access control model bag of words
CN103295024B (en) Classification and method for checking object and device and image taking and processing equipment
CN109685780B (en) Retail commodity identification method based on convolutional neural network
CN109165645A (en) A kind of image processing method, device and relevant device
CN105488809A (en) Indoor scene meaning segmentation method based on RGBD descriptor
CN108345912A (en) Commodity rapid settlement system based on RGBD information and deep learning
WO2016190814A1 (en) Method and system for facial recognition
CN103793717A (en) Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same
CN113033706B (en) Multi-source two-stage dish identification method based on visual detection and re-identification
CN115272652A (en) Dense object image detection method based on multiple regression and adaptive focus loss
CN108073940B (en) Method for detecting 3D target example object in unstructured environment
CN106650798B (en) A kind of indoor scene recognition methods of combination deep learning and rarefaction representation
CN109583498A (en) A kind of fashion compatibility prediction technique based on low-rank regularization feature enhancing characterization
CN109740417A (en) Invoice type recognition methods, device, storage medium and computer equipment
AU2017231602A1 (en) Method and system for visitor tracking at a POS area
CN110517497A (en) A kind of road traffic classification method, device, equipment, medium
CN106557783B (en) A kind of automatic extracting system and method for caricature dominant role
CN110020668B (en) Canteen self-service pricing method based on bag-of-words model and adaboost
CN117315863A (en) Article structure cashing system based on AI intelligent recognition
CN107679528A (en) A kind of pedestrian detection method based on AdaBoost SVM Ensemble Learning Algorithms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190716

Assignee: Zhejiang senshi Electronic Technology Co.,Ltd.

Assignor: HANGZHOU DIANZI University

Contract record no.: X2021330000652

Denomination of invention: A cafeteria self-service pricing method based on word bag model and adaboosting

Granted publication date: 20201229

License type: Common License

Record date: 20211103
