CN113727029B - Intelligent order generation method for combining collected images at multiple visual angles and intelligent vending machine - Google Patents


Info

Publication number
CN113727029B
Authority
CN
China
Prior art keywords
video
sub
commodity
target
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111291308.2A
Other languages
Chinese (zh)
Other versions
CN113727029A (en)
Inventor
陈俏锋
黄超群
王浩
张�杰
束学璋
张元熙
郭家龙
邱俊波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
YOPOINT SMART RETAIL TECHNOLOGY Ltd.
Original Assignee
Yopoint Smart Retail Technology Ltd
Wuhan Xingxun Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yopoint Smart Retail Technology Ltd, Wuhan Xingxun Intelligent Technology Co ltd filed Critical Yopoint Smart Retail Technology Ltd
Priority to CN202111291308.2A priority Critical patent/CN113727029B/en
Priority to CN202210356689.6A priority patent/CN114640797A/en
Publication of CN113727029A publication Critical patent/CN113727029A/en
Application granted granted Critical
Publication of CN113727029B publication Critical patent/CN113727029B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/95 Computational photography systems, e.g. light-field imaging systems
    • H04N 23/951 Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/90 Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/76 Television signal recording
    • H04N 5/91 Television signal processing therefor

Abstract

The invention belongs to the technical field of image processing and solves the technical problem in the prior art that the imaging definition of commodities in the shopping video of an intelligent vending machine changes continuously, affecting the final detection result and degrading the user experience. It provides an intelligent order generation method, and an intelligent vending machine, that collect images from multiple viewing angles and then combine them. The method comprises the following steps: obtaining a first main video and each first sub video of a commodity area, as well as a second main video and each second sub video; replacing the corresponding image areas of each main video with the sub videos to obtain a first target video and a second target video; combining the first target video and the second target video; analyzing each frame image of the combined target video to obtain the target commodities used to produce order information; and generating the commodity order information. Because the sub videos replace the low-definition image areas of the main videos, the definition of the commodities in each frame image of the target video is guaranteed, improving detection accuracy and the user experience.

Description

Intelligent order generation method for combining collected images at multiple visual angles and intelligent vending machine
Technical Field
The invention relates to the technical field of image analysis, and in particular to an intelligent order generation method that combines images collected from multiple viewing angles, and an intelligent vending machine.
Background
With the continuous development of artificial intelligence technology, the selling mode of the retail industry has also changed greatly. Intelligent vending machines have spread to all kinds of occasions in cities; various intelligent vending machines can be found in stations, shopping malls, tourist attractions and department stores. Because the intelligent vending machine needs no attendant and lets users order and settle their purchases autonomously, it greatly eases the shopping demands of users in these special scenarios.
The current fully-opening-door intelligent vending machine improves on the traditional vending machine by allowing many commodities to be chosen at once and commodities to be exchanged many times in a single shopping session, which greatly satisfies the user's autonomous selection during shopping. However, because the customer may take commodities from the shelves or exchange them many times, the imaging definition of the commodities in the shopping video changes continuously; low definition affects the final detection result, produces abnormal orders, and degrades the user experience.
Disclosure of Invention
In view of this, the embodiments of the invention provide an order generation method based on multi-view image analysis, and an intelligent vending machine, to solve the technical problem that the imaging definition of a commodity in the shopping video of an existing intelligent vending machine changes continuously, affecting the final detection result, producing abnormal orders, and degrading the user experience.
The technical scheme adopted by the invention is as follows:
the invention provides an order generation method based on multi-view image analysis, which comprises the following steps:
s20: acquiring a first main video acquired by a first main camera in a commodity area, a second main video acquired by a second main camera in the commodity area, a first sub video acquired by at least one first sub camera in the commodity area and a second sub video acquired by at least one second sub camera in the commodity area;
s21: covering the same image area as the first sub video in the first main video by using the commodity area of the first sub video to obtain a first target video, and covering the same image area as the second sub video in the second main video by using the commodity area of the second sub video to obtain a second target video;
s22: merging each frame image of the first target video and each frame image corresponding to the acquisition time sequence in the second target video to obtain a target video, wherein the size of each frame image of the target video is the sum of the size of each frame image of the first target video and the size of each frame image corresponding to the second target video;
s23: inputting each frame image of the target video into a preset target detection model to obtain commodity information of a plurality of target commodities;
s24: generating commodity order information according to the commodity information of each target commodity;
the range of commodity areas collected by the first main camera and the second main camera is the same, the range of the commodity areas collected by the first sub-cameras and the second sub-cameras belongs to a partial area corresponding to the commodity area collected by the first main camera or the second main camera, the first main camera and the first sub-cameras are arranged along the arrangement direction of shelves of the intelligent vending machine, the first sub-cameras are located below the first main camera, the second main camera and the second sub-cameras are arranged along the arrangement direction of shelves of the intelligent vending machine, and the second sub-cameras are located below the second main camera.
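As a rough illustration of steps S21 and S22, the per-frame operations reduce to a region overwrite followed by a side-by-side concatenation. The sketch below is an assumption-laden reading of the claim: it treats frames as numpy arrays and supposes the sub-camera frame has already been warped to the pixel region of the main frame it covers; the function names and the fixed top-left placement are illustrative, not from the patent.

```python
import numpy as np

def overlay_sub_region(main_frame, sub_frame, top_left):
    """Step S21: cover the image area of the main-camera frame that the
    sub camera also sees with the (sharper) sub-camera frame."""
    y, x = top_left
    h, w = sub_frame.shape[:2]
    target = main_frame.copy()          # leave the original main video intact
    target[y:y + h, x:x + w] = sub_frame
    return target

def merge_target_frames(first_frame, second_frame):
    """Step S22: merge time-aligned frames of the two target videos into
    one frame whose size is the sum of the two frame sizes."""
    return np.concatenate([first_frame, second_frame], axis=1)
```

Applying both helpers to every time-aligned frame pair yields the target video described in step S22, whose frames are then fed to the detection model of step S23.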
Preferably, the S20 includes:
s201: the method comprises the steps of obtaining a commodity placing area of the intelligent vending machine along the goods shelf arrangement direction of the intelligent vending machine and dividing the commodity placing area into a plurality of virtual commodity areas;
s202: the method comprises the steps of controlling cameras corresponding to each commodity area to collect video data corresponding to the commodity area, and obtaining a first main video collected by a first main camera, a second main video collected by a second main camera, and at least one first sub video collected by a first sub camera and at least one second sub video collected by a second sub camera.
Preferably, the S20 includes:
s203: acquiring the frame rate and the number of cameras for acquiring video data of a commodity area;
s204: determining the interval time for each camera to start to acquire the video data of the corresponding commodity area according to the frame rate and the number of the cameras;
s205: according to the interval time, respectively obtaining the starting time when the first main camera, the second main camera, each first sub-camera and each second sub-camera start to collect video data;
s206: and controlling a corresponding camera to acquire video data of a commodity area according to each starting moment to obtain the first main video, the second main video, each first sub video and each second sub video.
Preferably, the S21 includes:
s211: acquiring a target commodity area corresponding to commodity change;
s212: determining the first sub video and the second sub video containing the target commodity area according to the position information of the target commodity area;
s213: and merging the first sub video containing the target commodity and the first main video to obtain the first target video, and merging the second sub video containing the target commodity and the second main video to obtain the second target video.
Preferably, the S212 includes:
s2121: analyzing each frame image of the first main video and/or the second main video to determine each image frame corresponding to the commodity change;
s2122: dividing the first main video and the second main video into a plurality of video segments according to each image frame with commodity change;
s2123: and determining the first sub-video and the second sub-video corresponding to each video segment according to a target commodity area corresponding to commodity change in each video segment.
Preferably, with the first sub video and the second sub video containing the target commodity area recorded as target sub videos, the S213 includes:
S2131: determining, as intermediate sub videos according to the position information of the target sub video, the other first sub videos or second sub videos located between the target sub video and the first main video or the second main video;
S2132: deleting the areas of the first main video and the second main video that do not correspond to the target sub video and the intermediate sub videos, respectively obtaining an optimized first video and an optimized second video;
S2133: merging the target sub video and each intermediate sub video belonging to the first sub videos with the first video to obtain the first target video, and merging the target sub video and each intermediate sub video belonging to the second sub videos with the second video to obtain the second target video.
Preferably, the S22 includes:
s221: acquiring a first starting time for starting to acquire the first target video and a second starting time for starting to acquire the second target video;
s222: and combining each frame image of the first target video and each frame image corresponding to the acquisition time sequence in the second target video one by one according to the first starting time and the second starting time to obtain the target video.
The invention also provides an intelligent order generating device for combining the collected images in multiple visual angles, which comprises:
a video acquisition module: used for acquiring a first main video obtained by a first main camera collecting a commodity area, a second main video obtained by a second main camera collecting the commodity area, a first sub video obtained by at least one first sub camera collecting the commodity area, and a second sub video obtained by at least one second sub camera collecting the commodity area;
a video splicing module: the commodity area of the first sub-video is used for covering the image area, which is the same as the first sub-video, in the first main video to obtain a first target video, and the commodity area of the second sub-video is used for covering the image area, which is the same as the second sub-video, in the second main video to obtain a second target video;
a video merging module: the image processing device is used for merging each frame of image of the first target video and each frame of image corresponding to the acquisition time sequence in the second target video to obtain a target video, wherein the size of each frame of image of the target video is the sum of the size of each frame of image of the first target video and the size of each frame of image corresponding to the second target video;
a data analysis module: used for inputting each frame image of the target video into a preset target detection model to obtain commodity information of a plurality of target commodities;
an order information module: used for generating commodity order information according to the commodity information of each target commodity;
the range of commodity areas collected by the first main camera and the second main camera is the same, the range of the commodity areas collected by the first sub-cameras and the second sub-cameras belongs to a partial area corresponding to the commodity area collected by the first main camera or the second main camera, the first main camera and the first sub-cameras are arranged along the arrangement direction of shelves of the intelligent vending machine, the first sub-cameras are located below the first main camera, the second main camera and the second sub-cameras are arranged along the arrangement direction of shelves of the intelligent vending machine, and the second sub-cameras are located below the second main camera.
The invention also provides an intelligent vending machine, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory that, when executed by the processor, implement the method of any of the above.
The invention also provides a medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any of the above.
In conclusion, the beneficial effects of the invention are as follows:
the invention provides an intelligent order generation method and an intelligent vending machine for combining collected images at multiple visual angles, wherein a first main camera and a second main camera for acquiring video data of the same commodity area at different visual angles are arranged on the intelligent vending machine, a first sub-camera and a second sub-camera are also arranged for acquiring the video data of the commodity area by the first camera and the second camera, the first sub-video acquired by the first sub-camera is used for replacing the data of the corresponding area in the first main video acquired by the first main camera to acquire a first target video, the second target video is acquired by adopting the same method, then the first target video and the second target video are combined to acquire the target video, the imaging definition of a target object in each frame image in the target video is ensured, the target detection is carried out on the combined target video, and abnormal orders caused by the continuous change of the commodity imaging definition can be prevented, the detection accuracy and the user experience effect are improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. Those skilled in the art may obtain other drawings from these drawings without creative effort, and all such drawings fall within the protection scope of the present invention.
Fig. 1 is a schematic flowchart of a method for intelligently generating an order in an integrated manner after images are acquired from multiple views in embodiment 1;
FIG. 2 is a schematic structural diagram of an intelligent vending machine having a plurality of cameras with different viewing angles in embodiment 1;
fig. 3 is a schematic flowchart of a method for intelligently generating an order by combining images acquired from multiple viewing angles in embodiment 2;
fig. 4 is a schematic flowchart of the intelligent order generating device integrated after multi-view image acquisition in embodiment 3;
fig. 5 is a schematic flowchart of an intelligent order generation apparatus combined after multi-view image acquisition in embodiment 4;
FIG. 6 is a schematic view showing the construction of an automatic settlement system including a smart vending machine according to embodiment 5;
FIG. 7 is a schematic structural view of a smart vending machine according to embodiment 6;
reference numerals of fig. 1 to 7:
1. cabinet body; 11. partition plate; 12. commodity area; 2. cabinet door; 3. camera.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. In the description of the present invention, it is to be understood that the terms "center", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like indicate orientations or positional relationships based on those shown in the drawings, are only for convenience and simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation or be constructed and operated in a particular orientation; they are therefore not to be construed as limiting the present invention.
Also, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element. In case of conflict, the various features of the present invention and embodiments may be combined with each other, and all such combinations are within the scope of the present invention.
Example 1
Compared with existing machines in which one code scan buys only one commodity and no reselection is possible after purchase, the fully-opening-door intelligent vending machine is simple to operate and gives the user stronger autonomy in shopping: like a traditional store shelf, it is convenient for selecting and purchasing a plurality of commodities at once and exchanging commodities many times in one shopping session, and an order can be generated quickly once the user completes a complex shopping session, with settlement completed rapidly in an autonomous settlement mode. However, because a user can purchase a plurality of commodities and put commodities on and off the shelves many times in one session, a large number of commodities may be occluded while being taken out or put back; the same commodity can then be detected with different results before and after because the occluded positions differ, and the resulting abnormal orders harm the user experience and the merchant's credit.
The invention is based on a feasibility study of obtaining, from multiple angles, the shopping video of a user shopping at an intelligent vending machine. Cameras that monitor the commodity area in real time from different directions are arranged in the commodity area of the intelligent vending machine, the shopping videos shot by the cameras are combined, the order information of the user's purchase is obtained through picture splicing, comparative analysis and similar means, and settlement is then performed automatically by the server, improving the user's shopping experience while removing the manual settlement process.
Specifically, please refer to fig. 2, which is a schematic structural diagram of a fully-opening-door intelligent vending machine. The intelligent vending machine includes a cabinet body 1 and a cabinet door 2, rotatably connected. When the cabinet door 2 is closed relative to the cabinet body 1, it covers all the commodity areas of the cabinet body 1 where commodities are placed, so the commodities inside cannot be taken out; when the cabinet door 2 is opened, all the commodities in the cabinet body 1 are displayed in front of the user, who can select any number of commodities at once, take them out, or put them back after selection. Partition plates 11 are arranged in the cabinet body 1 and form a rack dividing the cabinet body 1 into a plurality of commodity areas 12, and each commodity area in the cabinet body 1 is provided with a camera. The shopping video of the user can thus be acquired from multiple angles, avoiding the low reliability of a shopping video collected from a single angle because of occlusion. The intelligent vending machine shown in fig. 2 is therefore provided with a plurality of cameras on both the left side wall and the right side wall inside the machine, so that shopping videos of the same commodity area can be collected from opposite viewing-angle directions, improving the reliability of the video data.
Referring to fig. 1, fig. 1 is a schematic flow chart of an intelligent order generating method for fusion after multi-view image acquisition in embodiment 1 of the present invention, where the method includes:
s10: acquiring a first main video acquired by a first main camera in a commodity area at a first visual angle and a second main video acquired by a second main camera in the commodity area at a second visual angle different from the first visual angle, wherein the first main video and the second main video have the same frame rate and are acquired with the same frame number;
specifically, the intelligent vending machine is provided with a plurality of cameras, the plurality of cameras are divided into two groups of cameras, namely a first group of cameras and a second group of cameras, the first group of cameras comprise a first main camera, the first group of cameras can also comprise other first sub-cameras, the second group of cameras comprise a second main camera, the second group of cameras can also comprise other second sub-cameras, when only the first main camera and the second main camera are included, the first main camera and the second main camera are arranged in a space corresponding to the uppermost partition board in all the partition boards for placing the commodities, in particular to be arranged on a left side board and a right side board in a cabinet body of the intelligent vending machine and are positioned in the area above the uppermost partition board, therefore, the first main camera and the second main camera can shoot videos of commodities taken out of all areas of the intelligent vending machine; the visual angle ranges of the first main camera and the second main camera can be the same or different, but the visual angle ranges of the first main camera and the second main camera can cover all commodity areas; therefore, in a preferred embodiment, the first main camera is arranged on the left side wall in the cabinet body, and the second main camera is arranged on the right side wall in the cabinet body, so that the first main camera can shoot a video of the commodity area from the left side, the second main camera can shoot a video of the commodity area from the right side, a first main video of the commodity area collected by the first main camera from the left side at a first visual angle and a second main video of the commodity area collected by the second main camera from the right side at a second visual angle are obtained, and by acquiring the videos of the commodity area from different visual angles, 
under the condition that partial features of the commodity of a user are blocked by the commodity, more features of the commodity can be obtained from other visual angle directions, so that the accuracy of target detection is improved; it should be noted that: the video frame rates of the first main video and the second main video are the same.
S11: combining each frame of image of the first main video and each frame of image corresponding to the acquisition time sequence in the second main video one by one to generate a target video, wherein the size of each frame of image of the target video is the sum of the size of each frame of image of the first main video and the size of each frame of image corresponding to the second main video;
specifically, each frame image of the first main video and each frame image of the second main video are combined one by one according to the acquisition time sequence, so as to obtain a target video, wherein the combination mode of each frame image of the first main video and each frame image of the second main video includes but is not limited to: combining image frames of a first main video corresponding to the same shooting moment with image frames corresponding to a second main video, combining image frames of the first main video corresponding to the shooting moment in a staggered manner with image frames of the second main video corresponding to the same frame number, specifically combining a first frame image of the first main video with a first frame image of the second main video, specifically combining a second frame image of the first main video with a second frame image of the second main video, and so on, specifically combining an Nth frame image of the first main video with an Nth frame image of the second main video to obtain a target video consisting of the combined images; such as: the start time of the first main video to start shooting precedes the start time of the second main video to start shooting, and it should be noted that: the time interval between the starting moment of the first main video for starting shooting and the starting moment of the second main video for starting shooting is less than the interval time of two adjacent frames of images, for example, the interval time is 1/2, 1/3, 1/4 and the like of the interval time of the two adjacent frames of images, then the image frames of the first main video and the image frames of the second main video are combined according to the shooting sequence, images at more moments can be obtained at the corresponding shooting time interval under the set frame rate, the number of samples is enriched, and the detection accuracy is improved; it should be noted that: because the frame rate of the 
video is related to the eye reaction, if the frame rate is too low or causes discontinuous pictures, if the frame rate is too high, the eye reaction is not caused, and the eye discomfort is caused, a frame rate meeting the requirements is set by one camera when the video is collected, and the frame rate is generally 20 frames/second to 30 frames/second; taking 20 frames/second as an example, each two adjacent frames of images are 0.05s, namely only two frames of images at the end of 0.05s are shot in a staggered mode, so that the image frames at more than one moment can be obtained in 0.05s, the effective number of image frames is increased, and discomfort of eyes cannot be caused.
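The staggering argument above can be checked numerically: two cameras at the same frame rate, with the second offset by a fraction of a frame period, together sample the scene at twice as many instants. A small sketch with illustrative names:

```python
def merged_sample_instants(frame_rate, offset_fraction, num_frames):
    """Timestamps covered by merging two equal-rate streams when the
    second stream starts offset_fraction of a frame period later."""
    period = 1.0 / frame_rate
    first = [i * period for i in range(num_frames)]
    second = [(i + offset_fraction) * period for i in range(num_frames)]
    return sorted(first + second)
```

With 20 frames per second and a half-period offset, the merged stream sees the scene every 0.025 s instead of every 0.05 s, which is the "enriched samples" effect the text describes.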
In one embodiment, the S11 includes:
s111: acquiring a first starting time corresponding to the first main video and the second starting time different from the first starting time, corresponding to the second main video, and the first main camera starts to acquire the first main video;
s112: combining each frame image of the first main video acquired within a preset time with each frame image corresponding to a time sequence in the second main video one by one according to the first starting time and the second starting time to generate the target video;
and the time interval between the first starting moment and the second starting moment is smaller than the time interval between two adjacent frames of images corresponding to the frame rate of the camera.
Specifically, different starting moments are set for a first main camera and a second main camera to start to acquire video data of a commodity area, the starting moment of the first main camera is set as a first starting moment, the starting moment of the second main camera to start shooting is set as a second starting moment, and the time difference between the first starting moment and the second starting moment is smaller than the interval time between two adjacent frames corresponding to the video data acquired by the cameras; and then merging the frame images of the first main video and the second main video one by one according to the acquisition time sequence to obtain the target video, wherein the specific merging mode is not repeated here.
S12: inputting each frame image of the target video into a preset target detection model to obtain commodity information of a plurality of target commodities;
specifically, each frame image of the synthesized target video is detected by using a preset target detection model, so that the target commodity taken out of the intelligent vending machine and the commodity information of the target commodity taken out of the intelligent vending machine and then put back are determined, the target detection model is obtained by manually marking each frame image of a video corresponding to a large number of commodities taken at different angles, commodities held in different holding modes and commodities taken at different speeds, and then training the model by using the sample set.
In one embodiment, the S12 includes:
S121: dividing each frame image of the target video into a first image area belonging to the first main video and a second image area belonging to the second main video;
Specifically, each frame image in the target video is partitioned: the image area belonging to the first main video is recorded as the first image area and the image area belonging to the second main video as the second image area, which makes it easy to tally the detection results of each frame and prevents the results from being mixed up.
S122: detecting each frame of image of the target video by using the target detection model to obtain a first detection result corresponding to the first image area and a second detection result corresponding to the second image area;
Specifically, the detection results for the first image area of each frame are analyzed to obtain a first detection result belonging to the first main video, and likewise a second detection result belonging to the second main video. Each detection result includes the category of each commodity and its confidence; the confidence may be the per-detection confidence of each detected target, or the average confidence per commodity category, computed by grouping all detected targets according to the category information in the detection results and averaging the confidences within each group. For example, if across all frame images of the target video commodity A is detected 3 times, commodity B 5 times and commodity C 4 times, the average confidence for commodity A is the mean of the confidences of its 3 detections, and the confidence averages for commodities B and C are obtained in the same way.
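The per-category averaging described above can be sketched like this (a hedged illustration; the category names and confidence values are invented for the example):

```python
from collections import defaultdict

def average_confidence_per_category(detections):
    """Average the per-detection confidences of each commodity category.

    detections: (category, confidence) pairs pooled over every frame image
    of the target video.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for category, conf in detections:
        sums[category] += conf
        counts[category] += 1
    return {c: sums[c] / counts[c] for c in sums}

# commodity A detected 3 times, commodity B twice (illustrative confidences)
dets = [("A", 0.8), ("A", 0.6), ("A", 0.7), ("B", 0.9), ("B", 0.7)]
averages = average_confidence_per_category(dets)  # A averages ~0.7, B ~0.8
```

In the real system the pairs would come from the detection model's per-frame outputs for one image area.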
In one embodiment, the S122 includes:
S1221: detecting each frame of image of the target video by using the target detection model to obtain the category and the confidence of each commodity corresponding to the first image area in each frame of image and the category and the confidence of each commodity corresponding to the second image area in each frame of image;
S1222: according to the confidences of the commodities detected in the first image area and the second image area respectively, obtaining first confidence averages corresponding one-to-one to all commodities detected in the first image area across all the frame images, and second confidence averages corresponding one-to-one to all commodities detected in the second image area;
Specifically, each frame image has a first image area and a second image area. The confidences of all targets of each commodity category detected in the first image area and in the second image area are averaged separately to obtain a per-category confidence average: the averages for the commodities in the first image area are recorded as first confidence averages, and those in the second image area as second confidence averages. For example, if the target video comprises N frame images and 3 commodities are identified in the first image area, namely commodity A, commodity B and commodity C, where commodity A is detected 5 times (recorded as a1 through a5) with confidences 0.6, 0.7, 0.7, 0.75 and 0.75, then the first average confidence of commodity A in the first image area is 0.7; the confidence averages of commodities B and C are obtained in the same way from their detection counts and corresponding confidences. The second average confidence of each commodity category in the second image area is obtained by the same method.
In one embodiment, the S1222 includes:
S12221: acquiring a detection-count threshold for counting the number of valid detections of a commodity;
Specifically, each frame of the video is detected and the target commodities of the order are determined from the per-frame detection results. A detection-count threshold is set for the same commodity detected across different image frames: if the number of detections is greater than or equal to the threshold, the commodity is considered genuinely present; if it is smaller than the threshold, the result is considered abnormal and requires manual review in the background.
S12222: and comparing the detection times corresponding to the commodities of each category with the detection time threshold, and calculating the average value of each confidence coefficient of the commodities meeting the requirements to obtain the average value of the confidence coefficient of each first commodity and the average value of the confidence coefficient of each second commodity.
Specifically, suppose the detection-count threshold is set to 80% of the total number of image frames in the target video. Taking a target video of 20 frames as an example, the threshold is 16 detections: if commodity A is detected in only 10 frames in total, its average confidence is not calculated, whereas if commodity B is detected in 18 frames, its average confidence is calculated from the detected confidences. This prevents commodities from being falsely detected due to viewing-angle problems and improves the accuracy of the detection result.
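The worked example above (20 frames, 80% threshold, commodity A seen 10 times, commodity B seen 18 times) can be sketched as follows; the data values are the text's illustrative numbers:

```python
def filter_by_detection_count(confidences_per_item, total_frames, ratio=0.8):
    """Keep only commodities detected in at least `ratio` of all frames.

    confidences_per_item: {category: [confidence for each frame it appeared in]}
    Returns {category: average confidence} for the commodities that pass;
    the 80% ratio mirrors the worked example in the text.
    """
    threshold = total_frames * ratio              # 20 frames -> 16 detections
    return {
        cat: sum(confs) / len(confs)
        for cat, confs in confidences_per_item.items()
        if len(confs) >= threshold
    }

# commodity A seen in 10 of 20 frames (dropped), B in 18 of 20 (kept)
obs = {"A": [0.9] * 10, "B": [0.8] * 18}
kept = filter_by_detection_count(obs, total_frames=20)
```

Commodity A is excluded despite its high per-detection confidence, because too few frames confirmed it.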
S1223: obtaining first commodity information corresponding to all commodities detected in the first image area by each frame image and second commodity information corresponding to all commodities detected in the second image area by each frame image according to the types of the commodities detected in the first image area and the second image area;
specifically, the categories of the commodities detected in the respective frame images are respectively counted in the first image area and the second image area, so that first commodity information corresponding to the target video in the first image area and second commodity information corresponding to the target video in the second image area are obtained.
S1224: and obtaining the first detection result according to each piece of first commodity information and each first confidence coefficient average value corresponding to each piece of first commodity information one to one, and obtaining the second detection result according to each piece of second commodity information and each second confidence coefficient average value corresponding to each piece of second commodity information one to one.
Specifically, each first commodity confidence coefficient average value and each second commodity confidence coefficient average value are respectively compared with a confidence coefficient threshold value, classification is performed according to a first image region and a second image region, each commodity corresponding to each first commodity confidence coefficient average value meeting requirements is used as the first detection result, and each commodity corresponding to each second commodity confidence coefficient average value meeting requirements is used as the second detection result.
S123: comparing the first detection result with the second detection result to obtain the commodity information of each target commodity for generating commodity order information;
wherein the first detection result and the second detection result both comprise: the category of each commodity, the confidence degree corresponding to each commodity one by one and the average confidence degree corresponding to each commodity detected by all the frame images.
Specifically, the first detection result and the second detection result are statistics over the detection results of multiple frame images. If the commodity categories in the first and second detection results are the same, and the confidences of the target commodities meet the requirement, the target commodities used for generating the order information are obtained directly. If the two results differ, each commodity in the detection result containing more commodities and more categories is taken as a target commodity (because occlusion at a particular angle can cause some commodities to fail the detection requirement and go uncounted, the more complete, requirement-meeting result is adopted). If the confidence does not meet the requirement, the result is considered an abnormal order and the target video is reviewed manually to obtain the target commodities for generating the order information. Because an order is detected over continuous frames, if the first and second detection results differ in commodity category and/or quantity, a commodity missing from one result is considered to genuinely lie in that camera's blind area, and the result that did detect it is used directly as the basis for generating the order information. This effectively avoids missed detections caused by the blind area of a single camera and improves order accuracy.
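The comparison rule can be sketched as follows. This is an interpretation of the rule under stated assumptions: the detection results are reduced to per-category average confidences, and the 0.6 threshold and commodity names are invented for illustration:

```python
def pick_target_commodities(first, second, conf_threshold=0.6):
    """Combine the two per-region detection results into order candidates.

    first / second: {category: average confidence} for each image region.
    Rule sketched from the text: identical category sets -> keep the higher
    confidence per category; different sets -> trust the result that saw
    more commodities (the other view is assumed occluded); any surviving
    confidence below the threshold flags the order as abnormal.
    """
    if set(first) == set(second):
        merged = {c: max(first[c], second[c]) for c in first}
    else:
        merged = first if len(first) >= len(second) else second
    abnormal = any(conf < conf_threshold for conf in merged.values())
    return merged, abnormal

r1 = {"cola": 0.9, "chips": 0.7}
r2 = {"cola": 0.85}               # "chips" occluded in the second view
target, abnormal = pick_target_commodities(r1, r2)
```

Here the first result wins because it detected more commodities, matching the blind-area reasoning above; an abnormal flag would instead route the order to manual review.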
S13: and generating commodity order information according to the commodity information of each target commodity.
Specifically, after the category of each target commodity is obtained, the commodity database is traversed to determine the commodity prices, and the commodity order information is generated accordingly; the user can then pay at a terminal, which may be a traditional cash desk or third-party app software.
In one embodiment, the S13 includes:
S131: acquiring a target confidence threshold for commodities used to generate a valid order;
S132: comparing the commodity categories and quantities contained in the first detection result with those contained in the second detection result; if the commodity categories of the two results are the same, outputting, for each commodity, the one with the higher confidence between the two results as the target commodity, and otherwise generating an abnormal order;
S133: comparing the confidence corresponding to each target commodity with the target confidence threshold; if it meets the requirement, generating the commodity order information corresponding to each target commodity, and otherwise generating an abnormal order;
and if the order is abnormal, outputting abnormal order information together with the first main video and/or the second main video and/or the target video.
Specifically, when the target commodity corresponding to the current detection result is considered abnormal, manual review is needed; at this point the first main video and/or the second main video and/or the target video can be reviewed, which ensures order accuracy and improves the user experience.
In an embodiment, the first detection area corresponding to the first main camera includes a second detection area corresponding to the second main camera.
In an embodiment, the first detection area and the second detection area both cover the entire commodity area, and the viewing direction of the first main camera and the viewing direction of the second main camera are arranged in opposite directions.
Specifically, the detection areas of the first main camera and the second main camera are in an inclusion relationship: either the first detection area and the second detection area are identical, or one detection area lies within the detection range of the other. Meanwhile, the viewing angles of the two main cameras differ; for example, the first main camera views from left to right and the second main camera from right to left. It should be noted that the first main camera and the second main camera are arranged for the same commodity area: if the intelligent vending machine has three commodity areas, namely upper, middle and lower, pairing the cameras per commodity area avoids commodity occlusion caused by the partition plates between different levels and improves detection accuracy.
In an embodiment, before the S10, the method further includes:
S01: acquiring, in real time, the video of the current state of the intelligent vending machine captured by the third camera;
Specifically, the intelligent vending machine is further provided with a third camera used to detect whether the intelligent vending machine is open or closed; the third camera may run in real time, or be turned on after a user initiates a shopping request.
S02: analyzing each frame of image of the video of the current state of the intelligent vending machine, and determining whether a cabinet door of the intelligent vending machine is in an open state or a closed state;
S03: when the cabinet door of the intelligent vending machine is detected to be in the open state, turning on the first main camera and the second main camera used to acquire the video information corresponding to the commodity area;
S04: when the cabinet door of the intelligent vending machine is detected to be in the closed state, turning off the first main camera and the second main camera used to acquire the video information corresponding to the commodity area.
Specifically, when a user performs automatic shopping, analyzing each frame of image of a vending machine state video to determine the state of a vending machine cabinet door, and when detecting that the vending machine cabinet door is opened, opening a first main camera and a second main camera to acquire video data of a commodity area to obtain a first main video and a second main video; and when the closing of the vending machine cabinet door is detected, closing the first main camera and the second main camera.
By adopting this intelligent order generation method, which acquires and fuses images from multiple viewing angles, a first main camera and a second main camera that acquire video data of the same commodity area from different viewing angles are arranged on the intelligent vending machine. The first main video acquired by the first main camera and the second main video acquired by the second main camera are merged, and target detection is performed on the merged target video to obtain a first detection result corresponding to the first main video and a second detection result corresponding to the second main video. Commodity order information is determined by combining the two detection results, which prevents order anomalies caused by occluded commodities and improves detection accuracy and the user experience.
Example 2
In embodiment 1, the first main camera and the second main camera with different viewing angles are arranged in the commodity area of the intelligent vending machine. However, the imaging positions of commodities in different commodity areas differ within each frame image, which often causes false detection or confusion of highly similar commodities and affects detection accuracy. Embodiment 2 of the present invention therefore further improves the method for automatically generating order information for the intelligent vending machine on the basis of embodiment 1; referring to fig. 3, the method includes:
S20: acquiring a first main video acquired by a first main camera in a commodity area, a second main video acquired by a second main camera in the commodity area, a first sub video acquired by at least one first sub camera in the commodity area and a second sub video acquired by at least one second sub camera in the commodity area;
in one embodiment, the S20 includes:
S201: dividing the commodity placing area of the intelligent vending machine into a plurality of virtual commodity areas along the shelf arrangement direction of the intelligent vending machine;
S202: controlling the cameras corresponding to each commodity area to acquire the video data of that commodity area, obtaining the first main video acquired by the first main camera, the second main video acquired by the second main camera, at least one first sub video acquired by a first sub camera, and at least one second sub video acquired by a second sub camera.
Specifically, the intelligent vending machine is provided with multiple levels of shelves and is accordingly divided into a plurality of commodity areas, each comprising at least one shelf level; a first main camera and a second main camera, or a first sub camera and a second sub camera, are arranged opposite each other in each commodity area. For ease of understanding, the camera on the left of the topmost commodity area is recorded as the first main camera, the camera on its right as the second main camera, a camera on the left of a non-topmost commodity area as a first sub camera, and a camera on its right as a second sub camera. If the commodity area of the intelligent vending machine is divided into upper, middle and lower commodity areas, the first main camera and the second main camera are arranged opposite each other at the top of the machine, the viewing direction of the first main camera being left to right and that of the second main camera right to left, while the middle and lower commodity areas each have a first sub camera on the left and a second sub camera on the right. When a user starts buying goods through the intelligent vending machine, the cameras are controlled to acquire video data of the commodity areas, yielding the first main video, the second main video, the first sub videos and the second sub videos.
In one embodiment, the S20 includes:
S203: acquiring the frame rate and the number of the cameras used to acquire video data of the commodity areas;
S204: determining, according to the frame rate and the number of cameras, the interval time at which each camera starts to acquire the video data of its corresponding commodity area;
S205: obtaining, according to the interval time, the starting moments at which the first main camera, the second main camera, each first sub camera and each second sub camera start to acquire video data;
S206: controlling the corresponding cameras to acquire the video data of the commodity areas according to the respective starting moments, obtaining the first main video, the second main video, each first sub video and each second sub video.
Specifically, the cameras acquire video data at the same frame rate, e.g., 20 frames/second. The starting moment at which each camera begins acquiring video data is determined from the number of cameras and the frame rate, with a time interval set between the starting moments of individual cameras or camera groups; the preferred interval is 1/N of the time difference between two adjacent frames, where N is a positive integer. For example, with 4 cameras, each camera's acquisition start is staggered by 1/4 of the inter-frame time difference; or the 4 cameras are divided into two groups, each group starting 1/2 of the inter-frame time difference apart. This effectively increases the image frame rate and ensures that image information of the commodity area is acquired at more moments, thereby improving detection accuracy.
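The staggered start times can be computed as below; a minimal sketch using the text's own numbers (4 cameras or 2 groups at 20 frames/second):

```python
def staggered_start_times(num_units, frame_rate):
    """Offset each camera (or camera group) by 1/N of the frame interval.

    num_units: the N cameras or camera groups to stagger.
    At 20 fps the inter-frame gap is 0.05 s, so 4 cameras start 0.0125 s
    apart and 2 groups start 0.025 s apart, matching the example in the text.
    """
    frame_interval = 1.0 / frame_rate
    step = frame_interval / num_units
    return [i * step for i in range(num_units)]

four_cams = staggered_start_times(4, 20)   # offsets 0.0125 s apart
two_groups = staggered_start_times(2, 20)  # offsets 0.025 s apart
```

With these offsets the pooled cameras sample the commodity area 4 (or 2) times within each 0.05 s frame period, which is the "effective frame rate" gain described above.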
S21: covering the same image area as the first sub video in the first main video by using the commodity area of the first sub video to obtain a first target video, and covering the same image area as the second sub video in the second main video by using the commodity area corresponding to the second sub video to obtain a second target video;
Because the first main camera and the second main camera are arranged in the topmost commodity area, a non-topmost commodity area, if unobstructed, can also be captured by the two main cameras, so each frame image of the first main video contains an image area corresponding to each frame of every first sub video. The image area of the non-topmost commodity area in each frame of the first main video is replaced by the corresponding frame of the respective first sub video to obtain the first target video, and the second target video is obtained in the same way. In the resulting frames of the first and second target videos, the imaging of the commodities in each commodity area is clearer, which improves detection accuracy.
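The region replacement amounts to overwriting a rectangle of the main-camera frame with the sub-camera frame. A minimal NumPy sketch, assuming the sub-camera's position inside the main frame is known from calibration (the coordinates here are invented):

```python
import numpy as np

def overlay_sub_region(main_frame, sub_frame, top, left):
    """Replace a region of a main-video frame with the sharper sub-video frame.

    (top, left): position of the sub-camera's commodity area inside the main
    frame -- a stand-in for calibration data the real system would supply.
    """
    h, w = sub_frame.shape[:2]
    out = main_frame.copy()          # keep the original main video intact
    out[top:top + h, left:left + w] = sub_frame
    return out

main = np.zeros((8, 8, 3), dtype=np.uint8)     # main-camera frame
sub = np.full((4, 4, 3), 255, dtype=np.uint8)  # crisper sub-camera frame
merged = overlay_sub_region(main, sub, top=4, left=2)
```

Copying before assignment preserves the original main video, which the method later needs for manual review of abnormal orders.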
In one embodiment, the S21 includes:
S211: acquiring a target commodity area corresponding to a commodity change;
S212: determining the first sub video and the second sub video containing the target commodity area according to the position information of the target commodity area;
specifically, when a user takes or puts commodities from the intelligent vending machine, the position where the commodities are taken out or put in is determined, specifically, a commodity area where the commodities are put in or taken out is determined, then, a sub-camera closest to the area and a corresponding first sub-video and/or a second sub-video are determined, for example, the commodities of the intelligent vending machine are divided into an upper commodity area, a middle commodity area and a lower commodity area, and when the user takes or puts the commodities from the lower commodity area, the sub-video of the target commodity area is a first sub-video corresponding to the first sub-camera of the lower commodity area and a second sub-video corresponding to the second sub-camera at the beginning stage; when the commodities are taken out continuously and enter the area corresponding to the middle commodity area, the sub-videos are changed into the first sub-video and the second sub-video corresponding to the first sub-camera and the second sub-camera of the middle commodity area.
In one embodiment, the S212 includes:
S2121: analyzing each frame image of the first main video and/or the second main video to determine each image frame corresponding to a commodity change;
S2122: dividing the first main video and the second main video into a plurality of video segments according to each image frame at which a commodity change occurs;
S2123: determining the first sub video and the second sub video corresponding to each video segment according to the target commodity area corresponding to the commodity change in each video segment.
Specifically, each frame image of the first main video and/or the second main video is analyzed to determine the adjacent image frames at which a commodity change occurs; the first main video and/or the second main video are segmented at those frames, and a position analysis of each video segment determines the commodity area closest to the commodity at each moment. The sub videos matching each moment are thus screened out for merging, which avoids merging all sub videos and reduces the data-processing load.
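The segmentation at commodity-change frames can be sketched as an index-splitting step; a hedged illustration where frames are represented by their indices and the change points are invented:

```python
def segment_by_change_frames(num_frames, change_frames):
    """Split a video's frame indices into segments at each commodity-change frame.

    change_frames: indices where a commodity appears or disappears; each
    resulting segment can then be matched with the sub videos closest to
    the commodity's position during that span.
    """
    segments, start = [], 0
    for cut in sorted(set(change_frames)):
        if start < cut:
            segments.append(list(range(start, cut)))
        start = cut
    if start < num_frames:
        segments.append(list(range(start, num_frames)))
    return segments

# a 10-frame main video with commodity changes detected at frames 3 and 7
parts = segment_by_change_frames(10, [3, 7])
# -> [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]
```

Each segment would then be paired with the sub videos of the commodity area the commodity occupies during that span.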
S213: and merging the first sub video containing the target commodity and the first main video to obtain the first target video, and merging the second sub video containing the target commodity and the second main video to obtain the second target video.
In an embodiment, the first sub video and the second sub video containing the target commodity area are recorded as target sub videos, and the S213 includes:
S2131: determining, according to the position information of the target sub videos, the other first sub videos or second sub videos located between the target sub video and the first main video or the second main video as intermediate sub videos;
S2132: deleting the areas in the first main video and the second main video that do not correspond to the target sub videos and the intermediate sub videos, respectively obtaining an optimized first video and an optimized second video;
Specifically, when the frame images of a sub video are merged into the corresponding image areas of the main video, not all of its frames are merged: frames without the target commodity are deleted, and only frames containing the target commodity are merged; furthermore, only the image area in which a target was detected is merged. This avoids image scaling that would degrade image quality, and improves detection accuracy.
S2133: and merging the target sub-video and each intermediate sub-video belonging to the first sub-video with the first video to obtain the first target video, and merging the target sub-video and each intermediate sub-video belonging to the second sub-video with the second video to obtain the second target video.
Specifically, the image area between the main video and the target sub video is replaced by the image area of the corresponding intermediate sub video, ensuring the integrity of the synthesized first target video and/or second target video and making back-end manual review convenient when an abnormal order occurs.
S22: merging the frame images of the first target video and the second target video to obtain a target video;
Specifically, each frame image of the first target video is combined one by one with each frame image of the second target video in acquisition-time order to obtain one target video. The ways of combining include, but are not limited to: combining the image frames of the first target video with the image frames of the second target video taken at the same shooting moment, or combining frames by frame number, i.e., the first frame of the first target video with the first frame of the second target video, the second frame with the second frame, and so on up to the Nth frames, yielding a target video composed of the combined images. For example, the starting moment of the first target video may precede that of the second target video; it should be noted that the time interval between the two starting moments is less than the interval between two adjacent frames, such as 1/2, 1/3 or 1/4 of that interval. The image frames of the two target videos are then combined in shooting order, so that images at more moments are obtained within the shooting interval at the set frame rate, enriching the number of samples and improving detection accuracy. It should also be noted that the video frame rate relates to the responsiveness of the human eye: too low a frame rate makes the picture discontinuous, while too high a frame rate exceeds what the eye can follow and causes visual discomfort, so each camera is set to a frame rate meeting the requirement when acquiring video, generally 20 to 30 frames/second. Taking 20 frames/second as an example, adjacent frames are 0.05 s apart; by shooting the two videos staggered within that 0.05 s, image frames at more than one moment are obtained per 0.05 s, increasing the effective number of image frames without causing eye discomfort.
In one embodiment, the S22 includes:
S221: acquiring a first starting time at which acquisition of the first target video begins and a second starting time at which acquisition of the second target video begins;
S222: combining each frame image of the first target video one by one with the frame image corresponding to it in acquisition-time order in the second target video, according to the first starting time and the second starting time, to obtain the target video.
Specifically, the starting-time interval between the first main camera and its first sub cameras, and between the second main camera and its second sub cameras, is set to an integral multiple of the interval between two adjacent frames; then the first frame image of the first target video is merged with the first frame image of the second target video, the second frame with the second frame, and so on, up to the Nth frames, to obtain a target video composed of the merged images. This ensures that images of the commodity are obtained at more moments within the inter-frame interval corresponding to the fixed frame rate, reducing the loss of commodity information caused by occlusion and thereby improving detection accuracy.
S23: inputting each frame image of the target video into a preset target detection model to obtain commodity information of each target commodity;
Specifically, for the method of recognizing each frame image of the target video with the target detection model to obtain the commodity information of the target commodities, refer to embodiment 1; details are not repeated here.
S24: generating commodity order information according to the commodity information of each target commodity;
The range of the commodity area captured by the first main camera is the same as that captured by the second main camera, while the ranges captured by the first sub cameras and the second sub cameras are partial areas of the commodity area captured by the first main camera or the second main camera. The first main camera and the first sub cameras are arranged along the shelf arrangement direction of the intelligent vending machine, with each first sub camera located below the first main camera; likewise, the second main camera and the second sub cameras are arranged along the shelf arrangement direction, with each second sub camera located below the second main camera.
Specifically, for the method of generating the commodity order information, refer to the foregoing embodiment; details are not repeated here.
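As an editorial sketch of step S24 under an assumed representation (the patent does not fix one): commodity information is reduced to per-commodity counts before and after the shopping event, and an order line is generated for each commodity whose count decreased. The price table and field names are illustrative.

```python
# Sketch of S24: an order line per commodity whose shelf count dropped.
def generate_order(counts_before, counts_after, prices):
    order = {}
    for sku, before in counts_before.items():
        taken = before - counts_after.get(sku, 0)
        if taken > 0:
            order[sku] = {"qty": taken, "subtotal": taken * prices[sku]}
    return order

order = generate_order({"cola": 3, "chips": 2}, {"cola": 1, "chips": 2},
                       {"cola": 3.5, "chips": 4.0})
print(order)  # → {'cola': {'qty': 2, 'subtotal': 7.0}}
```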
With the intelligent order generation method of this embodiment, which merges images collected from multiple viewing angles, a first main camera and a second main camera that capture video data of the same commodity area from different viewing angles are arranged on the intelligent vending machine, together with first and second sub-cameras that also capture video data of the commodity area. The first sub-video captured by a first sub-camera replaces the data of the corresponding region in the first main video captured by the first main camera, yielding a first target video; a second target video is obtained in the same way. The first and second target videos are then merged into the target video, ensuring the imaging sharpness of the target object in every frame. Target detection on the merged target video produces a first detection result corresponding to the first target video and a second detection result corresponding to the second target video, and the commodity order information is determined by combining the two. This prevents order anomalies caused by commodities being occluded, improving detection accuracy and the user experience.
Example 3
Embodiment 3 of the present invention further provides an intelligent order generating apparatus that fuses images collected from multiple viewing angles, based on the methods of embodiments 1 to 2; please refer to fig. 4. The apparatus includes:
a video acquisition module: used for acquiring a first main video captured of the commodity area by the first main camera at a first viewing angle and a second main video captured of the commodity area by the second main camera at a second viewing angle different from the first viewing angle, the first main video and the second main video having the same frame rate;
a video synthesis module: used for merging, one by one, each frame image of the first main video with the frame image at the corresponding acquisition time in the second main video to generate a target video, wherein the size of each frame image of the target video is the sum of the sizes of the corresponding frame images of the first main video and the second main video;
an image analysis module: used for inputting each frame image of the target video into a preset target detection model to obtain commodity information of a plurality of target commodities;
an order generation module: used for generating commodity order information according to the commodity information of each target commodity;
the range of the commodity area collected by the first main camera is the same as the range of the commodity area collected by the second main camera.
With the intelligent order generating apparatus of this embodiment, which fuses images collected from multiple viewing angles, a first main camera and a second main camera that capture video data of the same commodity area from different viewing angles are arranged on the intelligent vending machine. The first main video captured by the first main camera and the second main video captured by the second main camera are merged, and target detection on the merged target video yields a first detection result corresponding to the first main video and a second detection result corresponding to the second main video. The commodity order information is determined by combining the two detection results, which prevents order anomalies caused by occluded commodities and improves detection accuracy and the user experience.
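The video synthesis step above can be sketched in a few lines (an editorial illustration, not the patent's implementation): two same-frame-rate streams are merged frame by frame, each merged frame being the two source frames placed side by side, so its width is the sum of the source widths. Frames are modeled here as row-lists of pixels; real code would use e.g. `numpy.hstack` on H x W x 3 arrays.

```python
def merge_frame(frame_a, frame_b):
    """Place two frames side by side; widths add, heights must match."""
    assert len(frame_a) == len(frame_b), "frames must share a height"
    return [row_a + row_b for row_a, row_b in zip(frame_a, frame_b)]

def merge_videos(video_a, video_b):
    """Merge corresponding frames of two equally-long videos."""
    return [merge_frame(a, b) for a, b in zip(video_a, video_b)]

a = [[[1, 1], [2, 2]]]  # one frame with rows [1, 1] and [2, 2]
b = [[[3, 3], [4, 4]]]
merged = merge_videos(a, b)
print(len(merged[0][0]))  # → 4  (merged width is the sum: 2 + 2)
```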
It should be noted that the apparatus also includes the remaining technical solutions described in embodiments 1 to 2, which are not repeated here.
Example 4
Embodiment 4 of the present invention further provides an intelligent order generating apparatus that merges images collected from multiple viewing angles, based on the methods of embodiments 1 to 2; please refer to fig. 5. The apparatus includes:
a video acquisition module: used for acquiring a first main video captured of the commodity area by the first main camera, a second main video captured of the commodity area by the second main camera, a first sub-video captured of the commodity area by at least one first sub-camera, and a second sub-video captured of the commodity area by at least one second sub-camera;
a video splicing module: used for covering, with the commodity region of the first sub-video, the identical image region in the first main video to obtain a first target video, and covering, with the commodity region of the second sub-video, the identical image region in the second main video to obtain a second target video;
a video merging module: used for merging each frame image of the first target video with the frame image at the corresponding acquisition time in the second target video to obtain a target video, wherein the size of each frame image of the target video is the sum of the sizes of the corresponding frame images of the first target video and the second target video;
a data analysis module: used for inputting each frame image of the target video into a preset target detection model to obtain the commodity information of each target commodity;
an order information module: used for generating commodity order information according to the commodity information of each target commodity;
the first main camera and the second main camera collect the same range of the commodity area; the range collected by each first sub-camera or second sub-camera is a partial region of the commodity area collected by the first main camera or the second main camera; the first main camera and the first sub-cameras are arranged along the shelf arrangement direction of the intelligent vending machine, with the first sub-cameras located below the first main camera; and the second main camera and the second sub-cameras are arranged along the shelf arrangement direction of the intelligent vending machine, with the second sub-cameras located below the second main camera.
With the intelligent order generating apparatus of this embodiment, which merges images collected from multiple viewing angles, a first main camera and a second main camera that capture video data of the same commodity area from different viewing angles are arranged on the intelligent vending machine, together with first and second sub-cameras that also capture video data of the commodity area. The first sub-video captured by a first sub-camera replaces the data of the corresponding region in the first main video captured by the first main camera to obtain the first target video, and the second target video is obtained in the same way. The first and second target videos are then merged into the target video, ensuring the imaging sharpness of the target object in every frame. Target detection on the merged target video yields a first detection result corresponding to the first target video and a second detection result corresponding to the second target video, and the commodity order information is determined by combining the two. This prevents order anomalies caused by commodities being occluded, improving detection accuracy and the user experience.
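The video splicing step above amounts to overwriting a rectangular region of the main-camera frame with the (closer-range, sharper) sub-camera frame. The following is an editorial sketch; the `(top, left)` placement of the sub region is an assumption, since the patent derives it from the cameras' physical layout.

```python
def cover_region(main_frame, sub_frame, top, left):
    """Return a copy of main_frame with sub_frame pasted at (top, left)."""
    out = [row[:] for row in main_frame]          # copy; keep main intact
    for r, sub_row in enumerate(sub_frame):
        out[top + r][left:left + len(sub_row)] = sub_row
    return out

main = [[0] * 4 for _ in range(4)]                # 4x4 main frame
sub = [[9, 9], [9, 9]]                            # 2x2 sub frame
patched = cover_region(main, sub, 1, 1)
print(patched[1])  # → [0, 9, 9, 0]
```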
It should be noted that the apparatus also includes the remaining technical solutions described in embodiments 1 to 2, which are not repeated here.
Example 5
The invention provides an automatic settlement system of an intelligent vending machine; please refer to fig. 6. The automatic settlement system comprises the intelligent vending machine, a mobile terminal and a server, and can adopt the automatic shopping method described in the above embodiments. A user scans the identification code on the intelligent vending machine with the mobile terminal, and the server creates a shopping event for the user. Cameras at different viewing angles begin collecting shopping videos, either when the cabinet door of the intelligent vending machine is opened or when the user enters a preset range. When the user leaves the preset shopping range or the cabinet door closes, the cameras stop collecting and transmit the shopping videos to the server. The server generates the user's order information from the shopping videos and sends it to the mobile terminal, through which the user settles the order manually or has it settled automatically. The system gives the user better freedom in autonomous shopping with high order accuracy, and can improve the user's shopping experience.
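The settlement flow above can be modeled as a small shopping-session state machine (an editorial sketch; the state and event names are illustrative, not from the patent text):

```python
class ShoppingSession:
    """Scan code -> record while door open -> generate order -> settle."""
    def __init__(self, user_id):
        self.user_id = user_id
        self.state = "created"       # server created the shopping event

    def door_opened(self):           # cameras start recording
        assert self.state == "created"
        self.state = "recording"

    def door_closed(self):           # cameras stop; video goes to server
        assert self.state == "recording"
        self.state = "awaiting_order"

    def order_generated(self, order):
        assert self.state == "awaiting_order"
        self.order = order
        self.state = "settled"       # settled via the mobile terminal

s = ShoppingSession("user-42")
s.door_opened(); s.door_closed(); s.order_generated({"cola": 1})
print(s.state)  # → settled
```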
Example 6
The present invention provides an intelligent vending machine device and a storage medium, as shown in fig. 7, comprising at least one processor, at least one memory, and computer program instructions stored in the memory.
Specifically, the processor may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present invention. The electronic device includes at least one of the following: a camera, a mobile device with a camera, and a wearable device with a camera.
The memory may include mass storage for data or instructions. By way of example, and not limitation, the memory may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a universal serial bus (USB) drive, or a combination of two or more of these. The memory may include removable or non-removable (or fixed) media, where appropriate. The memory may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory is non-volatile solid-state memory. In a particular embodiment, the memory includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor reads and executes the computer program instructions stored in the memory to implement any of the above embodiments, namely the intelligent order generation method with fusion after multi-view image acquisition and the intelligent order generation method with merging after multi-view image acquisition.
In one example, the electronic device may also include a communication interface and a bus. The processor, the memory and the communication interface are connected through a bus and complete mutual communication.
The communication interface is mainly used for realizing communication among modules, devices, units and/or equipment in the embodiment of the invention.
A bus comprises hardware, software, or both that couple components of an electronic device to one another. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a VESA Local Bus (VLB), or another suitable bus, or a combination of two or more of these. A bus may include one or more buses, where appropriate. Although specific buses have been described and shown in the embodiments of the invention, any suitable buses or interconnects are contemplated by the invention.
In summary, the embodiments of the present invention provide an intelligent order generating method and apparatus for merging after collecting images from multiple viewing angles, an intelligent vending machine, and a storage medium.
It is to be understood that the invention is not limited to the specific arrangements and instrumentality described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions or change the order between the steps after comprehending the spirit of the present invention.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. An intelligent order generation method for combining collected images in multiple visual angles is characterized by comprising the following steps:
S20: acquiring a first main video acquired by a first main camera in a commodity area, a second main video acquired by a second main camera in the commodity area, a first sub video acquired by at least one first sub camera in the commodity area and a second sub video acquired by at least one second sub camera in the commodity area;
S21: covering the same image area as the first sub video in the first main video by using the commodity area of the first sub video to obtain a first target video, and covering the same image area as the second sub video in the second main video by using the commodity area of the second sub video to obtain a second target video;
the S21 includes:
acquiring a target commodity area corresponding to commodity change;
determining the first sub video and the second sub video containing the target commodity area according to the position information of the target commodity area, wherein the first sub video and the second sub video containing the target commodity area are marked as target sub videos;
determining other first sub videos or second sub videos positioned between the target sub video and the first main video or the second main video as middle sub videos according to the position information of the target sub video;
deleting the regions of the first main video and the second main video that do not correspond to the target sub-video and the intermediate sub-videos, so as to obtain an optimized first video and an optimized second video respectively;
merging the target sub-video and each intermediate sub-video belonging to the first sub-video with the first video to obtain a first target video, and merging the target sub-video and each intermediate sub-video belonging to the second sub-video with the second video to obtain a second target video;
S22: merging each frame image of the first target video and each frame image corresponding to the acquisition time sequence in the second target video to obtain a target video, wherein the size of each frame image of the target video is the sum of the size of each frame image of the first target video and the size of each frame image corresponding to the second target video;
S23: inputting each frame image of the target video into a preset target detection model to obtain commodity information of a plurality of target commodities;
S24: generating commodity order information according to the commodity information of each target commodity;
the first main camera and the second main camera collect the same range of the commodity area; the range collected by each first sub-camera or second sub-camera is a partial region of the commodity area collected by the first main camera or the second main camera; the first main camera and the first sub-cameras are arranged along the shelf arrangement direction of the intelligent vending machine, with the first sub-cameras located below the first main camera; and the second main camera and the second sub-cameras are arranged along the shelf arrangement direction of the intelligent vending machine, with the second sub-cameras located below the second main camera.
2. The method for intelligently generating orders with merging after multi-view image acquisition according to claim 1, wherein said S20 comprises:
s201: the method comprises the steps of obtaining a commodity placing area of the intelligent vending machine along the goods shelf arrangement direction of the intelligent vending machine and dividing the commodity placing area into a plurality of virtual commodity areas;
s202: the method comprises the steps of controlling cameras corresponding to each commodity area to collect video data corresponding to the commodity area, and obtaining a first main video collected by a first main camera, a second main video collected by a second main camera, and at least one first sub video collected by a first sub camera and at least one second sub video collected by a second sub camera.
3. The method for intelligently generating orders for merging after multi-view image acquisition according to claim 1, wherein said S20 comprises:
S203: acquiring the frame rate and the number of cameras for acquiring video data of a commodity area;
S204: determining the interval time for each camera to start to acquire the video data of the corresponding commodity area according to the frame rate and the number of the cameras;
S205: according to the interval time, respectively obtaining the starting time when the first main camera, the second main camera, each first sub-camera and each second sub-camera start to collect video data;
S206: controlling a corresponding camera to acquire video data of a commodity area according to each starting moment to obtain the first main video, the second main video, each first sub video and each second sub video;
the interval time is 1/N of the interval time of two adjacent frames of images, and N is a positive integer.
4. The method for intelligently generating orders through combining after multi-view image acquisition according to claim 1, wherein the step S212 comprises:
s2121: analyzing each frame image of the first main video and/or the second main video to determine each image frame corresponding to the commodity change;
s2122: dividing the first main video and the second main video into a plurality of video segments according to each image frame with commodity change;
s2123: and determining the first sub-video and the second sub-video corresponding to each video segment according to a target commodity area corresponding to commodity change in each video segment.
5. The method for intelligently generating orders for merging after multi-view image acquisition according to claim 1, wherein said S22 comprises:
S221: acquiring a first starting time for starting to acquire the first target video and a second starting time for starting to acquire the second target video;
S222: combining, one by one according to the first starting time and the second starting time, each frame image of the first target video with the frame image at the corresponding acquisition time in the second target video to obtain the target video.
6. An intelligent order generating apparatus for merging after collecting images at multiple viewing angles, characterized in that the apparatus comprises:
a video acquisition module: used for acquiring a first main video captured of the commodity area by the first main camera, a second main video captured of the commodity area by the second main camera, a first sub-video captured of the commodity area by at least one first sub-camera, and a second sub-video captured of the commodity area by at least one second sub-camera;
a video splicing module: used for covering, with the commodity region of the first sub-video, the identical image region in the first main video to obtain a first target video, and covering, with the commodity region of the second sub-video, the identical image region in the second main video to obtain a second target video;
the covering, by the commodity area of the first sub-video, the image area of the first main video, which is the same as the image area of the first sub-video, to obtain a first target video, and covering, by the commodity area of the second sub-video, the image area of the second main video, which is the same as the image area of the second sub-video, to obtain a second target video includes:
acquiring a target commodity area corresponding to commodity change;
determining the first sub video and the second sub video containing the target commodity area according to the position information of the target commodity area, wherein the first sub video and the second sub video containing the target commodity area are marked as target sub videos;
determining other first sub videos or second sub videos positioned between the target sub video and the first main video or the second main video as middle sub videos according to the position information of the target sub video;
deleting the regions of the first main video and the second main video that do not correspond to the target sub-video and the intermediate sub-videos, so as to obtain an optimized first video and an optimized second video respectively;
merging the target sub-video and each intermediate sub-video belonging to the first sub-video with the first video to obtain a first target video, and merging the target sub-video and each intermediate sub-video belonging to the second sub-video with the second video to obtain a second target video;
a video merging module: used for merging each frame image of the first target video with the frame image at the corresponding acquisition time in the second target video to obtain a target video, wherein the size of each frame image of the target video is the sum of the sizes of the corresponding frame images of the first target video and the second target video;
a data analysis module: used for inputting each frame image of the target video into a preset target detection model to obtain the commodity information of each target commodity;
an order information module: used for generating commodity order information according to the commodity information of each target commodity;
the first main camera and the second main camera collect the same range of the commodity area; the range collected by each first sub-camera or second sub-camera is a partial region of the commodity area collected by the first main camera or the second main camera; the first main camera and the first sub-cameras are arranged along the shelf arrangement direction of the intelligent vending machine, with the first sub-cameras located below the first main camera; and the second main camera and the second sub-cameras are arranged along the shelf arrangement direction of the intelligent vending machine, with the second sub-cameras located below the second main camera.
7. An intelligent vending machine, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory that, when executed by the processor, implement the method of any of claims 1-5.
8. A medium having stored thereon computer program instructions, which, when executed by a processor, implement the method according to any one of claims 1-5.
CN202111291308.2A 2021-11-03 2021-11-03 Intelligent order generation method for combining collected images at multiple visual angles and intelligent vending machine Active CN113727029B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111291308.2A CN113727029B (en) 2021-11-03 2021-11-03 Intelligent order generation method for combining collected images at multiple visual angles and intelligent vending machine
CN202210356689.6A CN114640797A (en) 2021-11-03 2021-11-03 Order generation method and device for synchronously optimizing commodity track and intelligent vending machine


Publications (2)

Publication Number Publication Date
CN113727029A (en) 2021-11-30
CN113727029B (en) 2022-03-18

Family

ID=78686551

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210356689.6A Pending CN114640797A (en) 2021-11-03 2021-11-03 Order generation method and device for synchronously optimizing commodity track and intelligent vending machine
CN202111291308.2A Active CN113727029B (en) 2021-11-03 2021-11-03 Intelligent order generation method for combining collected images at multiple visual angles and intelligent vending machine


Country Status (1)

Country Link
CN (2) CN114640797A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019048924A1 (en) * 2017-09-06 2019-03-14 Trax Technology Solutions Pte Ltd. Using augmented reality for image capturing a retail unit

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019087519A1 (en) * 2017-10-30 2019-05-09 パナソニックIpマネジメント株式会社 Shelf monitoring device, shelf monitoring method, and shelf monitoring program
CN108885813B (en) * 2018-06-06 2021-03-23 达闼机器人有限公司 Intelligent sales counter, article identification method, apparatus, server and storage medium
CN108960119B (en) * 2018-06-28 2021-06-08 武汉市哈哈便利科技有限公司 Commodity recognition algorithm for multi-angle video fusion of unmanned sales counter
CN108960318A (en) * 2018-06-28 2018-12-07 武汉市哈哈便利科技有限公司 A kind of commodity recognizer using binocular vision technology for self-service cabinet
CN109035579A (en) * 2018-06-29 2018-12-18 深圳和而泰数据资源与云技术有限公司 A kind of commodity recognition method, self-service machine and computer readable storage medium
CN109003390B (en) * 2018-06-29 2021-08-10 深圳和而泰数据资源与云技术有限公司 Commodity identification method, unmanned vending machine and computer-readable storage medium
US20210158430A1 (en) * 2018-07-16 2021-05-27 Accel Robotics Corporation System that performs selective manual review of shopping carts in an automated store
WO2020018585A1 (en) * 2018-07-16 2020-01-23 Accel Robotics Corporation Autonomous store tracking system
CN109190705A (en) * 2018-09-06 2019-01-11 深圳码隆科技有限公司 Self-service method, apparatus and system
CN109523694A (en) * 2018-10-22 2019-03-26 南京云思创智信息科技有限公司 A kind of retail trade system and method based on commodity detection
CN109740425A (en) * 2018-11-23 2019-05-10 上海扩博智能技术有限公司 Image labeling method, system, equipment and storage medium based on augmented reality
CN210222920U (en) * 2018-12-29 2020-03-31 北京沃东天骏信息技术有限公司 Sales counter
CN109658207A (en) * 2019-01-15 2019-04-19 深圳友朋智能商业科技有限公司 Method of Commodity Recommendation, system and the device of automatic vending machine
CN109840503B (en) * 2019-01-31 2021-02-26 深兰科技(上海)有限公司 Method and device for determining category information
JP7361262B2 (en) * 2019-03-29 2023-10-16 パナソニックIpマネジメント株式会社 Settlement payment device and unmanned store system
CN109979130A (en) * 2019-03-29 2019-07-05 厦门益东智能科技有限公司 A kind of commodity automatic identification and clearing sales counter, method and system
CN209962319U (en) * 2019-04-25 2020-01-17 深圳市哈哈零兽科技有限公司 AI intelligence sales counter based on dynamic identification
CN110072058B (en) * 2019-05-28 2021-05-25 珠海格力电器股份有限公司 Image shooting device and method and terminal
CN110472515B (en) * 2019-07-23 2021-04-13 创新先进技术有限公司 Goods shelf commodity detection method and system
JPWO2021085467A1 (en) * 2019-10-31 2021-05-06
CN210721650U (en) * 2019-12-20 2020-06-09 北京每日优鲜电子商务有限公司 Vending cabinet and vending system
CN111369317B (en) * 2020-02-27 2023-08-18 创新奇智(上海)科技有限公司 Order generation method, order generation device, electronic equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019048924A1 (en) * 2017-09-06 2019-03-14 Trax Technology Solutions Pte Ltd. Using augmented reality for image capturing a retail unit

Also Published As

Publication number Publication date
CN113727029A (en) 2021-11-30
CN114640797A (en) 2022-06-17

Similar Documents

Publication Publication Date Title
CN108875664B (en) Method and device for identifying purchased goods and vending machine
CN201255897Y (en) Human flow monitoring device for bus
CN111263224B (en) Video processing method and device and electronic equipment
CN113780248B (en) Multi-view-angle identification commodity intelligent order generation method and device and intelligent vending machine
CN110390229B (en) Face picture screening method and device, electronic equipment and storage medium
US10573022B2 (en) Object recognition system and method of registering a new object
US20210398097A1 (en) Method, a device and a system for checkout
CN111222870B (en) Settlement method, device and system
CN108986097A (en) A kind of camera lens hazes condition detection method, computer installation and readable storage medium storing program for executing
CN113723384B (en) Intelligent order generation method based on fusion after multi-view image acquisition and intelligent vending machine
CN109447619A (en) Unmanned settlement method, device, equipment and system based on open environment
CN111260685B (en) Video processing method and device and electronic equipment
CN113763136B (en) Intelligent order generation method for video segmentation processing based on weight change of commodity area
CN113468914A (en) Method, device and equipment for determining purity of commodities
CN113727029B (en) Intelligent order generation method for combining collected images at multiple visual angles and intelligent vending machine
CN115170999A (en) Intelligent order generation method for carrying out image analysis based on commodity weight combination
CN107527060B (en) Refrigerating device storage management system and refrigerating device
CN113723383B (en) Order generation method for synchronously identifying commodities in same area at different visual angles and intelligent vending machine
CN110443946B (en) Vending machine, and method and device for identifying types of articles
CN110610358A (en) Commodity processing method and device and unmanned goods shelf system
CN114022244A (en) Intelligent order generation method combining wide area acquisition and local area acquisition and intelligent vending machine
CN115170781A (en) Multi-view spliced image target duplication removal training method and device and intelligent vending machine
CN112183306A (en) Method for noninductive payment of digital canteens
CN114863141A (en) Intelligent identification method and device for vending similar goods by unmanned person and intelligent vending machine
CN117218762B (en) Intelligent container interaction control method, device and system based on machine vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220401

Address after: 518000 Room 102, building 2, Hangcheng Zhigu Zhongcheng future industrial park, Sanwei community, Hangcheng street, Bao'an District, Shenzhen, Guangdong Province

Patentee after: YOPOINT SMART RETAIL TECHNOLOGY Ltd.

Address before: Room 1, 11 / F, building 4, phase 3 and 4, Wuhan creative world, Mahu village, Hongshan street, Hongshan District, Wuhan City, Hubei Province

Patentee before: WUHAN XINGXUN INTELLIGENT TECHNOLOGY CO.,LTD.

Patentee before: Shenzhen Youpeng Intelligent Business Technology Co., Ltd.